Benyou Wang is an assistant professor in the School of Data Science, The Chinese University of Hong Kong, Shenzhen. He has received several notable awards, including the Best Paper Nomination Award at SIGIR 2017, the Best Explainable NLP Paper award at NAACL 2019, the Best Paper award at NLPCC 2022, a Marie Curie Fellowship, and the Huawei Spark Award. His primary research focus is large language models.
A final project poster session is planned for the end of the course (tentatively Dec. 15th, 2023), giving students the opportunity to present their work.
Anyone interested in LLMs is welcome to join. More details will be provided closer to the event. Feel free to reach out!
The course will introduce the key concepts of LLMs in terms of training, deployment, and downstream applications. At the technical level, it covers language models, architecture engineering, prompt engineering, retrieval, reasoning, multimodality, tools, alignment, and evaluation. This course will form a sound basis for further use of LLMs. In particular, the topics include:
Recommended Books:
We will review project proposals to help students better prepare their final projects. A revised proposal is welcome after taking our suggestions into consideration.
The project may be done in a group, but each individual is evaluated separately. You need to write a project report (max 6 pages) for the final project. Here is the report template. You are also expected to give a project poster presentation. After the final project deadline, feel free to make your project open source; we would appreciate it if you acknowledged this course.
Here are some ways to earn participation credit, which is capped at 5%.
The penalty is 0.5% off the final course grade for each late day.
Date | Topics | Recommended Reading | Pre-Lecture Questions | Lecture Note | Coding | Events/Deadlines | Feedback Providers |
---|---|---|---|---|---|---|---|
Sep. 4-15th (self-study; do not come to the classroom) | Tutorial 0: GitHub, LaTeX, Colab, and ChatGPT API | OpenAI's blog; LaTeX and Overleaf; Colab; GitHub | | | | | Benyou Wang |
Sep. 15th | Lecture 1: Introduction to Large Language Models (LLMs) | On the Opportunities and Risks of Foundation Models; Sparks of Artificial General Intelligence: Early experiments with GPT-4 | What is ChatGPT, and how can we use it? | [slide] | | | Xidong Wang |
Sep. 22nd | Lecture 2: Language models and beyond | A Neural Probabilistic Language Model; BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding; Training language models to follow instructions with human feedback | What is a language model, and why is it important? | [slide] | | | Juhao Liang |
Oct. 8th | Lecture 3: Architecture engineering and scaling law: Transformer and beyond | Attention Is All You Need; HuggingFace's course on Transformers; Scaling Laws for Neural Language Models; The Transformer Family Version 2.0; On Position Embeddings in BERT | Why did the Transformer become the backbone of LLMs? | [slide] | [nanoGPT] | | Xidong Wang |
Oct. 13th | Tutorial 1: Usage of OpenAI API and Assignment 1 | OpenAI's blog | How can we use ChatGPT automatically, in batch? (see the API sketch below the table) | [slide] | [Using ChatGPT API] | Assignment 1 out | Xidong Wang |
Oct. 20th | Lecture 4: Training LLMs from scratch | Training language models to follow instructions with human feedback; LLaMA: Open and Efficient Foundation Language Models; Llama 2: Open Foundation and Fine-Tuned Chat Models | How do we train LLMs from scratch? | [slide] | [LLMZoo], [LLMFactory] | | Juhao Liang |
Oct. 20th | Tutorial 2: Train your own LLMs and Assignment 2 | | Are you ready to train your own LLMs? (see the toy training sketch below the table) | | [LLMZoo], [nanoGPT], [LLMFactory] | | Juhao Liang |
Oct. 27th | Lecture 5: Efficiency in LLMs | Efficient Transformers: A Survey; FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness; GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers; Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity; Towards a Unified View of Parameter-Efficient Transfer Learning | How can we make LLM training/inference faster? | [slide] | [llama2.c] | Assignment 1 due (Oct. 31, 11:59pm) | Zhengyang Tang |
Nov. 3rd | Lecture 6: Mid review of final project | N/A | N/A | [slide] | | Assignment 2 out | Benyou Wang |
Nov. 3rd | Tutorial 3: Preparing your own project | | Any ideas to train a unique LLM to solve problems in your own research field? | | | | Junying Chen |
Nov. 10th | Lecture 7: Knowledge, reasoning, and prompt engineering | Natural Language Reasoning, A Survey, and others; Best practices for prompt engineering with OpenAI API; prompt engineering | Can LLMs reason? How can we better prompt LLMs? | [slide] | | | Shuo Yan and Fei Yu |
Nov. 17th | Lecture 8: Multimodal LLMs | CLIP, MiniGPT-4, Stable Diffusion, and others | Can LLMs see? | [slide] | | Assignment 2 due (11:59pm) | Junying Chen |
Nov. 24th | Lecture 9: LLM agents | ToolBench; AgentBench; Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks; LLM Powered Autonomous Agents | Can LLMs plan? | [slide] | | | Juhao Liang |
Nov. 24th | Tutorial 4: Improving your LLM projects (personal discussion) | | How can we improve our LLM applications? | | | Project proposal due (11:59pm) | Benyou Wang, Juhao Liang, and Xidong Wang |
Dec. 1st | Lecture 10: LLMs in vertical domains | Large Language Models Encode Clinical Knowledge; Capabilities of GPT-4 on Medical Challenge Problems; Performance of ChatGPT on USMLE; Medical-NLP; ChatLaw | Can LLMs be mature experts like doctors/lawyers? | [slide] | [HuatuoGPT] | | Junying Chen |
Dec. 8th | Lecture 11: Alignment, limitations, and broader impact | Superalignment; GPTs are GPTs: An Early Look at the Labor Market Impact Potential of Large Language Models; ChatGPT Outperforms Crowd-Workers for Text-Annotation Tasks; Theory of Mind Might Have Spontaneously Emerged in Large Language Models; On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?; Survey of Hallucination in Natural Language Generation; Extracting Training Data from Large Language Models | What are LLMs' limitations? | | | | Juhao Liang |
TBD | Guest lectures | N/A | | | | | Benyou Wang |
Dec. 15th | Lecture 12: In-class presentation (extended class) | N/A | How can we solve real-world problems using LLMs? | | | | Zhengyang Tang, Juhao Liang, and Xidong Wang |
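For Tutorial 1's pre-lecture question on using ChatGPT in batch, here is a minimal sketch of looping over a list of prompts with the official `openai` Python package (v1+). It is an illustration rather than the course's assignment code: the model name and prompts are placeholders, and it assumes an `OPENAI_API_KEY` environment variable is set.

```python
# Minimal sketch: query the OpenAI chat API over a batch of prompts.
# Assumes the `openai` package (v1+) is installed and OPENAI_API_KEY
# is set in the environment; model name and prompts are placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

prompts = [
    "What is a language model?",
    "Why did the Transformer become the backbone of LLMs?",
]

for prompt in prompts:
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
    )
    print(response.choices[0].message.content)
```

For larger batches you would typically add retry and rate-limit handling and write responses to a file instead of printing them.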
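As a warm-up for Tutorial 2's question ("Are you ready to train your own LLMs?"), the sketch below shows the core next-token training loop in miniature: a character-level bigram model in plain PyTorch. It is a toy stand-in with arbitrary text and hyperparameters, not code from [LLMZoo] or [nanoGPT].

```python
# Toy next-token training loop: a character-level bigram model.
# Deliberately tiny; the text and hyperparameters are arbitrary.
import torch
import torch.nn as nn

text = "hello world, hello LLMs"
chars = sorted(set(text))
stoi = {c: i for i, c in enumerate(chars)}
data = torch.tensor([stoi[c] for c in text])

# In a bigram model, row t of the embedding table holds the logits
# for the token that follows t, so the table is the whole model.
model = nn.Embedding(len(chars), len(chars))
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

for step in range(200):
    logits = model(data[:-1])         # predict each next character
    loss = loss_fn(logits, data[1:])  # compare with the true next character
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print(f"final loss: {loss.item():.3f}")
```

Real LLM training swaps the lookup table for a Transformer and the toy string for billions of tokens, but the next-token prediction objective is the same.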
We borrowed some concepts and the website template from [CSC3160/MDS6002], where Prof. Zhizheng Wu is the instructor.
The website's GitHub repo is [here].