Benyou Wang is an assistant professor in the School of Data Science, The Chinese University of Hong Kong, Shenzhen. He has received several notable awards, including the Best Paper Nomination Award at SIGIR 2017, Best Explainable NLP Paper at NAACL 2019, Best Paper at NLPCC 2022, a Marie Curie Fellowship, and the Huawei Spark Award. His primary research focus is large language models.
Below you can find course websites from previous years. Our course content and assignments will change from year to year; please do not do assignments from previous years.
The course introduces the key concepts of LLMs in terms of training, deployment, and downstream applications. At the technical level, it covers language models, architecture engineering, prompt engineering, retrieval, reasoning, multimodality, tools, alignment, and evaluation. It provides a sound basis for further work with LLMs. In particular, the topics include:
Date | Topics | Recommended Reading | Pre-Lecture Questions | Lecture Note | Coding | Events / Deadlines | Feedback Administrators |
---|---|---|---|---|---|---|---|
Sep. 6-17th (self-study; do not come to the classroom) | Tutorial 0: GitHub, LaTeX, Colab, and ChatGPT API | OpenAI's blog; LaTeX and Overleaf; Colab; GitHub | | | | | Benyou Wang |
Sep. 6th | Lecture 1: Introduction to Large Language Models (LLMs) | On the Opportunities and Risks of Foundation Models; Sparks of Artificial General Intelligence: Early experiments with GPT-4 | What is ChatGPT and how to use it? | [slide] | | | Junying Chen |
Sep. 13th | Lecture 2: Language models and beyond | A Neural Probabilistic Language Model; BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding; Training language models to follow instructions with human feedback | What is a language model and why is it important? | [slide] | | | Ke Ji |
Sep. 13th | Tutorial 1: Prompt Engineering | OpenAI's blog | The Guide to LLM Prompt Engineering | [slide] | [Tutorial Code] [Assignment1] | Assignment 1 release | Junying Chen |
Sep. 20th | Lecture 3: Architecture engineering and scaling law: Transformer and beyond | Attention Is All You Need; HuggingFace's course on Transformers; Scaling Laws for Neural Language Models; The Transformer Family Version 2.0; On Position Embeddings in BERT | Why has the Transformer become the backbone of LLMs? | [slide] | [nanoGPT] | | Junying Chen |
Sep. 27th | Lecture 4: Training LLMs from scratch | Training language models to follow instructions with human feedback; LLaMA: Open and Efficient Foundation Language Models; Llama 2: Open Foundation and Fine-Tuned Chat Models | How to train LLMs from scratch? | [slide] | [LLMZoo], [LLMFactory] | | Ke Ji |
Oct. 11th | Lecture 5: Efficiency in LLMs | Efficient Transformers: A Survey; FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness; GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers; Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity; Towards a Unified View of Parameter-Efficient Transfer Learning | How can LLM training and inference be made faster? | [slide] | [llama2.c] | | Junying Chen |
Oct. 11th | Tutorial 2: Train your own LLMs and Assignment 2 | | Are you ready to train your own LLMs? | [slide] | [Tutorial Code] [Assignment1] | Assignment 2 release | Ke Ji |
Oct. 18th | Lecture 6: Knowledge, Reasoning, and Prompt engineering | Natural Language Reasoning, A Survey and others; Best practices for prompt engineering with OpenAI API; prompt engineering | Can LLMs reason? How can we prompt LLMs better? | [slide] | | Assignment 1 due (Oct. 18, 11:59pm) | Ke Ji |
Oct. 25th | Lecture 7: Multimodal LLMs | CLIP, MiniGPT-4, Stable Diffusion and others | Can LLMs see? | [slide] | | | Junying Chen |
Nov. 1st | Lecture 8: LLM agent | ToolBench; AgentBench; Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks; LLM Powered Autonomous Agents | Can LLMs plan? | [slide] | | | Ke Ji |
Nov. 8th | Lecture 9: A Review to Spark Final Projects | N/A | N/A | [slide] | | Final Project release | Junying Chen |
Nov. 15th | Tutorial 3: Preparing your own project | | How to improve your LLM applications? | [slide] | [Final Project] | Assignment 2 due (Nov. 15th, 11:59pm) | Junying Chen and Ke Ji |
Nov. 22nd | Lecture 10: LLMs in vertical domains | Large Language Models Encode Clinical Knowledge, Capabilities of GPT-4 on Medical Challenge Problems, Performance of ChatGPT on USMLE, Medical-NLP, ChatLaw | Can LLMs be mature experts like doctors/lawyers? | [slide] | [HuatuoGPT] | | Junying Chen |
Nov. 29th | Guest lectures | Geometric Deep Learning & Efficiently Democratizing Medical LLMs | | [slide1] [slide2] | | | Yan Hu and Xidong Wang |
Dec. 6th | Lecture 11: Towards AGI via Test-Time Scaling | OpenAI-O1 | Exploring Test-Time Scaling | | | | Junying Chen and Ke Ji |
Dec. 13th | Q&A Session | | | | | Q&A session for final projects | Junying Chen and Ke Ji |
Dec. 20th | Poster Presentation | | How to solve real-world problems using LLMs | | | Final Project Presentation | Junying Chen and Ke Ji |
The final project consists of two parts: Project Presentation (15%) and Project Report (40%).
Here are some ways to earn the participation credit, which is capped at 5%.
The penalty is 0.5% off the final course grade for each late day.
We borrowed some concepts and the website template from [CSC3160/MDS6002], taught by Prof. Zhizheng Wu.
The website's GitHub repository is [here].