CS40008.01 NLP & LLMs

Spring 2026 @ Fudan University

View the Project on GitHub baojian/llm-26

Outline


Course Overview

Introduction

This course covers the foundations and modern frontiers of Natural Language Processing (NLP), with a heavy emphasis on Large Language Models (LLMs). You will learn the full pipeline for building effective LLMs, from basic tokenization through training and fine-tuning to deployment of current LLM architectures.

Basic Info


Assignments and Course Project

All standard homework assignments are completed by Week 12. The final month (Weeks 13–16) is dedicated exclusively to the Course Project. Please submit your homework at https://elearning.fudan.edu.cn/

Assignment 1. Foundations of Text

Course Project


Coursework

Resources and References


GPU Resources


Weekly Schedule

Week 1 Introduction to LLMs

In our first lecture, we introduce text preprocessing, including tokenization (BPE/WordPiece) and vocabulary design.

Release Assignment 1

Week 2 N-Gram Language Models

In this lecture, we introduce the basics of language modeling: maximum likelihood estimation (MLE), smoothing, and perplexity.
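As a rough illustration of how these three ideas fit together, the sketch below estimates a bigram model by counting, applies add-k smoothing, and scores a sentence by perplexity. It is not taken from the course materials: the function names, the `<s>`/`</s>` padding symbols, and the toy corpus are all assumptions made for this example.

```python
import math
from collections import Counter

def train_bigram(sentences):
    """Count contexts and bigrams over sentences padded with <s> and </s>."""
    context_counts, bigram_counts, vocab = Counter(), Counter(), set()
    for sent in sentences:
        tokens = ["<s>"] + sent.split() + ["</s>"]
        vocab.update(tokens)
        for prev, cur in zip(tokens, tokens[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, cur)] += 1
    return context_counts, bigram_counts, vocab

def bigram_prob(prev, cur, model, k=1.0):
    """Add-k smoothed P(cur | prev); k=0 gives the plain MLE count ratio."""
    context_counts, bigram_counts, vocab = model
    return (bigram_counts[(prev, cur)] + k) / (context_counts[prev] + k * len(vocab))

def perplexity(sentence, model, k=1.0):
    """exp of the average negative log-probability per predicted token."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    pairs = list(zip(tokens, tokens[1:]))
    log_prob = sum(math.log(bigram_prob(p, c, model, k)) for p, c in pairs)
    return math.exp(-log_prob / len(pairs))

# Toy corpus (hypothetical data, for illustration only).
model = train_bigram(["the cat sat", "the dog sat"])
mle_ppl = perplexity("the cat sat", model, k=0.0)       # unsmoothed MLE
smoothed_ppl = perplexity("the cat sat", model, k=1.0)  # add-one smoothing
unseen_ppl = perplexity("the cat ran", model, k=1.0)    # contains an unseen bigram
```

With `k=0`, any unseen bigram has probability zero and the perplexity is undefined, which is the practical reason smoothing is needed. Smoothing moves probability mass to unseen events, so the smoothed perplexity on training sentences is higher than the MLE value.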

Week 3 Word Embeddings

In this lecture, we introduce text classification, Word2Vec, the distributional hypothesis, and intrinsic and extrinsic evaluation of embeddings.

Week 4 Neural LMs

In this lecture, we introduce neural networks and how to build neural network models for sequence learning problems. We discuss classic models such as the LSTM, how encoder-decoder models developed, and why attention is an effective addition to the encoder-decoder model.

Week 5 & 6 Attention Mechanisms and Transformer

In these two lectures, we introduce attention mechanisms and the Transformer architecture.

Week 7 LLM Pretraining (GPT)

In this lecture, we introduce typical pretrained LLMs such as the GPT series.

Week 8 Evaluations and Benchmarks

In this lecture, we introduce evaluation datasets and tasks for LLMs.

Project Proposal Due

Week 9 BERT and Post-training

In this lecture, we introduce BERT and the bidirectional Transformer encoder architecture trained via masked language modeling. Unlike causal LMs, BERT produces contextual embeddings that can be fine-tuned for downstream tasks such as classification and named entity recognition.
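To make the masked language modeling objective concrete, here is a minimal sketch of how training inputs can be corrupted. It assumes the recipe described in the original BERT paper (select about 15% of positions; of those, replace 80% with `[MASK]`, 10% with a random token, and leave 10% unchanged). The function name `mask_tokens`, the toy vocabulary, and the use of `None` for unselected labels are choices made for this example, not part of any library or of the course materials.

```python
import random

MASK = "[MASK]"

def mask_tokens(tokens, vocab, mask_rate=0.15, seed=0):
    """Return (corrupted_inputs, labels) for masked language modeling.

    labels[i] holds the original token at selected positions and None
    elsewhere, so the loss is computed only where a prediction is required.
    """
    rng = random.Random(seed)
    inputs = list(tokens)
    labels = [None] * len(tokens)
    for i, tok in enumerate(tokens):
        if rng.random() >= mask_rate:
            continue                        # position not selected
        labels[i] = tok                     # model must recover this token
        r = rng.random()
        if r < 0.8:
            inputs[i] = MASK                # 80%: replace with [MASK]
        elif r < 0.9:
            inputs[i] = rng.choice(vocab)   # 10%: replace with a random token
        # remaining 10%: keep the original token in the input
    return inputs, labels

# Hypothetical toy data for illustration.
vocab = ["the", "cat", "sat", "on", "mat", "dog"]
tokens = "the cat sat on the mat".split()
inputs, labels = mask_tokens(tokens, vocab, mask_rate=0.5, seed=1)
```

Because the model sees the tokens on both sides of a masked position, it can use left and right context together, which is what distinguishes this objective from causal next-token prediction.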

Week 10 Post-training (SFT, RM, and PPO)

In this lecture, we introduce post-training techniques including supervised fine-tuning (SFT), reward modeling (RM), and reinforcement learning from human feedback via PPO. We also cover alignment, instruction tuning, and test-time compute scaling.

Week 11 Information Retrieval and Retrieval-Augmented Generation

In this lecture, we introduce dense retrieval and Retrieval-Augmented Generation (RAG). RAG augments a language model with a non-parametric memory — a retrievable document index — so that answers can be grounded in up-to-date, verifiable sources without retraining the model.
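The retrieve-then-generate flow can be sketched in a few lines. This example is only an illustration under stated assumptions: it uses bag-of-words count vectors with cosine similarity as a simple stand-in for the learned dense embeddings used in real dense retrieval, it stops at building the prompt rather than calling a language model, and the names `embed`, `retrieve`, and `build_prompt` as well as the sample documents are hypothetical.

```python
import math
from collections import Counter

def embed(text):
    """Stand-in embedding: lowercase bag-of-words counts (not a dense vector)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query, docs, k=2):
    """Return the k documents most similar to the query."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_prompt(query, docs, k=2):
    """Place retrieved passages in the prompt so the answer can be grounded."""
    context = "\n".join(f"- {d}" for d in retrieve(query, docs, k))
    return f"Answer using only the context below.\n{context}\nQuestion: {query}"

# Hypothetical document index for illustration.
docs = [
    "BERT is trained with masked language modeling",
    "PPO is a reinforcement learning algorithm",
    "Perplexity measures language model quality",
]
prompt = build_prompt("what is PPO", docs, k=1)
```

Updating the system's knowledge then amounts to editing `docs` and re-indexing, with no change to model weights, which is the sense in which the memory is non-parametric.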

Week 12 Course Project Presentation

Each group presents preliminary results in class on 05/21/2026.

  • Format: up to 8 minutes per group — 7 min presentation + 1 min Q&A
  • Content: related work investigation, your approach, and preliminary results
  • Full project details

Final Report — Due 06/25/2026, 23:59 (Week 17)

  • Written in English or Chinese
  • Maximum 7 pages of main content (excluding references)
  • Format: ACL template
  • Submit via elearning

Grading breakdown

  • Proposal (5%): clarity, feasibility, relevance, innovation
  • Presentation (20%): clarity, related work coverage
  • Programming & algorithm (25%): reasonableness and soundness
  • Performance (20%): results and analysis
  • Report (30%): organization, analysis, discussion

Week 13 Diffusion Language Models

Week 14 Alignment & Safety

RLHF (PPO/DPO), Safety barriers, Red-teaming

Week 15 Efficiency & Systems

KV Caching, Quantization (INT8/FP4), Latency/Throughput

Week 16 Agents and Frontiers

Multimodal LLMs, Diffusion LMs, Future Directions