Sessions 21600 and 21601 cover the same material, but the lecture slides will differ. Please attend the session you are enrolled in.
Welcome! This is a graduate-level course on Natural Language Processing (NLP). This class introduces both foundational and modern language technologies, as well as their applications. We will begin with early statistical models, such as Naive Bayes and logistic regression, and progress to modern large language models. Along the way, we will explore a variety of core NLP tasks, such as summarization, classification, and machine translation, as well as some basic linguistic concepts.
The course will also cover different evaluation methods and commonly used metrics, as well as NLP applications in the social sciences and humanities. Beyond these, we will explore some emerging research areas, such as multimodal NLP (language and vision) and the privacy and ethical challenges of deploying language technologies.
The goal of this class is twofold: first, to provide students with the knowledge and skills to understand and deploy language technologies; and second, to encourage a deeper appreciation of language itself and the societal implications of language technologies.
Complete independently: 5 programming assignments (30%), 6 quizzes (30%)
Complete in groups of 1-4:
Course project: initial pitch, research plan, sample data, grade contract, presentation, final report (40%)
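As a rough illustration of how the three weights above combine, here is a minimal sketch. It assumes every score is on a 0 to 100 scale and that items within a component count equally; neither assumption is stated in the syllabus, and the function name `course_grade` is hypothetical.

```python
# Component weights in percent, taken from the grading breakdown above.
WEIGHTS = {"assignments": 30, "quizzes": 30, "project": 40}


def course_grade(assignments, quizzes, project):
    """Weighted course grade on a 0-100 scale.

    Assumption (not from the syllabus): items within a component are
    averaged with equal weight.
    """
    assignment_avg = sum(assignments) / len(assignments)
    quiz_avg = sum(quizzes) / len(quizzes)
    total = (
        WEIGHTS["assignments"] * assignment_avg
        + WEIGHTS["quizzes"] * quiz_avg
        + WEIGHTS["project"] * project
    )
    return total / 100


if __name__ == "__main__":
    # Five assignments, six quizzes, one project score (made-up numbers).
    print(course_grade([100, 80, 90, 70, 60], [90] * 6, 85))
```

The instructors' actual computation may differ, for example in how individual quizzes or project milestones are weighted.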
Daniel Jurafsky and James H. Martin. 2025. Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition with Language Models, 3rd edition. Online manuscript released August 24, 2025. (Draft chapters available online)
Session 21600: David Smith, email: dasmith[at]ccs.neu.edu
Session 21601: Si Wu, email: siwu[at]ccs.neu.edu
Divya Sri Bandaru, email: bandaru.di[at]northeastern[dot]edu
Tejus Dinesh, email: dinesh.te[at]northeastern[dot]edu
Zankhana Pratik Mehta, email: mehta.zan[at]northeastern[dot]edu
David Smith | Monday 11am–12pm, Thursday 4–5pm | Zoom link |
Si Wu | Wednesday 10am–12pm | Zoom link |
Divya Sri Bandaru | Monday 2pm–4pm | Zoom link |
Tejus Dinesh | Thursday 6pm–8pm | Zoom link |
Zankhana Pratik Mehta | Wednesday 3pm–5pm | Zoom link |
Assignments are due at the announced due date and time, usually 11:59 p.m. You will be granted one homework extension of four calendar days, to be used at your discretion, without having to ask. This single extension is meant to smooth over unforeseen crunches in your schedule, and you cannot simply distribute the four late days among four assignments. After the first late assignment, unexcused late assignments will be penalized 10% per calendar day late. We normally will not accept assignments after the date on which the following assignment is due or after the solutions have been handed out, whichever comes first. If you know in advance of circumstances that would cause you to turn in an assignment late, please contact the instructor before the assignment is due to ask if an extension is possible.
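The late-penalty arithmetic can be sketched as follows. This is only an illustration under stated assumptions, not the official grading procedure: it assumes any started day counts as a full calendar day, that the 10% penalty is applied to the earned score, and that days beyond the four-day extension are penalized at the same rate. The function names are hypothetical.

```python
import math
from datetime import datetime

EXTENSION_DAYS = 4          # the single free extension described above
PENALTY_PERCENT_PER_DAY = 10  # unexcused lateness: 10% per calendar day


def days_late(due: datetime, submitted: datetime) -> int:
    """Calendar days late; any started day counts as a full day (assumption)."""
    seconds = (submitted - due).total_seconds()
    return max(0, math.ceil(seconds / 86400))


def penalized_score(raw_score: float, late_days: int, use_extension: bool) -> float:
    """Score after the late penalty.

    Assumption: if the free extension is used, only the days beyond it
    are penalized. The score never drops below zero.
    """
    if use_extension:
        late_days = max(0, late_days - EXTENSION_DAYS)
    percent_kept = max(0, 100 - PENALTY_PERCENT_PER_DAY * late_days)
    return raw_score * percent_kept / 100


if __name__ == "__main__":
    due = datetime(2025, 9, 19, 23, 59)
    submitted = datetime(2025, 9, 21, 9, 0)
    late = days_late(due, submitted)
    print(late, penalized_score(90, late, use_extension=False))
```

Note that the sketch does not model the cutoff after which assignments are normally no longer accepted.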
Please refer to the Academic Integrity Policy here.
We are committed to accommodating students with disabilities. Please contact Disability Access Services and follow the outlined procedure.
- Added policy on quizzes and suggested readings!
- Added late policy!
- Gradescope and Ed Discussion are up!
- The class website is up!
Week, date | Topics | Lecture Slides | Suggested Readings | Assignments | Others |
---|---|---|---|---|---|
1, Fri, 9/5 | Introduction and Course Logistics: Language Models in Brains and Machines | Session 21601 Slides, Session 21600 Slides | | | |
2, Tues, 9/9 | Words, Regular Expressions, and N-gram (Markov) Models | Session 21601 Slides, Session 21600 Slides | | Assignment 1 released: Empirical Regularities of Language: Evaluating Predictions and Counting Words | |
2, Fri, 9/12 | Text Classification: Naive Bayes, Logistic Regression, and Friends | Session 21601 Slides, Session 21600 Slides | JM 4 | | |
3, Tues, 9/16 | Word Embeddings | Session 21601 Slides, Session 21600 Slides | JM 5 | | |
3, Fri, 9/19 | Introduction to Neural Networks | Session 21600 Slides, Session 21601 Slides | JM 6; Goodfellow Chapter 6; MIT 6.390 Intro to ML notes | Assignment 1 due (11:59pm); Assignment 2 released: Predictive and Interpretive Text Classification | |
4, Tues, 9/23 | Beyond Words: Morphology, Syntax, and Semantics | Session 21600 Slides, Session 21601 Slides | JM 17; JM 19 | Instructions for your course project pitch | |
4, Fri, 9/26 | Sequence Data, Recurrent Networks, and Attention | Session 21601 Slides, Session 21600 Slides | JM 13 | | |
5, Tues, 9/30 | Transformers | Session 21601 Slides, Session 21600 Slides | JM 8 | Assignment 2 due (11:59pm) | |
5, Fri, 10/3 | A Taxonomy of Large Language Models: Data, Weights, Training, and Inference | Session 21601 Slides, Session 21600 Slides | | Project pitch due (11:59pm) | |
6, Tues, 10/7 | Guest Lecture: Alexander Spangher (postdoc @ Stanford) | | | Assignment 3 released: Probing Transformers | |
6, Fri, 10/10 | Pretraining | Session 21601 Slides, Session 21600 Slides | | | |
7, Tues, 10/14 | Generation Algorithms | | | | |
7, Fri, 10/17 | Post-Training: RLHF, DPO, and Friends | | | | |
8, Tues, 10/21 | Prompting and In-context Learning | | | Assignment 3 due (11:59pm) | |
8, Fri, 10/24 | Evaluation, Benchmarks, and Experimental Design | | | | |
9, Tues, 10/28 | Retrieval, Retrieval-Augmented Generation, and Summarization | | | | |
9, Fri, 10/31 | Multilinguality | | | | |
10, Tues, 11/4 | Guest Lecture: Terra Blevins (Northeastern): Multilingual Encoding in LLMs | | | | |
10, Fri, 11/7 | Beyond Individuals: Language in Social Context | | | | |
11, Tues, 11/11 | No Class, Veterans Day | | | | |
11, Fri, 11/14 | Guest Lecture: Niloofar Mireshghallah (META AI and CMU): Analyzing the Security and Privacy of LLMs | | | | |
12, Tues, 11/18 | Language and Visual Context | | | | |
12, Fri, 11/21 | Guest Lecture: Lucy Li (postdoc @ UW, and incoming assistant prof @ University of Wisconsin-Madison) | | | | |
13, Tues, 11/25 | Final Lecture: [Student Suggested Topics] | | | | |
13, Fri, 11/28 | No Class, Happy Thanksgiving! 🦃 | | | | |
14, Tues, 12/2 | Project Presentations | | | | |
14, Fri, 12/5 | Project Presentations | | | | |
15, Thurs, 12/11 | | | | Final report due (11:59pm) | |