CS 6120: NLP Course Projects

Northeastern University, Fall 2025

Project Stages

As discussed in the first class, you will submit and revise your work on the course project in several stages:

  1. initial pitch
  2. research plan
  3. sample data
  4. grade contract
  5. oral presentation
  6. final report

On this page, we'll post instructions for each stage incrementally as the course progresses.

Initial Pitch (Due Friday 3 October 11:59pm)

One semester is not nearly enough to cover all the methods, tasks, and data in the field of natural language processing. The course project is a chance to focus on one or a small number of tasks, models, or approaches that you find interesting. You may work in teams of one to four students. Working with others in your own section will likely be easier, but you may collaborate across sections 21600 and 21601.

The first step in your project is to submit an initial pitch, with two key pieces of information:

  1. a list of the members of your project team; and
  2. one paragraph of five to ten sentences, describing the NLP approach you will implement and the task and data you will evaluate on.

Please submit this information as a PDF on Gradescope. We will then give feedback on the scope of your pitch, commenting on whether it seems too narrow or too broad for a team of your size, and adding further suggestions for refinement.

Project Goals and Topics

To give you some context for the initial pitch, here is some information on the goals of the project and possible topics. There are three key components of any project in this course:

  1. annotating natural language data,
  2. performing natural language inference, and
  3. evaluating performance.

Annotating natural language data: Although there are many datasets available, students in this course should have experience looking at natural language data and annotating it to help train and evaluate models. In most cases, you will supplement data you annotate with other datasets, but annotating data yourself helps you think through the problem and evaluation. If you are defining a new task, you might spend more time on creating new evaluation data. If you are evaluating a new approach to an old task, however, you should still create some evaluation data to help develop your intuition about the task.

Performing natural language inference: Choose a task or define a new one. Choose models appropriate for your task.

Evaluating performance: Choose an evaluation protocol and evaluation metric(s) suitable for your task. Analyze the results of an NLP system to help find strengths and weaknesses, or kinds of data where it performs better or worse.
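As one illustration of analyzing where a system performs better or worse, here is a minimal Python sketch that reports accuracy overall and broken down by a category of data. The function name `accuracy_by_group`, the sentiment labels, and the "short"/"long" grouping are all hypothetical choices made for this example; accuracy is only one possible metric, and your task may call for a different one.

```python
from collections import defaultdict


def accuracy_by_group(examples):
    """Return (overall accuracy, {group: accuracy}).

    Each example is a (group, gold_label, predicted_label) tuple.
    The grouping is whatever slice of the data you want to compare,
    e.g. text length, genre, or source (an assumption of this sketch).
    """
    if not examples:
        raise ValueError("need at least one example")
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, gold, pred in examples:
        total[group] += 1
        if gold == pred:
            correct[group] += 1
    per_group = {g: correct[g] / total[g] for g in total}
    overall = sum(correct.values()) / sum(total.values())
    return overall, per_group


# Hypothetical sentiment predictions, grouped by text length.
examples = [
    ("short", "pos", "pos"),
    ("short", "neg", "neg"),
    ("short", "pos", "neg"),
    ("long", "neg", "pos"),
    ("long", "pos", "pos"),
    ("long", "neg", "pos"),
]

overall, per_group = accuracy_by_group(examples)
print(f"overall: {overall:.2f}")
for group, acc in sorted(per_group.items()):
    print(f"{group}: {acc:.2f}")
```

A gap between groups, like the one between short and long texts in this made-up data, is the kind of pattern worth following up by reading the individual errors.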

We will discuss these steps in more detail in future project stages.

Here are some suggestions for general directions your project might take. They are only suggestions, and you do not need to choose from this list.