Seminar: Pruning and Efficiency in LLMs

Prerequisites

We require that you have taken lectures on or are familiar with the following:

  • Machine Learning
  • Deep Learning
  • Automated Machine Learning

Organization

After the kick-off meeting, everyone is assigned one or two papers, depending on the content. Each student then studies the assigned paper(s) and prepares two presentations.

  • The first presentation will focus on establishing the background and motivation for the work, and on giving a concise overview of the approach proposed in the paper.
  • The second presentation will focus on the details of the approach, the results and takeaways from the paper, and an “add-on”, described below.

Students will contribute an "add-on" related to the paper for the final report. This includes, but is not limited to, a thorough literature review, reproducing some of the experiments, profiling the inference latency of the LLMs, implementing a part of the paper, or providing a Colab demo that applies the method in the paper to a different LLM. Students can (e-)meet with Rhea Sukthanker for feedback and any questions (e.g., to discuss a potential "add-on").


Grading

  • Presentations: 50% (two times 25 min + 15 min Q&A)
  • Report: 30% (4 pages in AutoML Conf format, due one week after last end term)
  • Add-on: 20%

List of Potential Papers