
Seminar: Deep Learning for Tabular Data

Tabular data has long been overlooked by deep learning research, despite being the most common data type in real-world machine learning applications. While deep learning methods excel in many ML applications, classification problems on tabular data are still dominated by gradient-boosted decision trees (GBDTs). More recently, deep learning-based approaches have been proposed that show remarkable improvements in efficiency and performance. In this seminar, we will discuss this recent literature, exploring the most promising techniques and approaches for handling tabular data in deep learning.

Requirements

We require that you have taken lectures on

  • Machine Learning
  • Deep Learning

We strongly recommend that you have also taken lectures on

  • Automated Machine Learning

Organization

Every week, all students read the relevant literature. Two students prepare presentations on the topic of the week and present them in the session. Each presentation is followed by a question and discussion round, in which all participants are expected to take part. Each student has to write a 4-page paper (in the AutoML paper format) about their assigned topic, to be handed in one week after their presentation.

Course type: Seminar
Time: Every Tuesday from 14:15 - 16:00
Location: in-person; Room SR 00-006, Building 051
Organizers: Herilalaina Rakotoarison, Arbër Zela, Fabio Ferreira, Frank Hutter
Registration: Via HISinOne
Contact: dl_tabular_2023@googlegroups.com

Schedule

| Date (14:00-16:00) | Topic | Presenter |
|---|---|---|
| 18.04.2023 | Introduction of the topic and the available literature; how to give a good presentation | |
| 25.04.2023 | No meeting | |
| 02.05.2023 | Revisiting Deep Learning Models for Tabular Data / Why do tree-based models still outperform deep learning on tabular data? | 4532136, 4524354 |
| 09.05.2023 | Auto-PyTorch Tabular: Multi-Fidelity MetaLearning for Efficient and Robust AutoDL / Well-tuned Simple Nets Excel on Tabular Datasets | 5169784, 5450264 |
| 16.05.2023 | No meeting | |
| 23.05.2023 | TabNet: Attentive Interpretable Tabular Learning / Self-Attention Between Datapoints: Going Beyond Individual Input-Output Pairs in Deep Learning | 5366812, 4735831 |
| 30.05.2023 | Pentecost break (Pfingstpause) | |
| 06.06.2023 | No meeting | |
| 13.06.2023 | No meeting / Neural Oblivious Decision Ensembles for Deep Learning on Tabular Data / AutoGluon-Tabular: Robust and Accurate AutoML for Structured Data | |
| 20.06.2023 | SAINT: Improved Neural Networks for Tabular Data via Row Attention and Contrastive Pre-Training / On Embeddings for Numerical Features in Tabular Deep Learning | 5306261 |
| 27.06.2023 | Transformers Can Do Bayesian Inference / TabPFN: A Transformer That Solves Small Tabular Classification Problems in a Second | 5362149, 5252279 |
| 04.07.2023 | Transfer Learning with Deep Tabular Models / VIME: Extending the Success of Self- and Semi-supervised Learning to Tabular Domain | 5577388, 5367882 |
| 11.07.2023 | LLMs for Semi-Automated Data Science / ExcelFormer: A Neural Network Surpassing GBDTs on Tabular Data | 5578916 |
| 18.07.2023 | TaBERT: Pretraining for Joint Understanding of Textual and Tabular Data / TabLLM: Few-shot Classification of Tabular Data with Large Language Models / ExcelFormer: A Neural Network Surpassing GBDTs on Tabular Data | 5585864, 5419083, 5363937 |

Literature

A list of relevant papers can be found here.