Seminar: Deep Learning for Tabular Data

Tabular data has long been overlooked by deep learning research, despite being the most common data type in real-world machine learning applications. While deep learning methods excel in many ML applications, tabular classification problems are still dominated by gradient-boosted decision trees (GBDTs). More recently, deep learning-based approaches have been proposed that show notable improvements in efficiency and performance. In this seminar, we will discuss this recent literature, exploring the most promising techniques and approaches for handling tabular data with deep learning.

Requirements

We require that you have taken lectures on

Machine Learning
Deep Learning

We strongly recommend that you have taken lectures on

Automated Machine Learning

Organization

Every week, all students read the relevant literature. Two students prepare presentations on the topic of the week and present them in the session. After each presentation, there will be time for questions and discussion, and all participants are expected to take part. Each student has to write a 4-page paper (in the AutoML paper format) about their assigned topic, to be handed in one week after their presentation.

Course type	Seminar
Time	Every Tuesday, 14:15-16:00
Location	in-person; Room SR 00-006, Building 051
Organizers	Herilalaina Rakotoarison, Arbër Zela, Fabio Ferreira, Frank Hutter
Registration	Via HISinOne
Contact	dl_tabular_2023@googlegroups.com

Schedule

Date (14:00-16:00)	Topic	Presenter
18.04.2023	Introduction of the topic and the available literature; how to give a good presentation
25.04.2023	No meeting
02.05.2023	Revisiting Deep Learning Models for Tabular Data; Why do tree-based models still outperform deep learning on tabular data?	4532136, 4524354
09.05.2023	Auto-PyTorch Tabular: Multi-Fidelity MetaLearning for Efficient and Robust AutoDL; Well-tuned Simple Nets Excel on Tabular Datasets	5169784, 5450264
16.05.2023	No meeting
23.05.2023	TabNet: Attentive Interpretable Tabular Learning; Self-Attention Between Datapoints: Going Beyond Individual Input-Output Pairs in Deep Learning	5366812, 4735831
30.05.2023	Pfingstpause
06.06.2023	No meeting
13.06.2023	No meeting; Neural Oblivious Decision Ensembles for Deep Learning on Tabular Data; AutoGluon-Tabular: Robust and Accurate AutoML for Structured Data
20.06.2023	SAINT: Improved Neural Networks for Tabular Data via Row Attention and Contrastive Pre-Training; ~~On Embeddings for Numerical Features in Tabular Deep Learning~~	5306261
27.06.2023	Transformers Can Do Bayesian Inference; TabPFN: A Transformer That Solves Small Tabular Classification Problems in a Second	5362149, 5252279
04.07.2023	Transfer Learning with Deep Tabular Models; VIME: Extending the Success of Self- and Semi-supervised Learning to Tabular Domain	5577388, 5367882
11.07.2023	LLMs for Semi-Automated Data Science; ~~ExcelFormer: A Neural Network Surpassing GBDTs on Tabular Data~~	5578916
18.07.2023	TaBERT: Pretraining for Joint Understanding of Textual and Tabular Data; TabLLM: Few-shot Classification of Tabular Data with Large Language Models; ExcelFormer: A Neural Network Surpassing GBDTs on Tabular Data	5585864, 5419083, 5363937

Literature

A list of relevant papers can be found here.

Machine Learning Lab
