Tabular data has long been overlooked by deep learning research, despite being the most common data type in real-world machine learning applications. While deep learning methods excel in many ML applications, tabular data classification problems are still dominated by Gradient-Boosted Decision Trees. More recently, deep learning-based approaches have been proposed that show remarkable improvements in efficiency and performance. In this seminar, we will discuss this recent literature and explore the most promising techniques for handling tabular data with deep learning.
Requirements
We require that you have taken lectures on
- Machine Learning
- Deep Learning
We strongly recommend that you have attended lectures on
- Automated Machine Learning
Organization
Every week, all students read the relevant literature. Two students prepare presentations on the topic of the week and present them in the session. After each presentation, there will be time for a question and discussion round, and all participants are expected to take part. Each student has to write a 4-page paper (in the AutoML paper format) about their assigned topic, due one week after their presentation.
Course type | Seminar |
Time | Every Tuesday from 14:15 - 16:00 |
Location | in-person; Room SR 00-006, Building 051 |
Organizers | Herilalaina Rakotoarison, Arbër Zela, Fabio Ferreira, Frank Hutter |
Registration | Via HISinOne |
Contact | dl_tabular_2023@googlegroups.com |
Schedule
Date (14:00-16:00) | Topic | Presenter | Literature |
18.04.2023 | Introduction of the topic and the available literature; how to give a good presentation | ||
25.04.2023 | No meeting | ||
02.05.2023 | Revisiting Deep Learning Models for Tabular Data; Why do tree-based models still outperform deep learning on tabular data? | 4532136, 4524354 | |
09.05.2023 | Auto-PyTorch Tabular: Multi-Fidelity MetaLearning for Efficient and Robust AutoDL; Well-tuned Simple Nets Excel on Tabular Datasets | 5169784, 5450264 | |
16.05.2023 | No meeting | ||
23.05.2023 | TabNet: Attentive Interpretable Tabular Learning; Self-Attention Between Datapoints: Going Beyond Individual Input-Output Pairs in Deep Learning | 5366812, 4735831 | |
30.05.2023 | Pentecost break (Pfingstpause) | ||
06.06.2023 | No meeting | ||
13.06.2023 | Neural Oblivious Decision Ensembles for Deep Learning on Tabular Data; AutoGluon-Tabular: Robust and Accurate AutoML for Structured Data | ||
20.06.2023 | SAINT: Improved Neural Networks for Tabular Data via Row Attention and Contrastive Pre-Training; On Embeddings for Numerical Features in Tabular Deep Learning | 5306261, 5168281 | |
27.06.2023 | Transformers Can Do Bayesian Inference; TabPFN: A Transformer That Solves Small Tabular Classification Problems in a Second | 5362149, 5252279 | |
04.07.2023 | Transfer Learning with Deep Tabular Models; VIME: Extending the Success of Self- and Semi-supervised Learning to Tabular Domain | 5577388, 5367882 | |
11.07.2023 | LLMs for tabular data feature engineering; ExcelFormer: A Neural Network Surpassing GBDTs on Tabular Data | 5578916, 5363937 | |
18.07.2023 | TaBERT: Pretraining for Joint Understanding of Textual and Tabular Data; TabLLM: Few-shot Classification of Tabular Data with Large Language Models | 5585864, 5419083 |
Literature
A list of relevant papers can be found here.