Tabular data has long been overlooked by deep learning research, despite being the most common data type in real-world machine learning applications. While deep learning methods excel in many ML applications, tabular classification problems are still dominated by gradient-boosted decision trees (GBDTs). More recently, deep learning-based approaches have been proposed that show remarkable improvements in efficiency and performance. In this seminar, we will discuss this recent literature and explore the most promising techniques for handling tabular data with deep learning.
We require that you have taken courses on
- Machine Learning
- Deep Learning
We strongly recommend that you have also taken a course on
- Automated Machine Learning
Every week, all students read the relevant literature. Two students prepare presentations on the topic of the week and present them in the session. Each presentation is followed by a question and discussion round, in which all participants are expected to take part. Each student must also write a 4-page paper (in the AutoML paper format) on their assigned topic, due one week after their presentation.
|Time||Every Tuesday from 14:15 - 16:00|
|Location||in-person; Room SR 00-006, Building 051|
|Organizers||Herilalaina Rakotoarison, Arbër Zela, Fabio Ferreira, Frank Hutter|
|18.04.2023||Introduction of the topic and the available literature; how to give a good presentation|
|02.05.2023||Revisiting Deep Learning Models for Tabular Data|
|||Why do tree-based models still outperform deep learning on tabular data?|
|09.05.2023||Auto-PyTorch Tabular: Multi-Fidelity MetaLearning for Efficient and Robust AutoDL|
|||Well-tuned Simple Nets Excel on Tabular Datasets|
|23.05.2023||TabNet: Attentive Interpretable Tabular Learning|
|||Self-Attention Between Datapoints: Going Beyond Individual Input-Output Pairs in Deep Learning|
|13.06.2023||Neural Oblivious Decision Ensembles for Deep Learning on Tabular Data|
|||AutoGluon-Tabular: Robust and Accurate AutoML for Structured Data|
|20.06.2023||SAINT: Improved Neural Networks for Tabular Data via Row Attention and Contrastive Pre-Training|
|||On Embeddings for Numerical Features in Tabular Deep Learning|
|27.06.2023||Transformers Can Do Bayesian Inference|
|||TabPFN: A Transformer That Solves Small Tabular Classification Problems in a Second|
|04.07.2023||Transfer Learning with Deep Tabular Models|
|||VIME: Extending the Success of Self- and Semi-supervised Learning to Tabular Domain|
|11.07.2023||LLMs for tabular data feature engineering|
|||ExcelFormer: A Neural Network Surpassing GBDTs on Tabular Data|
|18.07.2023||TaBERT: Pretraining for Joint Understanding of Textual and Tabular Data|
|||TabLLM: Few-shot Classification of Tabular Data with Large Language Models|
A list of relevant papers can be found here.