ERC Grant

Data-Driven Methods for Modelling and Optimizing the Empirical Performance of Deep Neural Networks (BeyondBlackBox)

Deep neural networks (DNNs) have led to dramatic improvements in the state of the art for many important classification problems, such as object recognition from images or speech recognition from audio data. However, DNNs are also notoriously dependent on the tuning of their hyperparameters. Since manual tuning is time-consuming and requires expert knowledge, recent years have seen the rise of Bayesian optimization methods for automating this task. While these methods have had substantial successes, treating DNN performance as a black box poses fundamental limitations and leaves manual tuning more effective on large, computationally expensive datasets: humans can (1) exploit prior knowledge and extrapolate performance from data subsets, (2) monitor the DNN’s internal weight optimization by stochastic gradient descent over time, and (3) reactively change hyperparameters at runtime. We therefore propose to model DNN performance beyond the black-box level and to use these models to develop, for the first time:

  1. Next-generation Bayesian optimization methods that exploit data-driven priors to optimize performance orders of magnitude faster than currently possible;
  2. Graybox Bayesian optimization methods that have access to – and exploit – performance and state information of algorithm runs over time; and
  3. Hyperparameter control strategies that learn across different datasets to adapt hyperparameters reactively to the characteristics of any given situation.
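The graybox idea in (2) above — exploiting performance information from runs over time rather than only their final result — underlies multi-fidelity methods such as the BOHB work cited below. As a minimal illustration only, here is a successive-halving sketch in pure Python; the `toy_train_eval` objective and its loss surface are invented for the example and are not part of the project:

```python
def successive_halving(configs, train_eval, min_budget=1, eta=3, rounds=3):
    """Evaluate many configs cheaply, keep the best 1/eta fraction,
    and re-evaluate survivors with eta times more budget each round."""
    budget = min_budget
    survivors = list(configs)
    for _ in range(rounds):
        # Score every surviving config at the current (partial) budget
        scores = {c: train_eval(c, budget) for c in survivors}
        # Keep the top fraction (lower loss is better)
        k = max(1, len(survivors) // eta)
        survivors = sorted(survivors, key=scores.get)[:k]
        budget *= eta
    return survivors[0]

# Hypothetical objective: loss of a learning rate `lr` that improves with budget
def toy_train_eval(lr, budget):
    return (lr - 0.1) ** 2 + 1.0 / budget

best = successive_halving([0.001, 0.01, 0.05, 0.1, 0.5], toy_train_eval)
print(best)  # -> 0.1
```

The key design point is that poor configurations are discarded after cheap partial evaluations, so the full training budget is spent only on promising candidates.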

DNNs enter our project in two ways. First, in all our methods we will use (Bayesian) DNNs to model and exploit the large amounts of performance data we will collect on various datasets. Second, our application goal is to optimize and control DNN hyperparameters far better than human experts can, and thereby obtain:

  1. Computationally inexpensive auto-tuned deep neural networks, even for large datasets, enabling the widespread use of deep learning by non-experts.
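The extrapolation of performance from partial information mentioned above can be sketched with a simple learning-curve model: fit the early part of a training curve and predict where it will end up. The power-law form and the synthetic curve below are assumptions made for this illustration, not the project's actual (Bayesian) DNN models:

```python
import numpy as np

def extrapolate_loss(epochs, losses, target_epoch):
    """Fit loss(t) ~ a * t**(-b) to observed epochs via a
    log-log linear fit, then extrapolate to a later epoch."""
    slope, intercept = np.polyfit(np.log(epochs), np.log(losses), 1)
    a, b = np.exp(intercept), -slope
    return a * target_epoch ** (-b)

# Synthetic learning curve that exactly follows 2.0 * t**-0.5
t = np.arange(1, 6)
loss = 2.0 * t ** -0.5
pred = extrapolate_loss(t, loss, target_epoch=100)
print(round(pred, 3))  # -> 0.2
```

Real learning curves are noisy and need richer models with uncertainty estimates (see the Bayesian-neural-network learning-curve papers below), but the principle — stop or deprioritize runs whose extrapolated final performance looks poor — is the same.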

Publications for this project

Lindauer, Marius; Eggensperger, Katharina; Feurer, Matthias; Biedenkapp, André; Deng, Difan; Benjamins, Carolin; Ruhkopf, Tim; Sass, René; Hutter, Frank

SMAC3: A Versatile Bayesian Optimization Package for Hyperparameter Optimization Journal Article

In: Journal of Machine Learning Research (JMLR) -- MLOSS, vol. 23, no. 54, pp. 1-9, 2022.

Bischl, Bernd; Casalicchio, Giuseppe; Feurer, Matthias; Gijsbers, Pieter; Hutter, Frank; Lang, Michel; Mantovani, Rafael G; van Rijn, Jan N; Vanschoren, Joaquin

OpenML Benchmarking Suites Inproceedings

In: Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks, 2021.

Eggensperger, Katharina; Müller, Philipp; Mallik, Neeratyoy; Feurer, Matthias; Sass, René; Klein, Aaron; Awad, Noor; Lindauer, Marius; Hutter, Frank

HPOBench: A Collection of Reproducible Multi-Fidelity Benchmark Problems for HPO Inproceedings

In: Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks, 2021.

Feurer, Matthias; Eggensperger, Katharina; Falkner, Stefan; Lindauer, Marius; Hutter, Frank

Auto-Sklearn 2.0: Hands-free AutoML via Meta-Learning Journal Article

In: arXiv:2007.04074v2, 2021.

Narayanan, Ashwin Raaghav; Zela, Arber; Saikia, Tonmoy; Brox, Thomas; Hutter, Frank

Multi-headed Neural Ensemble Search Inproceedings

In: Workshop on Uncertainty and Robustness in Deep Learning (UDL@ICML`21), 2021.

Yan, Shen; White, Colin; Savani, Yash; Hutter, Frank

NAS-Bench-x11 and the Power of Learning Curves Inproceedings

In: Proceedings of the CVPR 2021 Workshop on Neural Architecture Search (CVPR-NAS '21), 2021.

Elsken, Thomas; Staffler, Benedikt; Zela, Arber; Metzen, Jan Hendrik; Hutter, Frank

Bag of Tricks for Neural Architecture Search Inproceedings

In: Proceedings of the CVPR 2021 Workshop on Neural Architecture Search (CVPR-NAS '21), 2021.

Zaidi, Sheheryar; Zela, Arber; Elsken, Thomas; Holmes, Christopher C.; Hutter, Frank; Teh, Yee Whye

Neural Ensemble Search for Uncertainty Estimation and Dataset Shift Inproceedings

In: Thirty-Fifth Conference on Neural Information Processing Systems, 2021.

Yan, Shen; White, Colin; Savani, Yash; Hutter, Frank

NAS-Bench-x11 and the Power of Learning Curves Inproceedings

In: Thirty-Fifth Conference on Neural Information Processing Systems, 2021.

Kadra, Arlind; Lindauer, Marius; Hutter, Frank; Grabocka, Josif

Well-tuned Simple Nets Excel on Tabular Datasets Inproceedings

In: Thirty-Fifth Conference on Neural Information Processing Systems, 2021.

White, Colin; Zela, Arber; Ru, Binxin; Liu, Yang; Hutter, Frank

How Powerful are Performance Predictors in Neural Architecture Search? Inproceedings

In: Thirty-Fifth Conference on Neural Information Processing Systems, 2021.

Franke, Jörg K H; Köhler, Gregor; Biedenkapp, André; Hutter, Frank

Sample-Efficient Automated Deep Reinforcement Learning Inproceedings

In: International Conference on Learning Representations (ICLR) 2021, 2021.

Zhang, Baohe; Rajan, Raghu; Pineda, Luis; Lambert, Nathan; Biedenkapp, André; Chua, Kurtland; Hutter, Frank; Calandra, Roberto

On the Importance of Hyperparameter Optimization for Model-based Reinforcement Learning Inproceedings

In: Proceedings of the 24th International Conference on Artificial Intelligence and Statistics (AISTATS)'21, 2021.

Müller, Samuel; Hutter, Frank

TrivialAugment: Tuning-free Yet State-of-the-Art Data Augmentation Inproceedings

In: ICCV, 2021, (Oral Presentation (Top 3%)).

Souza, Artur; Nardi, Luigi; Oliveira, Leonardo; Olukotun, Kunle; Lindauer, Marius; Hutter, Frank

Bayesian Optimization with a Prior for the Optimum Inproceedings

In: Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD), 2021.

Zimmer, Lucas; Lindauer, Marius; Hutter, Frank

Auto-PyTorch: Multi-Fidelity MetaLearning for Efficient and Robust AutoDL Journal Article

In: IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 1-1, 2021.

Feurer, Matthias; van Rijn, Jan N; Kadra, Arlind; Gijsbers, Pieter; Mallik, Neeratyoy; Ravi, Sahithya; Müller, Andreas; Vanschoren, Joaquin; Hutter, Frank

OpenML-Python: an extensible Python API for OpenML Journal Article

In: Journal of Machine Learning Research, vol. 22, no. 100, pp. 1-5, 2021.

Stoll, Danny; Franke, Jörg K H; Wagner, Diane; Selg, Simon; Hutter, Frank

Hyperparameter Transfer Across Developer Adjustments Inproceedings

In: NeurIPS 4th Workshop on Meta-Learning, 2020.

Liu, Zhengying; Pavao, Adrien; Xu, Zhen; Escalera, Sergio; Ferreira, Fabio; Guyon, Isabelle; Hong, Sirui; Hutter, Frank; Ji, Rongrong; Junior, Julio C S Jacques; Li, Ge; Lindauer, Marius; Luo, Zhipeng; Madadi, Meysam; Nierhoff, Thomas; Niu, Kangning; Pan, Chunguang; Stoll, Danny; Treguer, Sebastien; Wang, Jin; Wang, Peng; Wu, Chenglin; Xiong, Youcheng; Zela, Arber; Zhang, Yang

Winning Solutions and Post-Challenge Analyses of the ChaLearn AutoDL Challenge 2019 Journal Article

In: IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 43, no. 9, pp. 3108-3125, 2020.

Eggensperger, Katharina; Haase, Kai; Müller, Philipp; Lindauer, Marius; Hutter, Frank

Neural Model-based Optimization with Right-Censored Observations Journal Article

In: arXiv:2009.13828 [cs.AI], 2020.

Elsken, Thomas; Staffler, Benedikt; Metzen, Jan Hendrik; Hutter, Frank

Meta-Learning of Neural Architectures for Few-Shot Learning Inproceedings

In: The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020, (Oral Presentation (Top 6%)).

Gargiani, Matilde; Zanelli, Andrea; Diehl, Moritz; Hutter, Frank

On the Promise of the Stochastic Generalized Gauss-Newton Method for Training DNNs Journal Article

In: arXiv:2006.02409 [cs.LG], 2020.

Zimmer, Lucas; Lindauer, Marius; Hutter, Frank

Auto-PyTorch Tabular: Multi-Fidelity MetaLearning for Efficient and Robust AutoDL Journal Article

In: arXiv:2006.13799 [cs.LG], 2020.

Zela, Arber; Siems, Julien; Hutter, Frank

NAS-Bench-1Shot1: Benchmarking and Dissecting One-shot Neural Architecture Search Inproceedings

In: International Conference on Learning Representations, 2020.

Zela, Arber; Elsken, Thomas; Saikia, Tonmoy; Marrakchi, Yassine; Brox, Thomas; Hutter, Frank

Understanding and Robustifying Differentiable Architecture Search Inproceedings

In: International Conference on Learning Representations, 2020, (Oral Presentation (Top 7%)).

Gargiani, Matilde; Zanelli, Andrea; Tran-Dinh, Quoc; Diehl, Moritz; Hutter, Frank

Transferring Optimally Across Data Distributions via Homotopy Methods Inproceedings

In: International Conference on Learning Representations, 2020.

Gargiani, M; Klein, A; Falkner, S; Hutter, F

Probabilistic Rollouts for Learning Curve Extrapolation Across Hyperparameter Settings Inproceedings

In: 6th ICML Workshop on Automated Machine Learning, 2019.

Elsken, Thomas; Metzen, Jan Hendrik; Hutter, Frank

Neural Architecture Search: A Survey Journal Article

In: Journal of Machine Learning Research, vol. 20, no. 55, pp. 1-21, 2019.

Franke, Jörg KH; Köhler, Gregor; Awad, Noor; Hutter, Frank

Neural Architecture Evolution in Deep Reinforcement Learning for Continuous Control Inproceedings

In: NeurIPS 2019 Workshop on Meta-Learning, 2019.

Elsken, Thomas; Metzen, Jan Hendrik; Hutter, Frank

Efficient Multi-Objective Neural Architecture Search via Lamarckian Evolution Inproceedings

In: International Conference on Learning Representations, 2019.

Hutter, Frank; Kotthoff, Lars; Vanschoren, Joaquin (Ed.)

Automated Machine Learning - Methods, Systems, Challenges Book

Springer, 2019.

Runge, Frederic; Stoll, Danny; Falkner, Stefan; Hutter, Frank

Learning to Design RNA Inproceedings

In: International Conference on Learning Representations, 2019.

Loshchilov, Ilya; Hutter, Frank

Decoupled Weight Decay Regularization Inproceedings

In: International Conference on Learning Representations, 2019.

Ying, Chris; Klein, Aaron; Real, Esteban; Christiansen, Eric; Murphy, Kevin; Hutter, Frank

NAS-Bench-101: Towards Reproducible Neural Architecture Search Inproceedings

In: Thirty-sixth International Conference on Machine Learning, 2019.

Feurer, Matthias; Eggensperger, Katharina; Falkner, Stefan; Lindauer, Marius; Hutter, Frank

Practical Automated Machine Learning for the AutoML Challenge 2018 Inproceedings

In: ICML 2018 AutoML Workshop, 2018.

Feurer, M; Hutter, F

Towards Further Automation in AutoML Inproceedings

In: ICML 2018 AutoML Workshop, 2018.

Zela, Arber; Klein, Aaron; Falkner, Stefan; Hutter, Frank

Towards Automated Deep Learning: Efficient Joint Neural Architecture and Hyperparameter Search Inproceedings

In: ICML 2018 AutoML Workshop, 2018.

Chrabąszcz, Patryk; Loshchilov, Ilya; Hutter, Frank

Back to Basics: Benchmarking Canonical Evolution Strategies for Playing Atari Inproceedings

In: Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, IJCAI-18, pp. 1419–1426, International Joint Conferences on Artificial Intelligence Organization, 2018.

Falkner, Stefan; Klein, Aaron; Hutter, Frank

BOHB: Robust and Efficient Hyperparameter Optimization at Scale Inproceedings

In: Proceedings of the 35th International Conference on Machine Learning (ICML 2018), pp. 1436–1445, 2018.

Ilg, Eddy; Cicek, Oezguen; Galesso, Silvio; Klein, Aaron; Makansi, Osama; Hutter, Frank; Brox, Thomas

Uncertainty Estimates for Optical Flow with Multi-Hypotheses Networks Inproceedings

In: Proceedings of ECCV 2018, 2018.

Wilson, James; Hutter, Frank; Deisenroth, Marc

Maximizing acquisition functions for Bayesian optimization Inproceedings

In: Bengio, S; Wallach, H; Larochelle, H; Grauman, K; Cesa-Bianchi, N; Garnett, R (Ed.): Advances in Neural Information Processing Systems 31, pp. 9906–9917, Curran Associates, Inc., 2018.

van Rijn, J N; Hutter, F

Hyperparameter Importance Across Datasets Inproceedings

In: SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2018), 2018.

Klein, A; Falkner, S; Mansur, N; Hutter, F

RoBO: A Flexible and Robust Bayesian Optimization Framework in Python Inproceedings

In: NIPS 2017 Bayesian Optimization Workshop, 2017.

Falkner, S; Klein, A; Hutter, F

Combining Hyperband and Bayesian Optimization Inproceedings

In: NIPS 2017 Bayesian Optimization Workshop, 2017.

Bischl, Bernd; Casalicchio, Giuseppe; Feurer, Matthias; Hutter, Frank; Lang, Michel; Mantovani, Rafael G; van Rijn, Jan N; Vanschoren, Joaquin

OpenML Benchmarking Suites and the OpenML100 Journal Article

In: arXiv:1708.0373v1, pp. 1-6, 2017.

Greff, K; Klein, A; Chovanec, M; Hutter, F; Schmidhuber, J

The Sacred Infrastructure for Computational Research Inproceedings

In: Proceedings of the 15th Python in Science Conference (SciPy 2017), 2017.

Klein, A; Falkner, S; Springenberg, J T; Hutter, F

Learning Curve Prediction with Bayesian Neural Networks Inproceedings

In: International Conference on Learning Representations (ICLR) 2017 Conference Track, 2017.

Klein, A; Falkner, S; Bartels, S; Hennig, P; Hutter, F

Fast Bayesian hyperparameter optimization on large datasets Journal Article

In: Electronic Journal of Statistics, 2017.

van Rijn, J N; Hutter, F

An Empirical Study of Hyperparameter Importance Across Datasets Inproceedings

In: Proceedings of the International Workshop on Automatic Selection, Configuration and Composition of Machine Learning Algorithms (AutoML 2017), pp. 97–104, 2017.

Wilson, James T.; Moriconi, Riccardo; Hutter, Frank; Deisenroth, Marc P.

The Reparameterization Trick for Acquisition Functions Inproceedings

In: NIPS Workshop on Bayesian Optimization, 2017.

Chrabąszcz, Patryk; Loshchilov, Ilya; Hutter, Frank

A Downsampled Variant of ImageNet as an Alternative to the CIFAR datasets Miscellaneous

2017.