ERC Grant: Data-Driven Methods for Modelling and Optimizing the Empirical Performance of Deep Neural Networks (BeyondBlackbox)
Deep neural networks (DNNs) have led to dramatic improvements of the state of the art for many important classification problems, such as object recognition from images or speech recognition from audio data. However, DNNs are also notoriously dependent on the tuning of their hyperparameters. Since manual tuning is time-consuming and requires expert knowledge, recent years have seen the rise of Bayesian optimization methods for automating this task. While these methods have had substantial successes, treating DNN performance as a black box imposes fundamental limitations that allow manual tuning to remain more effective for large and computationally expensive datasets: humans can (1) exploit prior knowledge and extrapolate performance from data subsets, (2) monitor the DNN's internal weight optimization by stochastic gradient descent over time, and (3) reactively change hyperparameters at runtime. We therefore propose to model DNN performance beyond the black-box level and to use these models to develop, for the first time:

Next-generation Bayesian optimization methods that exploit data-driven priors to optimize performance orders of magnitude faster than currently possible;

Gray-box Bayesian optimization methods that have access to – and exploit – performance and state information of algorithm runs over time; and

Hyperparameter control strategies that learn across different datasets to adapt hyperparameters reactively to the characteristics of any given situation.
DNNs play two roles in our project. First, in all our methods we will use (Bayesian) DNNs to model and exploit the large amounts of performance data we will collect on various datasets. Second, our application goal is to optimize and control DNN hyperparameters far better than human experts and to obtain:
 Computationally inexpensive auto-tuned deep neural networks, even for large datasets, enabling the widespread use of deep learning by non-experts.
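The gray-box idea above – allocating more training budget to configurations whose partial runs look promising, instead of evaluating every configuration as an opaque black box – can be illustrated with a minimal successive-halving sketch, the mechanism underlying Hyperband-style methods such as BOHB. Everything here is an illustrative assumption: the toy_loss objective is a synthetic stand-in for a validation loss, and the parameter names are made up for this sketch.

```python
import random

def toy_loss(lr, budget):
    # Synthetic stand-in for a validation loss: it decays as the
    # training budget grows and is minimized near lr = 0.1.
    # Purely illustrative; a real run would train a DNN here.
    return (lr - 0.1) ** 2 + 1.0 / budget

def successive_halving(n_configs=27, min_budget=1, eta=3, seed=0):
    rng = random.Random(seed)
    # Sample random learning rates on a logarithmic scale in [1e-4, 1].
    configs = [10 ** rng.uniform(-4, 0) for _ in range(n_configs)]
    budget = min_budget
    while len(configs) > 1:
        # Evaluate all surviving configurations at the current budget,
        # then keep the best 1/eta and give them eta times more budget.
        scored = sorted(configs, key=lambda lr: toy_loss(lr, budget))
        configs = scored[: max(1, len(scored) // eta)]
        budget *= eta
    return configs[0]

best = successive_halving()
```

With eta = 3 and 27 starting configurations, only one configuration survives to receive the full budget, so most of the compute goes to cheap, partial evaluations – the property that makes gray-box methods attractive for large datasets.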
Research funded by the ERC Grant
2019
NAS-Bench-101: Towards Reproducible Neural Architecture Search
In: International Conference on Machine Learning (ICML)
Neural Architecture Search: A Survey
In: Journal of Machine Learning Research (JMLR)
Efficient Multi-Objective Neural Architecture Search via Lamarckian Evolution
In: International Conference on Learning Representations (ICLR 2019)
Learning to Design RNA
In: International Conference on Learning Representations (ICLR 2019)
Decoupled Weight Decay Regularization
In: International Conference on Learning Representations (ICLR 2019)
Probabilistic Rollouts for Learning Curve Extrapolation Across Hyperparameter Settings
In: ICML 2019 AutoML Workshop
Automated Machine Learning: Methods, Systems, Challenges
2018
Chrabaszcz, P. and Loshchilov, I. and Hutter, F.
Back to Basics: Benchmarking Canonical Evolution Strategies for Playing Atari
In: International Joint Conference on Artificial Intelligence (IJCAI)
BOHB: Robust and Efficient Hyperparameter Optimization at Scale
In: International Conference on Machine Learning (ICML)
Combining Hyperband and Bayesian Optimization
In: NIPS 2017 Bayesian Optimization Workshop
Practical Automated Machine Learning for the AutoML Challenge 2018
In: ICML 2018 AutoML Workshop
Towards Further Automation in AutoML
In: ICML 2018 AutoML Workshop
Towards Automated Deep Learning: Efficient Joint Neural Architecture and Hyperparameter Search
In: ICML 2018 AutoML Workshop
Uncertainty Estimates for Optical Flow with MultiHypotheses Networks
In: Proceedings of ECCV 2018
Hyperparameter Importance Across Datasets
In: SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2018)
Maximizing Acquisition Functions for Bayesian Optimization
In: Advances in Neural Information Processing Systems 31
2017
A Downsampled Variant of ImageNet as an Alternative to the CIFAR datasets
In: arXiv
RoBO: A Flexible and Robust Bayesian Optimization Framework in Python
In: NIPS 2017 Bayesian Optimization Workshop
OpenML Benchmarking Suites and the OpenML100
In: arXiv 1708.0373 (2017): 16
The Sacred Infrastructure for Computational Research
In: Proceedings of the 15th Python in Science Conference (SciPy 2017)
Learning Curve Prediction with Bayesian Neural Networks
In: International Conference on Learning Representations (ICLR) 2017 Conference Track
An Empirical Study of Hyperparameter Importance Across Datasets
In: Proceedings of the International Workshop on Automatic Selection, Configuration and Composition of Machine Learning Algorithms (AutoML 2017)
Fast Bayesian Optimization of Machine Learning Hyperparameters on Large Datasets
In: Proceedings of the AISTATS conference
Fast Bayesian Hyperparameter Optimization on Large Datasets
In: Electronic Journal of Statistics
The Reparameterization Trick for Acquisition Functions
In: Proceedings of BayesOpt 2017