
ERC Grant: Data-Driven Methods for Modelling and Optimizing the Empirical Performance of Deep Neural Networks (BeyondBlackbox)

Deep neural networks (DNNs) have led to dramatic improvements in the state of the art for many important classification problems, such as object recognition from images or speech recognition from audio data. However, DNNs are also notoriously dependent on the tuning of their hyperparameters. Since manual hyperparameter tuning is time-consuming and requires expert knowledge, recent years have seen the rise of Bayesian optimization methods for automating this task. While these methods have had substantial successes, treating DNN performance as a black box imposes fundamental limitations (a minimal sketch of such a black-box baseline follows the list below), so manual tuning remains more effective for large, computationally expensive datasets: humans can (1) exploit prior knowledge and extrapolate performance from data subsets, (2) monitor the DNN's internal weight optimization by stochastic gradient descent over time, and (3) reactively change hyperparameters at runtime. We therefore propose to model DNN performance beyond the black-box level and to use these models to develop, for the first time:

  1. Next-generation Bayesian optimization methods that exploit data-driven priors to optimize performance orders of magnitude faster than currently possible;

  2. Graybox Bayesian optimization methods that have access to – and exploit – performance and state information of algorithm runs over time; and

  3. Hyperparameter control strategies that learn across different datasets to adapt hyperparameters reactively to the characteristics of any given situation.
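
To make the black-box setting concrete, the following sketch runs standard Gaussian-process Bayesian optimization with expected improvement over a single hyperparameter, the log learning rate. It is only an illustrative baseline under assumed tooling (NumPy, SciPy, scikit-learn), not one of the methods proposed here; train_and_evaluate is a hypothetical stand-in for a full DNN training run, and the fact that every query pays for such a full run is exactly the cost that data-driven priors and graybox information are meant to avoid.

    import numpy as np
    from scipy.stats import norm
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import Matern

    rng = np.random.default_rng(0)
    BOUNDS = (-6.0, -1.0)  # search range for log10(learning rate)

    def train_and_evaluate(log_lr):
        # Hypothetical stand-in for an expensive full DNN training run that
        # returns a validation error; only its cost matters for the argument.
        return (log_lr + 3.0) ** 2 * 0.05 + 0.1 * rng.random()

    def expected_improvement(mu, sigma, best):
        # Expected improvement for minimization.
        sigma = np.maximum(sigma, 1e-9)
        z = (best - mu) / sigma
        return (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)

    # Small initial design, then the black-box loop: every iteration pays for
    # one complete training run before the surrogate model is updated.
    X = rng.uniform(*BOUNDS, size=(3, 1))
    y = np.array([train_and_evaluate(x[0]) for x in X])
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)

    for _ in range(10):
        gp.fit(X, y)
        cand = np.linspace(*BOUNDS, 200).reshape(-1, 1)
        mu, sigma = gp.predict(cand, return_std=True)
        x_next = cand[np.argmax(expected_improvement(mu, sigma, y.min()))]
        X = np.vstack([X, x_next])
        y = np.append(y, train_and_evaluate(x_next[0]))

    print("best log10(lr):", X[np.argmin(y), 0], "validation error:", y.min())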

DNNs enter our project in two ways. First, in all our methods we will use (Bayesian) DNNs to model and exploit the large amounts of performance data we will collect on various datasets; a toy example of such performance modelling is sketched after the list below. Second, our application goal is to optimize and control DNN hyperparameters far better than human experts can, and thereby to obtain:

  1. Computationally inexpensive auto-tuned deep neural networks, even for large datasets, enabling the widespread use of deep learning by non-experts.
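
As a toy illustration of the performance modelling mentioned above, the sketch below fits a simple parametric power law to the first epochs of a validation-error curve and extrapolates it to decide whether a run is worth finishing. The power-law form, the threshold rule, and the function names are illustrative assumptions; the project's actual learning-curve models (e.g. the Bayesian neural networks in the publications below) are considerably richer.

    import numpy as np
    from scipy.optimize import curve_fit

    def power_law(epoch, a, b, c):
        # err(epoch) ~ c + a * epoch^(-b): a common parametric learning-curve shape.
        return c + a * np.power(epoch, -b)

    def should_continue(partial_errors, total_epochs, target_error):
        # Fit the parametric curve to the epochs seen so far and extrapolate
        # to the planned budget; continue only if the target looks reachable.
        epochs = np.arange(1, len(partial_errors) + 1, dtype=float)
        params, _ = curve_fit(power_law, epochs, partial_errors,
                              p0=(1.0, 0.5, 0.1), maxfev=10000)
        return power_law(float(total_epochs), *params) <= target_error

    # Synthetic stand-in for the first 10 epochs of a validation-error curve.
    observed = 0.15 + 0.8 * np.arange(1, 11, dtype=float) ** -0.7
    print(should_continue(observed, total_epochs=100, target_error=0.25))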

Research funded by the ERC Grant

2019

  • Ying, C. and Klein, A. and Real, E. and Christiansen, E. and Murphy, K. and Hutter, F. (pdf)(bib)
    NAS-Bench-101: Towards Reproducible Neural Architecture Search
    In: International Conference on Machine Learning (ICML)
  • Elsken, T. and Metzen, J.H. and Hutter, F. (pdf)(bib)
    Neural Architecture Search: A Survey
    In: Journal of Machine Learning Research (JMLR)
  • Elsken, T. and Metzen, J. H. and Hutter, F. (published)(bib)
    Efficient Multi-Objective Neural Architecture Search via Lamarckian Evolution
    In: International Conference on Learning Representations (ICLR 2019)
  • Runge, F. and Stoll, D. and Falkner, S. and Hutter, F. (pdf)(bib)
    Learning to Design RNA
    In: International Conference on Learning Representations (ICLR 2019)
  • Loshchilov, I. and Hutter, F. (published)(bib)
    Decoupled Weight Decay Regularization
    In: International Conference on Learning Representations (ICLR 2019)
  • Gargiani, M. and Klein, A. and Falkner, S. and Hutter, F. (pdf)(bib)
    Probabilistic Rollouts for Learning Curve Extrapolation Across Hyperparameter Settings
    In: ICML 2019 AutoML Workshop
  • Hutter, F. and Kotthoff, L. and Vanschoren, J. (pdf)(bib)
    Automated Machine Learning: Methods, Systems, Challenges

2018

  • Chrabaszcz, P. and Loshchilov, I. and Hutter, F. (pdf)(bib)
    Back to Basics: Benchmarking Canonical Evolution Strategies for Playing Atari
    In: International Joint Conference on Artificial Intelligence (IJCAI)

  • Falkner, S. and Klein, A. and Hutter, F. (pdf)(bib)
    BOHB: Robust and Efficient Hyperparameter Optimization at Scale
    In: International Conference on Machine Learning (ICML)
  • Falkner, S. and Klein, A. and Hutter, F. (pdf)(bib)
    Combining Hyperband and Bayesian Optimization
    In: NIPS 2017 Bayesian Optimization Workshop
  • Feurer, M. and Eggensperger, K. and Falkner, S. and Lindauer, M. and Hutter, F. (pdf)(bib)
    Practical Automated Machine Learning for the AutoML Challenge 2018
    In: ICML 2018 AutoML Workshop
  • Feurer, M. and Hutter, F. (pdf)(bib)
    Towards Further Automation in AutoML
    In: ICML 2018 AutoML Workshop
  • Zela, A. and Klein, A. and Falkner, S. and Hutter, F. (pdf)(poster)(bib)
    Towards Automated Deep Learning: Efficient Joint Neural Architecture and Hyperparameter Search
    In: ICML 2018 AutoML Workshop
  • Ilg, E. and Cicek, O. and Galesso, S. and Klein, A. and Makansi, O. and Hutter, F. and Brox, T. (arXiv)(pdf)(bib)
    Uncertainty Estimates for Optical Flow with Multi-Hypotheses Networks
    In: Proceedings of ECCV 2018
  • van Rijn, J.N. and Hutter, F. (arXiv)(published)(bib)
    Hyperparameter Importance Across Datasets
    In: SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2018)
  • Wilson, J. and Hutter, F. and Deisenroth, M. P. (published)(bib)
    Maximizing Acquisition Functions for Bayesian Optimization
    In: Advances in Neural Information Processing Systems 31

2017

  • Chrabaszcz, P. and Loshchilov, I. and Hutter, F. (pdf)
    A Downsampled Variant of ImageNet as an Alternative to the CIFAR datasets
    In: arXiv
  • Klein, A. and Falkner, S. and Mansur, N. and Hutter, F. (pdf)(bib)
    RoBO: A Flexible and Robust Bayesian Optimization Framework in Python
    In: NIPS 2017 Bayesian Optimization Workshop
  • Bischl, B. and Casalicchio, G. and Feurer, M. and Hutter, F. and Lang, M. and Mantovani, R. G. and van Rijn, J. N. and Vanschoren, J. (arXiv)(bib)
    OpenML Benchmarking Suites and the OpenML100
    In: arXiv 1708.0373 (2017): 1-6
  • Greff, K. and Klein, A. and Chovanec, M. and Hutter, F. and Schmidhuber, J. (pdf)(bib)
    The Sacred Infrastructure for Computational Research
    In: Proceedings of the 15th Python in Science Conference (SciPy 2017)
  • Klein, A. and Falkner, S. and Springenberg, J. T. and Hutter, F. (pdf)(bib)
    Learning Curve Prediction with Bayesian Neural Networks
    In: International Conference on Learning Representations (ICLR) 2017 Conference Track
  • van Rijn, J. N. and Hutter, F. (pdf)(bib)
    An Empirical Study of Hyperparameter Importance Across Datasets
    In: Proceedings of the International Workshop on Automatic Selection, Configuration and Composition of Machine Learning Algorithms (AutoML 2017)
  • Klein, A. and Falkner, S. and Bartels, S. and Hennig, P. and Hutter, F. (pdf)(bib)
    Fast Bayesian Optimization of Machine Learning Hyperparameters on Large Datasets
    In: International Conference on Artificial Intelligence and Statistics (AISTATS 2017)
  • Klein, A. and Falkner, S. and Bartels, S. and Hennig, P. and Hutter, F. (pdf)(bib)
    Fast Bayesian Hyperparameter Optimization on Large Datasets
    In: Electronic Journal of Statistics
  • Wilson, J. and Moriconi, R. and Hutter, F. and Deisenroth, M. P. (published)(bib)
    The Reparameterization Trick for Acquisition Functions
    In: Proceedings of BayesOpt 2017