Automated Machine Learning

Jointly with Marius Lindauer, Bernd Bischl and Lars Kotthoff we have created a MOOC of our AutoML lecture.
The course is freely available at the AI-Campus: https://ki-campus.org/courses/automl-luh2021

Which topics will be covered?

  • Hyperparameter Optimization
  • Gaussian processes
  • Bayesian optimization
  • Multi-criteria optimization
  • Neural Architecture Search
  • Dynamic AutoML Approaches
  • Interpretable AutoML

What will I achieve?

By the end of the course, you'll be able to...

  • identify possible design decisions and procedures in the application of ML.
  • implement several optimization and learning strategies for AutoML.
  • evaluate the design decisions made.

Which prerequisites do I need to fulfill?

  • Basics in Machine Learning (ML) and Deep Learning (DL)
  • First experiences in the application of ML
  • R or Python
  • Optional: Basics of Reinforcement Learning

Course contents

Chapter 1: Overview:

In this very first week, you will learn what AutoML actually is and what kind of problems we will address in
this course.

Chapter 2: Evaluation of Machine Learning Models:

To decide which algorithm, hyperparameter configuration or neural architecture to use for a given dataset, we
first have to talk about how we can actually determine the best-performing model. Therefore, we will talk about
how to evaluate ML models in this module.
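
One common way to estimate how well a model performs is k-fold cross-validation. The following is only a minimal
sketch of that idea in plain Python, not the course's reference implementation. The helper names
(k_fold_indices, cross_val_score, fit_mean, neg_mse) and the toy data are assumptions made for illustration.

```python
from statistics import mean


def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_val_score(fit, score, X, y, k=5):
    """Average the validation score over k train/validation splits."""
    scores = []
    for fold in k_fold_indices(len(X), k):
        held_out = set(fold)
        train = [i for i in range(len(X)) if i not in held_out]
        model = fit([X[i] for i in train], [y[i] for i in train])
        scores.append(score(model, [X[i] for i in fold], [y[i] for i in fold]))
    return mean(scores)


# Hypothetical toy "learner": always predicts the mean of the training targets.
def fit_mean(X_train, y_train):
    return mean(y_train)


# Negative mean squared error, so that higher is better.
def neg_mse(model, X_val, y_val):
    return -mean((model - target) ** 2 for target in y_val)


X = list(range(10))
y = [2.0 * x for x in X]
cv_score = cross_val_score(fit_mean, neg_mse, X, y, k=5)
```

The folds here are contiguous to keep the sketch short. With real data, one would usually shuffle the samples
before splitting.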

Chapter 3: Algorithm Selection:

Sometimes we cannot afford to search for the best configurations, but we have to make an educated decision on the
fly. Algorithm selection is a meta-learning approach, where we learn to predict the best performing ML algorithm
for a given dataset.

Chapter 4: Basics of HPO:

To train an ML model on a given dataset, it is often necessary to tune the hyperparameters of the ML
algorithm to achieve good performance on this dataset. In this module, we give a first introduction to the
basic ideas of hyperparameter optimization (HPO).
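
Random search is one of the simplest HPO baselines. As a rough sketch under stated assumptions: the objective
toy_validation_loss is a made-up stand-in for "train a model and return its validation loss", and the
hyperparameter names and ranges are hypothetical.

```python
import random


def random_search(objective, search_space, n_trials=50, seed=0):
    """Sample configurations uniformly at random and keep the best one."""
    rng = random.Random(seed)
    best_config, best_loss = None, float("inf")
    for _ in range(n_trials):
        config = {
            name: rng.uniform(low, high)
            for name, (low, high) in search_space.items()
        }
        loss = objective(config)
        if loss < best_loss:
            best_config, best_loss = config, loss
    return best_config, best_loss


# Hypothetical objective: pretend the best settings are
# log10(learning rate) = -3 and dropout = 0.2.
def toy_validation_loss(config):
    return (config["log10_learning_rate"] + 3.0) ** 2 + (config["dropout"] - 0.2) ** 2


space = {"log10_learning_rate": (-6.0, 0.0), "dropout": (0.0, 0.5)}
best_config, best_loss = random_search(toy_validation_loss, space)
```

Searching the learning rate on a log scale is a common design choice, since plausible values often span several
orders of magnitude.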

Chapter 5: Gaussian Processes (Exp):

Gaussian Processes (GPs) are a well-known class of probabilistic ML models which return not only a mean estimate
but also a variance as an uncertainty measure. Although not directly related to AutoML, they are the basis for
Bayesian Optimization in the following module.
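
To make "mean estimate plus variance" concrete, here is a small sketch of GP regression in plain Python. It
assumes a zero-mean prior, an RBF kernel with fixed length scale, one-dimensional inputs and a tiny noise term.
All function names (rbf_kernel, solve, gp_predict) and the data are illustrative assumptions; a practical
implementation would use a numerical library and a Cholesky decomposition.

```python
import math


def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential covariance between two scalar inputs."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def gp_predict(x_train, y_train, x_new, noise=1e-6):
    """Posterior mean and variance of a zero-mean GP at x_new."""
    K = [
        [rbf_kernel(a, b) + (noise if i == j else 0.0) for j, b in enumerate(x_train)]
        for i, a in enumerate(x_train)
    ]
    k_star = [rbf_kernel(a, x_new) for a in x_train]
    alpha = solve(K, y_train)  # K^{-1} y
    v = solve(K, k_star)       # K^{-1} k_*
    mean = sum(k * a for k, a in zip(k_star, alpha))
    variance = rbf_kernel(x_new, x_new) - sum(k * w for k, w in zip(k_star, v))
    return mean, max(variance, 0.0)


x_train, y_train = [0.0, 1.0, 2.0], [0.0, 1.0, 0.5]
mean_at_data, var_at_data = gp_predict(x_train, y_train, 1.0)
mean_far, var_far = gp_predict(x_train, y_train, 100.0)
```

At an observed input the predicted variance is close to zero, while far away from all observations the
prediction falls back to the prior (mean 0, variance 1).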

Chapter 6: Bayesian Optimization for HPO:

Bayesian Optimization is one of the most widely used optimization techniques for AutoML since it can deal with
black-box functions and is very sample efficient. In particular for Hyperparameter Optimization with a small
tuning budget and expensive function evaluations, it is often the go-to choice.
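
Bayesian Optimization picks the next configuration by maximizing an acquisition function over the surrogate
model's predictions. Expected improvement is one widely used choice. The sketch below assumes minimization and a
Gaussian prediction per candidate; the candidate names and their predicted values are invented for illustration.

```python
import math


def expected_improvement(mean, std, best_so_far):
    """Expected improvement over best_so_far for minimization."""
    if std <= 0.0:
        # No uncertainty: the improvement is known exactly.
        return max(best_so_far - mean, 0.0)
    z = (best_so_far - mean) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (best_so_far - mean) * cdf + std * pdf


# Hypothetical surrogate predictions: name -> (predicted loss, standard deviation).
candidates = {"a": (0.30, 0.01), "b": (0.32, 0.10), "c": (0.27, 0.0)}
best_so_far = 0.28
scores = {
    name: expected_improvement(m, s, best_so_far)
    for name, (m, s) in candidates.items()
}
next_candidate = max(scores, key=scores.get)
```

In this toy setting, candidate "b" is selected although its predicted loss is the worst of the three, because
its large uncertainty leaves room for a substantial improvement. This is the exploration side of the
exploration-exploitation trade-off.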

Chapter 7: Speedup Techniques for Hyperparameter Optimization:

Although techniques such as Bayesian Optimization are very sample efficient, sometimes we cannot even
afford to evaluate a few ML models, for example when training a large deep neural network. In such cases, it is
important to speed up AutoML with techniques such as evaluating the model after partial training or on subsets
of the dataset.
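
Successive halving is one well-known way to exploit cheap, partial evaluations. The following sketch assumes a
hypothetical learning curve (toy_loss) in which the loss decays towards a configuration-specific floor; the
names and numbers are illustrative only.

```python
import math


def successive_halving(configs, evaluate, min_budget=1, eta=2):
    """Keep the best 1/eta of the configurations and multiply the budget by eta."""
    survivors, budget = list(configs), min_budget
    while len(survivors) > 1:
        ranked = sorted(survivors, key=lambda config: evaluate(config, budget))
        survivors = ranked[: max(1, len(ranked) // eta)]
        budget *= eta
    return survivors[0]


# Hypothetical learning curve: loss after training with the given budget.
def toy_loss(config, budget):
    return config["final_loss"] + config["slowness"] * math.exp(-0.5 * budget)


configs = [{"final_loss": 0.10 + 0.05 * i, "slowness": 1.0} for i in range(8)]
winner = successive_halving(configs, toy_loss)
```

With eight configurations and eta=2, only the first round evaluates all of them, and it does so at the smallest
budget. The approach relies on low-budget rankings being informative about final performance, which is not
guaranteed in practice.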

Chapter 8: Multi-criteria Optimization:

Although we often focus on predictive performance (such as accuracy or RMSE), in some applications other
metrics such as model size or inference time are also important. Since these metrics do not necessarily align
with each other, multi-criteria optimization returns a set of different trade-offs between these metrics.
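
The "set of different trade-offs" is commonly called the Pareto front: all solutions that are not dominated by
any other solution. A minimal sketch, assuming every metric is to be minimized and using made-up
(validation error, inference time) pairs:

```python
def dominates(a, b):
    """True if a is no worse than b everywhere and strictly better somewhere."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points):
    """Return the points that are not dominated by any other point."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical models as (validation error, inference time in ms).
models = [(0.10, 50.0), (0.12, 20.0), (0.15, 25.0), (0.20, 5.0), (0.10, 60.0)]
front = pareto_front(models)
# front == [(0.10, 50.0), (0.12, 20.0), (0.20, 5.0)]
```

The model (0.15, 25.0) is dropped because (0.12, 20.0) is both more accurate and faster. The three remaining
models each represent a different trade-off, and choosing among them is left to the user.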

Chapter 9: Neural Architecture Search I:

One of the biggest challenges in applying deep learning to new datasets and tasks is the design of the
architecture of a deep neural network. The network should be neither too shallow nor too deep; should have
neither too little nor too much capacity; should use the right operators and connections; and so on. Neural
architecture search supports users in finding such architectures automatically.

Chapter 10: Neural Architecture Search II:

Going deeper into neural architecture search (NAS), we will discuss approaches such as one-shot models,
which contain all architectures as submodels, or differentiable architecture search, which makes NAS very
efficient. In addition, benchmarking NAS approaches is very important for understanding how to compare
different approaches.

Chapter 11: Dynamic Configuration and Learning to Learn:

Some hyperparameter settings are actually not constant for the entire training, but have to be adapted while
the training proceeds, for example the learning rate of a deep neural network. Dynamic configuration
approaches allow us to learn policies in an offline learning phase (for example via reinforcement learning) such
that configurations can be adjusted on the fly on new datasets. Going even one step further, "learning to
learn" describes a paradigm where the entire learning process (for example weight updates) is meta-learned.

Chapter 12: Interpretability:

For most users of AutoML, it does not suffice to simply obtain a well-performing ML pipeline; they would
also like to understand why an AutoML process made certain decisions or what the AutoML optimization problem
actually looks like. In this module on interpretability for AutoML, we will focus on understanding this
process.

Chapter 13: Beyond AutoML (optional):

At the end of the day, AutoML is a meta-algorithmic approach and the underlying ideas can be applied to many
other domains if we are interested in optimizing empirical performance. In this module, we will focus on two
important extensions of AutoML: (i) optimizing hyperparameters across many tasks and (ii) efficient
optimization of runtime as a performance metric.