Reinforcement Learning

Research Topics & Interests

Deep Reinforcement Learning

We tackle key challenges in deep reinforcement learning, aiming to make RL systems both robust and efficient. Our research focuses on enabling agents to learn optimal timing for actions, develop a contextual understanding of complex environments, and build predictive world models that improve decision-making.
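
As a rough illustration of learning when to act (a sketch only, not our exact method), consider an agent that chooses an action together with how long to repeat it, so that action timing itself becomes part of what is learned; all quantities below are synthetic:

import numpy as np

rng = np.random.default_rng(0)
n_actions, max_skip = 4, 8

# Q-values over joint (action, repetition-length) choices; in practice this
# table would be a neural network conditioned on the current state.
q = rng.normal(size=(n_actions, max_skip))

def select_action(epsilon=0.1):
    # Epsilon-greedy choice of an action and how many steps to hold it.
    if rng.random() < epsilon:
        return int(rng.integers(n_actions)), int(rng.integers(1, max_skip + 1))
    a, k = np.unravel_index(np.argmax(q), q.shape)
    return int(a), int(k) + 1  # repetition length is 1-based

action, repeat = select_action()
print(f"play action {action} for {repeat} consecutive steps")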

Automated Reinforcement Learning

We design methods to automate the development and optimization of reinforcement learning systems, making RL more accessible and efficient. Our research emphasizes creating benchmark environments and pioneering dynamic as well as gray-box optimization approaches. These efforts have led to impactful contributions to the field, including novel benchmarks and optimization techniques that have gained widespread recognition in the AutoML community for advancing the ease of use and performance of RL systems.

Dynamic Algorithm Configuration

We established new foundations in the field of Dynamic Algorithm Configuration (DAC), moving beyond static algorithm configuration to enable real-time parameter adaptation. Our work introduced frameworks for algorithms to adjust their hyperparameters while running, leading to significant efficiency improvements. This research has practical applications across various domains, from evolutionary computation to classical optimization problems and reinforcement learning.
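
The following minimal sketch illustrates the DAC setting on a toy problem: a controller observes features of a running algorithm and sets its step size at every iteration. The hand-written policy below is only a stand-in; in our research this mapping is learned, e.g., with reinforcement learning.

import numpy as np

def dac_policy(state):
    # Stand-in for a learned DAC policy mapping run-time features of the
    # target algorithm to a hyperparameter value.
    grad_norm, improvement = state
    return 0.05 if improvement < 1e-3 else 0.2

f = lambda x: float(x @ x)            # toy target problem
x = np.array([5.0, -3.0])             # iterate of the algorithm being configured
state = (np.linalg.norm(2 * x), 1.0)  # initial (gradient norm, improvement)
for t in range(50):
    lr = dac_policy(state)            # hyperparameter adjusted *during* the run
    before = f(x)
    x -= lr * 2 * x                   # one step of the target algorithm
    state = (np.linalg.norm(2 * x), before - f(x))
print(f"final objective value: {f(x):.6f}")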

Ongoing Projects

Learning Fast and Efficient Hyperparameter Control for Deep Reinforcement Learning on Small Datasets

Deep reinforcement learning (RL) algorithms are a powerful class of methods for optimizing sequential decision-making and control problems, and they drive many real-world applications. Yet they are also very sensitive to their hyperparameters, such as the discount factor or the learning rate. Setting these hyperparameters is notoriously difficult, yet crucial for obtaining state-of-the-art performance; in practice, this translates into multiple costly training runs to find a well-performing agent and considerable uncertainty about optimal use. The problem is exacerbated in deep RL application scenarios that only allow small amounts of data to be collected, such as many biomedical applications. For example, in the optimization of drug dosage schedules in cell cultures, the elapsed time between collected data points with meaningful variability is typically very large, so that only a few data points can realistically be collected within a given time frame. Consequently, we aim to develop automated approaches that set the hyperparameters of deep RL methods in a sample-efficient manner for small datasets, thus reducing this uncertainty.
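
To give a flavor of hyperparameter selection under a tight budget, the sketch below runs a successive-halving-style search over learning rates. The train_agent function is a hypothetical placeholder returning synthetic scores, not a real training run:

import numpy as np

rng = np.random.default_rng(1)

def train_agent(lr, steps):
    # Placeholder for a short deep-RL training run; a noisy synthetic
    # response curve peaking near lr = 1e-3 stands in for real returns.
    return -abs(np.log10(lr) + 3) * steps + rng.normal(scale=5.0)

configs = list(10.0 ** rng.uniform(-5, -1, size=8))  # candidate learning rates
budget = 100
while len(configs) > 1:
    scores = [train_agent(lr, budget) for lr in configs]
    order = np.argsort(scores)[::-1]                 # best first
    configs = [configs[i] for i in order[: len(configs) // 2]]
    budget *= 2                                      # survivors get more budget
print(f"selected learning rate: {configs[0]:.2e}")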

Student: Asif Hasan

Tackling the Primacy Bias in Deep RL

Deep reinforcement learning models tend to overfit to early experiences, a phenomenon known as the primacy bias. Many factors could be its root cause. In particular, deep learning tends to greedily follow the learning signal; in settings such as reinforcement learning or continual learning, however, this signal changes over time, requiring more plasticity of the learning process than in classical supervised settings. This project studies a novel regularization approach to preserve plasticity in the model and explores its impact across various learning settings.
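
For illustration, the sketch below shows one plasticity-preserving regularizer known from the literature, an L2 penalty pulling the weights back toward their initialization; the project's own, novel regularizer is not reproduced here:

import torch

net = torch.nn.Sequential(torch.nn.Linear(4, 32), torch.nn.ReLU(),
                          torch.nn.Linear(32, 2))
init_params = [p.detach().clone() for p in net.parameters()]
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
lam = 1e-2  # regularization strength, itself a hyperparameter

def loss_fn(x, y):
    task_loss = torch.nn.functional.mse_loss(net(x), y)
    # Pull parameters back toward their initialization so that early
    # experience cannot permanently reduce the network's ability to adapt.
    reg = sum(((p - p0) ** 2).sum()
              for p, p0 in zip(net.parameters(), init_params))
    return task_loss + lam * reg

x, y = torch.randn(16, 4), torch.randn(16, 2)
opt.zero_grad()
loss_fn(x, y).backward()
opt.step()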

Student: Philipp Bordne

Towards General Offline RL-Based Dynamic Algorithm Configuration

Traditional methods for Algorithm Configuration (AC) typically predict fixed parameter settings, yet research has shown that dynamically adapting parameters over time can significantly improve performance. This insight has led to many handcrafted heuristics. To respond better to changing optimization dynamics, automatic Dynamic Algorithm Configuration (DAC) has been developed, with most current approaches relying on online Reinforcement Learning (RL). However, online RL is resource-intensive and, due to its need for extensive interactions with the environment, infeasible for certain domains. Offline RL mitigates these challenges by training on pre-collected datasets, although it introduces new challenges of its own, such as distributional shift. This work explores the application of offline RL for DAC across various domains. We demonstrate that offline RL can effectively adapt the learning rate for Stochastic Gradient Descent (SGD).
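
As a minimal sketch of the idea (with synthetic logged data and hypothetical state features such as loss and gradient norm), the following runs fitted Q-iteration on a fixed dataset of transitions from past SGD runs and then acts greedily, without any new interaction with the environment:

import numpy as np

rng = np.random.default_rng(2)
n, d, n_actions, gamma = 500, 3, 4, 0.9

# Logged transitions: state features of a running SGD (e.g. loss, gradient
# norm, step count), the discrete learning-rate choice taken, and the
# reward (e.g. the achieved loss decrease).
S = rng.normal(size=(n, d))
A = rng.integers(n_actions, size=n)
R = rng.normal(size=n)
S2 = S + rng.normal(scale=0.1, size=(n, d))

W = np.zeros((n_actions, d))             # linear Q(s, a) = W[a] @ s
for _ in range(20):                      # fitted Q-iteration sweeps
    targets = R + gamma * np.max(S2 @ W.T, axis=1)
    for a in range(n_actions):           # regress per-action Q-values
        mask = A == a
        if mask.sum() >= d:              # skip actions with too little data
            W[a], *_ = np.linalg.lstsq(S[mask], targets[mask], rcond=None)

def act(state):
    # Greedy learning-rate choice of the offline-trained controller.
    return int(np.argmax(W @ state))

print("chosen learning-rate index:", act(rng.normal(size=d)))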

Students: Leon Gieringer & Janis Fix

Modelling Partially Observable Worlds

Model-based RL has shown tremendous sample efficiency and applicability across a broad variety of problem domains. However, the typical benchmarks on which these models are evaluated ignore a crucial aspect of real-world scenarios: partial observability. This work aims to understand the impact of partial observability on model-based agents and to shed light on how world models of partially observable environments encode the underlying environment state.
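
A minimal sketch of one common architecture for this setting (illustrative only, with dummy data): a recurrent network aggregates the observation-action history into a latent belief state, from which the next observation is predicted:

import torch

obs_dim, act_dim, hidden = 8, 2, 64

gru = torch.nn.GRU(obs_dim + act_dim, hidden, batch_first=True)
decoder = torch.nn.Linear(hidden, obs_dim)   # predicts the next observation
opt = torch.optim.Adam([*gru.parameters(), *decoder.parameters()], lr=3e-4)

# Dummy trajectory batch; in practice this comes from the agent's replay buffer.
obs = torch.randn(16, 50, obs_dim)
act = torch.randn(16, 50, act_dim)

inp = torch.cat([obs[:, :-1], act[:, :-1]], dim=-1)
belief, _ = gru(inp)                         # latent state summarizing history
pred = decoder(belief)
loss = torch.nn.functional.mse_loss(pred, obs[:, 1:])  # one-step prediction error
opt.zero_grad()
loss.backward()
opt.step()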

Student: Sai Prasanna

On the Zero-Shot Generalizability of Contextual Offline Reinforcement Learning

Contextual reinforcement learning has been shown to be a viable avenue for training generalizable agents by providing meta-information about the environment's dynamics (such as the weight of a robot or the gravity of a planet). However, this setting has so far only been studied extensively in the online and model-based RL settings. Offline RL promises an alternative in which well-performing policies are learned purely from existing datasets. In this work, we study the generalization capabilities of offline RL agents when providing them with context information about the environment at hand.
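
The sketch below shows the basic recipe under simplifying assumptions (synthetic data, behavior cloning as the simplest offline learning signal): the context vector is concatenated to the observation, so that a single policy can be queried zero-shot under unseen dynamics:

import torch

obs_dim, ctx_dim, act_dim = 6, 2, 3
policy = torch.nn.Sequential(torch.nn.Linear(obs_dim + ctx_dim, 64),
                             torch.nn.ReLU(), torch.nn.Linear(64, act_dim))
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)

# Offline dataset: observations, contexts of the source environments, actions.
obs = torch.randn(256, obs_dim)
ctx = torch.randn(256, ctx_dim)
act = torch.randn(256, act_dim)

# Behavior-cloning update on the context-augmented inputs.
loss = torch.nn.functional.mse_loss(policy(torch.cat([obs, ctx], -1)), act)
opt.zero_grad()
loss.backward()
opt.step()

# Zero-shot query under a context not seen during training.
unseen_ctx = torch.tensor([[9.81, 2.5]])   # e.g. new gravity and mass
action = policy(torch.cat([torch.randn(1, obs_dim), unseen_ctx], -1))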

Student: Rachana Tirumanyam

Open Projects & Thesis Topics

Unfortunately, we cannot offer any RL-specific projects or thesis topics at the moment. Please refer to https://ml.informatik.uni-freiburg.de/student/ for other open topics.

Members

Postdoctoral Research Fellows

PhD Students

Students

Philipp Bordne

Janis Fix

Leon Gieringer

Sai Prasanna

Rachana Tirumanyam

Alumni

Tidiane Camaret Ndir

Florian Diederichs

Zainab Sultan

Jan Ole von Hartz

Publications

2024

Ferreira, Fabio; Schlageter, Moreno; Rajan, Raghu; Biedenkapp, André; Hutter, Frank

One-shot World Models Using a Transformer Trained on a Synthetic Prior Inproceedings

In: NeurIPS 2024 Workshop on Open-World Agents, 2024.

Ndir, Tidiane Camaret; Biedenkapp, André; Awad, Noor

Inferring Behavior-Specific Context Improves Zero-Shot Generalization in Reinforcement Learning Inproceedings

In: Seventeenth European Workshop on Reinforcement Learning, 2024.

Bordne, Philipp; Hasan, M. Asif; Bergman, Eddie; Awad, Noor; Biedenkapp, André

CANDID DAC: Leveraging Coupled Action Dimensions with Importance Differences in DAC Inproceedings

In: Proceedings of the Third International Conference on Automated Machine Learning (AutoML 2024), Workshop Track, 2024.

Shala, Gresa; Arango, Sebastian Pineda; Biedenkapp, André; Hutter, Frank; Grabocka, Josif

HPO-RL-Bench: A Zero-Cost Benchmark for HPO in Reinforcement Learning Inproceedings

In: Proceedings of the Third International Conference on Automated Machine Learning (AutoML 2024), ABCD Track, 2024, (Runner-up for the best paper award).

Prasanna, Sai; Farid, Karim; Rajan, Raghu; Biedenkapp, André

Dreaming of Many Worlds: Learning Contextual World Models Aids Zero-Shot Generalization Journal Article

In: Reinforcement Learning Journal, vol. 3, iss. 1, no. 1, pp. 1317–1350, 2024, ISBN: 979-8-218-41163-3.

Shala, Gresa; Biedenkapp, André; Grabocka, Josif

Hierarchical Transformers are Efficient Meta-Reinforcement Learners Journal Article

In: arXiv:2402.06402 [cs.LG], 2024.

2023

Rajan, Raghu; Diaz, Jessica Lizeth Borja; Guttikonda, Suresh; Ferreira, Fabio; Biedenkapp, André; von Hartz, Jan Ole; Hutter, Frank

MDP Playground: An Analysis and Debug Testbed for Reinforcement Learning Journal Article

In: Journal of Artificial Intelligence Research (JAIR), vol. 77, pp. 821–890, 2023.

Benjamins, Carolin; Eimer, Theresa; Schubert, Frederik; Mohan, Aditya; Döhler, Sebastian; Biedenkapp, André; Rosenhan, Bodo; Hutter, Frank; Lindauer, Marius

Contextualize Me – The Case for Context in Reinforcement Learning Journal Article

In: Transactions on Machine Learning Research, 2023, ISSN: 2835-8856.

Shala, Gresa; Biedenkapp, André; Hutter, Frank; Grabocka, Josif

Gray-Box Gaussian Processes for Automated Reinforcement Learning Inproceedings

In: Eleventh International Conference on Learning Representations (ICLR'23), 2023.

2022

Shala, Gresa; Arango, Sebastian Pineda; Biedenkapp, André; Hutter, Frank; Grabocka, Josif

AutoRL-Bench 1.0 Inproceedings

In: Workshop on Meta-Learning (MetaLearn@NeurIPS'22), 2022.

Shala, Gresa; Biedenkapp, André; Hutter, Frank; Grabocka, Josif

Gray-Box Gaussian Processes for Automated Reinforcement Learning Inproceedings

In: Workshop on Meta-Learning (MetaLearn@NeurIPS'22), 2022.

Biedenkapp, André

Dynamic Algorithm Configuration by Reinforcement Learning PhD Thesis

University of Freiburg, Department of Computer Science, 2022.

Sass, René; Bergman, Eddie; Biedenkapp, André; Hutter, Frank; Lindauer, Marius

DeepCAVE: An Interactive Analysis Tool for Automated Machine Learning Inproceedings

In: Workshop on Adaptive Experimental Design and Active Learning in the Real World (ReALML@ICML'22), 2022.

Biedenkapp, André; Dang, Nguyen; Krejca, Martin S.; Hutter, Frank; Doerr, Carola

Theory-inspired Parameter Control Benchmarks for Dynamic Algorithm Configuration Inproceedings

In: Proceedings of the Genetic and Evolutionary Computation Conference (GECCO'22), 2022, (Won the best paper award in the GECH track).

Biedenkapp, André; Speck, David; Sievers, Silvan; Hutter, Frank; Lindauer, Marius; Seipp, Jendrik

Learning Domain-Independent Policies for Open List Selection Inproceedings

In: Workshop on Bridging the Gap Between AI Planning and Reinforcement Learning (PRL @ ICAPS'22), 2022.

Parker-Holder, Jack; Rajan, Raghu; Song, Xingyou; Biedenkapp, André; Miao, Yingjie; Eimer, Theresa; Zhang, Baohe; Nguyen, Vu; Calandra, Roberto; Faust, Aleksandra; Hutter, Frank; Lindauer, Marius

Automated Reinforcement Learning (AutoRL): A Survey and Open Problems Journal Article

In: Journal of Artificial Intelligence Research (JAIR), vol. 74, pp. 517–568, 2022.

Benjamins, Carolin; Eimer, Theresa; Schubert, Frederik; Mohan, Aditya; Biedenkapp, André; Rosenhan, Bodo; Hutter, Frank; Lindauer, Marius

Contextualize Me – The Case for Context in Reinforcement Learning Journal Article

In: arXiv:2202.04500, 2022.