Research Topics & Interests
Deep Reinforcement Learning
We tackle key challenges in deep reinforcement learning, aiming to make RL systems both robust and efficient. Our research focuses on enabling agents to learn optimal timing for actions, develop a contextual understanding of complex environments, and build predictive world models that improve decision-making.
Automated Reinforcement Learning
We design methods that automate the development and optimization of reinforcement learning systems, making RL more accessible and efficient. Our research emphasizes creating benchmark environments and pioneering dynamic as well as gray-box optimization approaches. These efforts have produced novel benchmarks and optimization techniques that are widely recognized in the AutoML community for advancing the ease of use and performance of RL systems.
Dynamic Algorithm Configuration
We established new foundations in the field of Dynamic Algorithm Configuration (DAC), moving beyond static algorithm configuration to enable real-time parameter adaptation. Our work introduced frameworks for algorithms to adjust their hyperparameters while running, leading to significant efficiency improvements. This research has practical applications across various domains, from evolutionary computation to classical optimization problems and reinforcement learning.
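The core idea of DAC, hyperparameters being re-selected while the algorithm runs, can be sketched on a toy target. Below, a policy chooses the step size of gradient descent on f(x) = x² at every iteration; the hand-crafted decaying schedule merely stands in for a learned DAC policy, and all names are illustrative.

```python
def dac_episode(policy, steps=100):
    """Gradient descent on f(x) = x^2 where a DAC policy re-selects the
    learning rate at every iteration (toy illustration of the setting)."""
    x = 5.0
    for t in range(steps):
        state = (t, abs(x))        # cheap features describing the run
        lr = policy(state)         # the dynamic configuration decision
        x -= lr * 2 * x            # gradient of x^2 is 2x
    return abs(x)

# A hand-crafted decaying schedule standing in for a learned policy:
# large steps early in the run, smaller steps later.
decaying = lambda state: 0.4 / (1 + 0.05 * state[0])
final_error = dac_episode(decaying)
```

In a real DAC setting the state would contain richer features of the running algorithm, and the policy would be trained, e.g. with reinforcement learning, rather than fixed by hand.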
Ongoing Projects
Learning Fast and Efficient Hyperparameter Control for Deep Reinforcement Learning on Small Datasets
Deep reinforcement learning (RL) algorithms are a powerful class of methods for optimizing sequential decision-making and control problems, and they drive many real-world applications. Yet they are also very sensitive to their hyperparameters, such as the discount factor or the learning rate, which leads to considerable uncertainty about how to use them optimally. Setting these hyperparameters is notoriously difficult, yet crucial for obtaining state-of-the-art performance; in practice this translates into multiple costly training runs to find a well-performing agent. The problem is exacerbated in deep RL application scenarios that only allow small amounts of data to be collected, as in many biomedical applications. For example, in the optimization of drug dosage schedules in cell cultures, the elapsed time between collected data points with meaningful variability is typically so large that only few data points can realistically be collected within a given time frame. Consequently, we aim to develop automated, sample-efficient approaches for setting the hyperparameters of deep RL methods on small datasets, thus reducing this uncertainty.
Student: Asif Hasan
Tackling the Primacy Bias in Deep RL
Deep reinforcement learning models tend to overfit to early experiences. There are many potential root causes. In particular, deep learning tends to greedily follow the learning signal; however, in settings such as reinforcement learning or even continual learning, this signal changes over time, requiring more plasticity of the learning process than in classical supervised settings. This project studies a novel regularization approach to preserve plasticity in the model and explores its impact in various learning settings.
Student: Philipp Bordne
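One generic example of a plasticity-preserving regularizer, shown below purely for illustration and not necessarily the approach studied in this project, penalizes drift from the initial weights so the network stays close to its highly plastic initialization.

```python
import numpy as np

def l2_to_init_penalty(params, init_params, strength=1e-2):
    """Penalize squared distance from the initial weights so the network
    stays close to its (highly plastic) initialization."""
    return strength * sum(float(np.sum((p - p0) ** 2))
                          for p, p0 in zip(params, init_params))

w0 = [np.array([1.0, 1.0])]        # weights at initialization
w = [np.array([1.0, 2.0])]         # weights after some training
penalty = l2_to_init_penalty(w, w0)
```

Such a penalty would be added to the usual training loss; the `strength` coefficient trades off plasticity against fitting the current learning signal.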
Towards General Offline RL-Based Dynamic Algorithm Configuration
Traditional methods for Algorithm Configuration (AC) typically predict fixed parameter settings, yet research has shown that dynamically adapting parameters over time can significantly improve performance. This insight has led to the development of many handcrafted heuristics. To respond better to changing optimization dynamics, automatic Dynamic Algorithm Configuration (DAC) has been developed, with most current approaches relying on online Reinforcement Learning (RL). However, online RL is resource-intensive and, due to its need for extensive interactions with the environment, infeasible for certain domains. Offline RL mitigates these costs by training on pre-collected datasets, although it introduces new difficulties such as distribution shift. This work explores the application of offline RL for DAC across various domains. We demonstrate that offline RL can effectively adapt the learning rate of Stochastic Gradient Descent (SGD).
Students: Leon Gieringer & Janis Fix
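The offline pipeline can be illustrated on a toy problem: first log training runs with random constant learning rates, then derive a controller purely from those logs. The greedy rule below is only a stand-in for a real offline RL algorithm (e.g. CQL or AWAC), and every name in the sketch is hypothetical.

```python
import random

def sgd_run(lr_schedule, steps=50):
    """One toy quadratic 'training run': logs (state, action, reward)
    transitions and returns them with the final loss. Illustrative only."""
    x, log = 4.0, []
    for t in range(steps):
        loss_before = x * x
        lr = lr_schedule(t, loss_before)
        x -= lr * 2 * x                      # gradient of x^2 is 2x
        log.append(((t, loss_before), lr, loss_before - x * x))
    return log, x * x

# Offline phase: collect logs from runs with random constant learning rates ...
random.seed(0)
dataset = []
for _ in range(20):
    lr = random.uniform(0.01, 0.45)
    run_log, _ = sgd_run(lambda t, loss, lr=lr: lr)
    dataset.extend(run_log)

# ... then derive a crude policy from the logs alone: at each step, imitate
# the logged action with the largest one-step loss reduction.
def offline_policy(t, loss):
    candidates = [(r, a) for (s, a, r) in dataset if s[0] == t]
    return max(candidates)[1]

_, final_loss = sgd_run(offline_policy)
```

The key property is that the controller never interacts with a live training run while learning; it only ever sees the pre-collected dataset.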
Modelling Partially Observable Worlds
Model-based RL has shown tremendous sample efficiency and applicability in a broad variety of problem domains. However, the typical benchmarks on which these models are evaluated ignore a crucial aspect of real-world scenarios -- partial observability. This work aims to understand the impact of partial observability on model-based agents and to shed light on how world models of partially observable worlds encode the underlying environment state.
Student: Sai Prasanna
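The idea that a world model must encode hidden state can be sketched with a tiny recurrent filter: since the true state is unobservable, each observation is folded into a hidden belief, and predictions are made from that belief alone. The weights here are random and purely illustrative; a trained world model would learn them.

```python
import numpy as np

rng = np.random.default_rng(0)

# Minimal recurrent state estimator: the model never sees the true
# environment state, so it folds each observation into a belief h
# and predicts the next observation from h alone.
W_h = rng.normal(scale=0.3, size=(8, 8))   # belief transition
W_o = rng.normal(scale=0.3, size=(8, 2))   # observation encoder
W_p = rng.normal(scale=0.3, size=(2, 8))   # observation decoder

def belief_update(h, obs):
    return np.tanh(W_h @ h + W_o @ obs)

h = np.zeros(8)
for obs in [np.array([0.5, -0.2]), np.array([0.4, -0.1])]:
    h = belief_update(h, obs)
predicted_obs = W_p @ h
```

Studying what such a belief state encodes about the hidden environment state is exactly the kind of question this project asks of full-scale world models.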
On the Zero-Shot Generalizability of Contextual Offline Reinforcement Learning
Contextual reinforcement learning has been shown to be a viable avenue for training generalizable agents by providing meta-information about the environment's dynamics (such as the weight of a robot or the gravity of a planet). However, this setting has so far only been studied extensively in the online and model-based RL settings. Offline RL promises an alternative in which well-performing policies are learned purely from existing datasets. In this work, we set out to study the generalization capabilities of offline RL agents when providing them with context information about the environment at hand.
Student: Rachana Tirumanyam
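A common way to supply such context, sketched below with hypothetical feature values, is simply to concatenate a context vector to each observation before it reaches the agent.

```python
import numpy as np

def make_contextual_obs(obs, context):
    """Augment a raw observation with a context vector describing the
    environment instance (e.g. gravity, robot mass)."""
    return np.concatenate([obs, context])

obs = np.array([0.1, -0.3, 0.0])   # raw state features (hypothetical)
context = np.array([9.81, 1.0])    # e.g. gravity and link mass
aug_obs = make_contextual_obs(obs, context)
```

The agent then conditions its policy on the augmented observation, which is what allows it to adapt zero-shot when the context changes at test time.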
Open Projects & Thesis Topics
Unfortunately, at the moment we cannot offer any RL-specific projects or theses. Please refer to https://ml.informatik.uni-freiburg.de/student/ for other open topics.
Members
Postdoctoral Research Fellows
PhD Students
Students
Philipp Bordne
Janis Fix
Leon Gieringer
Sai Prasanna
Rachana Tirumanyam
Alumni
Tidiane Camaret Ndir
Florian Diederichs
Zainab Sultan
Jan Ole von Hartz
Publications
2024
One-shot World Models Using a Transformer Trained on a Synthetic Prior. In: NeurIPS 2024 Workshop on Open-World Agents, 2024.
Inferring Behavior-Specific Context Improves Zero-Shot Generalization in Reinforcement Learning. In: Seventeenth European Workshop on Reinforcement Learning, 2024.
CANDID DAC: Leveraging Coupled Action Dimensions with Importance Differences in DAC. In: Proceedings of the Third International Conference on Automated Machine Learning (AutoML 2024), Workshop Track, 2024.
HPO-RL-Bench: A Zero-Cost Benchmark for HPO in Reinforcement Learning. In: Proceedings of the Third International Conference on Automated Machine Learning (AutoML 2024), ABCD Track, 2024. (Runner-up for the best paper award.)
Dreaming of Many Worlds: Learning Contextual World Models Aids Zero-Shot Generalization. In: Reinforcement Learning Journal, vol. 3, iss. 1, no. 1, pp. 1317–1350, 2024. ISBN: 979-8-218-41163-3.
Hierarchical Transformers are Efficient Meta-Reinforcement Learners. arXiv:2402.06402 [cs.LG], 2024.
2023
MDP Playground: An Analysis and Debug Testbed for Reinforcement Learning. In: Journal of Artificial Intelligence Research (JAIR), vol. 77, pp. 821–890, 2023.
Contextualize Me - The Case for Context in Reinforcement Learning. In: Transactions on Machine Learning Research, 2023. ISSN: 2835-8856.
Gray-Box Gaussian Processes for Automated Reinforcement Learning. In: Eleventh International Conference on Learning Representations (ICLR'23), 2023.
2022
AutoRL-Bench 1.0. In: Workshop on Meta-Learning (MetaLearn@NeurIPS'22), 2022.
Gray-Box Gaussian Processes for Automated Reinforcement Learning. In: Workshop on Meta-Learning (MetaLearn@NeurIPS'22), 2022.
Dynamic Algorithm Configuration by Reinforcement Learning. PhD thesis, University of Freiburg, Department of Computer Science, 2022.
DeepCAVE: An Interactive Analysis Tool for Automated Machine Learning. In: Workshop on Adaptive Experimental Design and Active Learning in the Real World (ReALML@ICML'22), 2022.
Theory-inspired Parameter Control Benchmarks for Dynamic Algorithm Configuration. In: Proceedings of the Genetic and Evolutionary Computation Conference (GECCO'22), 2022. (Won the best paper award in the GECH track.)
Learning Domain-Independent Policies for Open List Selection. In: Workshop on Bridging the Gap Between AI Planning and Reinforcement Learning (PRL@ICAPS'22), 2022.
Automated Reinforcement Learning (AutoRL): A Survey and Open Problems. In: Journal of Artificial Intelligence Research (JAIR), vol. 74, pp. 517–568, 2022.
Contextualize Me – The Case for Context in Reinforcement Learning. arXiv:2202.04500, 2022.