Supervision | Rafael Fernandes Cunha

Master Thesis Supervision — 6 Students

Date	Student	Thesis	Research Output
Aug 2025	Thomas Vos (Master in AI)	Learning the optimal policy for Hanabi using deep reinforcement learning with quantized PWLC (co-supervised with Jilles Dibangoye and Matthia Sabatelli)
Mar 2025	Hugo Kolste (Master in AI)	Scaling Up Centralized Multi-Agent Reinforcement Learning with Agent-by-Agent Optimization (co-supervised with Jilles Dibangoye)
Feb 2025	Fatemeh Ziad Alizadeh (Master in AI)	Don't Focus on your Weakness, Use your Strengths: A Multi-Agent Approach for Multi-Hop Question-Answering Tasks for Large Language Models (co-supervised with Tsegaye Tashu)
Jan 2025	Luca Mueller (Master in AI)	Formalizing Coverage-guided Greybox Fuzzing with Deep Reinforcement Learning (co-supervised with Fatih Turkmen)	ECAI 2025 (SPAIML)
Dec 2024	Davide Rigone (Master in CS)	Collaborative Reinforcement Learning for Cyber Defense: Analysis of Environments, Strategies, and Policies (co-supervised with Fatih Turkmen)	ECAI 2025 (SPAIML)
Nov 2022	Lars T. G. Mulder (Master in IEM)	Applying Fast Multi-agent Reinforcement Learning with Generalized Policy Updates (co-supervised with Ming Cao)

Bachelor Thesis Supervision — 20 Students

Date	Student	Thesis	Research Output
Sep 2025	Kristaps Melbardis (Bachelor in AI)	Enhancing Long-Context Understanding in Language Models via Titans Neural Long-Term Memory: A Case Study with Qwen, and the Babilong Dataset
Sep 2025	Manos Savvides (Bachelor in CS)	Multi-Agent Reinforcement Learning for Cyber Defence (co-supervised with Fatih Turkmen)
Aug 2025	Benediktus Firstian Pradipta (Bachelor in AI)	Shaping Reasoning Through Rewards: Investigating Reward Structures in Post-Training LLMs with Pure Reinforcement Learning
Aug 2025	Ravindra A. Tarunokusumo (Bachelor in AI)	Boosting Accuracy and Efficiency of Budget Forcing in LLMs via Reinforcement Learning for Mathematical Reasoning (co-supervised with T.M. Tashu)	ECAI 2025 (sLLM)
Aug 2025	Andjela Matic (Bachelor in AI)	Outperforming the Baseline: Transfer Learning in Atari via Parallelized Q-Networks
Jul 2025	Stan Ferguson (Bachelor in AI)	Exploring One-Step Fixed Horizon Q-learning in Tabular Stochastic Environments
Jun 2025	Quinten Steringa (Bachelor in AI)	No Supervision, No Problem: Pure Reinforcement Learning Improves Mathematical Reasoning in Small Language Models	ECAI 2025 (sLLM)
Apr 2025	Rares Stefan Stoian (Bachelor in AI)	Accelerating Model Based Reinforcement Learning Using GPU Through Parallelization of Dyna-Q Architecture (co-supervised with Matthia Sabatelli)
Feb 2025	Leon Tanis (Bachelor in AI)	Bridging Faithfulness of Explanations and Deep Reinforcement Learning: A Grad-CAM Analysis of Space Invaders (co-supervised with Marco Zullich)	FDG 2025
Jan 2025	Catalin Zaharia (Bachelor in AI)	Transfer Learning in Reinforcement Learning: When Task-Specific Adaptation Outperforms Generalization (co-supervised with Matthia Sabatelli)
Aug 2024	Andre van Dommele (Bachelor in AI)	Enhancing Football Simulation Performance in Deep Reinforcement Learning Through Analytics-based Dense Reward Shaping
Aug 2024	Niclas Müller-Horf (Bachelor in AI)	Improving Efficiency of a Hierarchical Reinforcement Learning Algorithm
Aug 2024	Jeremias Lino Ferrao (Bachelor in AI)	World Model Agents with Change-Based Intrinsic Motivation	NLDL 2025
Aug 2024	Diana-Maria Arapu (Bachelor in AI)	Sparse Rewards Reinforcement Learning: Addressing Vanishing Intrinsic Rewards in Change-Based Exploration Transfer
Jul 2024	Matej Priesol (Bachelor in AI)	Forecasting Carbon Intensity and Solar Generation in the Building Sector (co-supervised with J.D. Cardenas Cartagena)
Mar 2024	Peter van den Bempt (Bachelor in AI)	Investigating Mode-Switching and Reward Stream Separation in Hard-Exploration Problems (co-supervised with Matthia Sabatelli)
Jul 2021	Bo T. Kroezen (Bachelor in IEM)	Stochastic Stability Analysis of Selection-Mutation Processes and Signaling Games (co-supervised with Ming Cao)
Feb 2021	Tautas Hoedtke (Bachelor in IEM)	Safe Reinforcement Learning (co-supervised with Ming Cao)
Jul 2020	Muhammad Aqil Prasetyo (Bachelor in IEM)	Using Reinforcement Learning to Design a State Feedback Controller (co-supervised with Ming Cao)
Feb 2020	Martijn van Dis (Bachelor in IEM)	The Performance of Reinforcement Learning with Application for Adaptive Traffic Signal Controllers (co-supervised with Ming Cao)