Student Projects

On this page you can find our offerings for Master’s Projects and Master and Bachelor Research Projects in the realm of data mining and machine learning for (vocational) education ecosystems for the autumn semester 2026. Please note that this list of projects is not exhaustive and will be updated over the coming days with further exciting projects.

Last update: 29.04.2026

How to apply

Please apply via our student project application form. You will need to specify which project(s) you are interested in, why you are interested, and whether you have any relevant experience in this area. To access the form, you need to log in with your EPFL email address. If you would like more information about our research, do not hesitate to contact us! Students interested in doing a project are encouraged to have a look at the Thesis & Project Guidelines, which explain what you can expect from us and what we expect from you.

The student project application form will remain open for submissions until the late deadline of 15 August 2026. Applications will be reviewed on an ongoing basis, so we strongly encourage you to apply as soon as possible. Early applicants will be considered starting in calendar week 21, with the early deadline set for 14 May 2026:

  • Early deadline (first round): 14.05.2026
  • First contact with supervisors: 18.05.2026 – 24.05.2026
  • Late deadline: 15.08.2026

External students: Non-EPFL students are kindly requested to get in touch with the project supervisors by e-mail.

Please note that the list of topics below is not exhaustive. If you have other ideas or proposals, you are welcome to contact a senior member of the lab to discuss possibilities for a tailor-made topic.

Project 1: Hierarchical Multi-Agent Systems for Simulated Learners in Inquiry-Based Environments

Large Language Models (LLMs) are increasingly used as simulated learners to support the development and evaluation of educational technologies. A central challenge is aligning agent behaviour with authentic student behaviour in inquiry-based environments such as Beer’s Law Lab, a virtual lab where students investigate Beer’s Law by varying solution characteristics and observing light absorbance. Here the action space is continuous, which amplifies the data problem: the space of plausible behaviours is vast and only sparsely covered by available student traces. Prior work on LLM-based simulated learners has not addressed the design of architectures for handling such continuous action spaces.

Objectives:

  • Design a hierarchical multi-agent system consisting of small specialised models for individual environment components (e.g., preparing solutions, setting path length, recording measurements), coordinated by a higher-level orchestrator agent.
  • Investigate whether decomposing student behaviour into component-level skills improves behavioural alignment with real student data compared to a single LLM agent baseline.
  • Determine whether fine-tuning is required for lower-level skill models or whether in-context learning suffices, and identify effective data sources for training.
  • Evaluate the full system against a single-agent baseline on alignment metrics (behavioural similarity to held-out student traces) and task-level outcomes.
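
The proposed decomposition can be pictured with a minimal sketch. All names here (the skill functions, the `LabState` fields, the fixed action values) are illustrative stand-ins invented for this example, not the lab's actual codebase: each component skill would in practice be a small specialised LLM policy producing continuous actions, and the orchestrator a higher-level agent deciding which skill to invoke.

```python
# Illustrative sketch of the hierarchical layout: an orchestrator delegates
# to component-specific skill agents that act on a shared lab state.
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class LabState:
    concentration: float = 0.0   # mol/L, continuous
    path_length: float = 1.0     # cm, continuous
    log: List[str] = field(default_factory=list)

def prepare_solution(state: LabState) -> str:
    # Stand-in for a specialised policy; picks one plausible continuous action.
    state.concentration = round(state.concentration + 0.05, 2)
    return f"set concentration to {state.concentration} mol/L"

def set_path_length(state: LabState) -> str:
    state.path_length = round(min(state.path_length + 0.5, 2.0), 2)
    return f"set path length to {state.path_length} cm"

def record_measurement(state: LabState) -> str:
    # Beer's Law: A = eps * c * l (eps taken as 1.0 for the sketch)
    absorbance = round(1.0 * state.concentration * state.path_length, 3)
    return f"recorded absorbance {absorbance}"

SKILLS: Dict[str, Callable[[LabState], str]] = {
    "prepare_solution": prepare_solution,
    "set_path_length": set_path_length,
    "record_measurement": record_measurement,
}

def orchestrator(state: LabState, goal: str) -> str:
    """Stand-in for the higher-level agent: maps a sub-goal to a skill agent."""
    action = SKILLS[goal](state)
    state.log.append(action)
    return action

if __name__ == "__main__":
    state = LabState()
    for goal in ["prepare_solution", "set_path_length", "record_measurement"]:
        orchestrator(state, goal)
    print(state.log)
```

The project would replace each stand-in function with a trained (or prompted) model and the dictionary lookup with a learned orchestration policy, then compare the resulting traces against held-out student data.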

Requirements:

  • Interest in: Multi-Agent Systems, Reinforcement Learning, NLP, Educational Technology
  • Skills: Python, basic Reinforcement Learning, familiarity with Hugging Face TRL

Level: Master

Supervision: Bahar Radmehr (PhD student)

Project 2: Are Students Lazy Experts? Reframing LLM Alignment as Learning When and How to Depart from the Default Policy

Large Language Models (LLMs) used as simulated learners must be aligned not just to a generic “student” but to specific behavioural profiles capturing the diversity of real student behaviour. Standard alignment approaches such as DPO and GRPO adjust the policy globally and are constrained by a KL penalty against a reference model. This works well when the target profile is close to the model’s default, but struggles with distant profiles, such as students who perform weak or cluttered explorations, or who progress unusually slowly; these are exactly the profiles whose presence is essential for a realistic simulated-learner population. Despite this need, no alignment method has been designed with simulated learners in mind or offers explicit control over where and how the policy should deviate from its default behaviour.

Objectives:

  • Develop an alignment method that keeps the default policy intact and learns a lightweight gating policy that decides, at each step, whether to act from the default or from a profile-specific deviation head.
  • Investigate whether selective deviation improves profile recovery on distant profiles compared to standard alignment approaches under matched compute.
  • Design and evaluate a deviation budget parameterised as a function of profile distance, exploring whether this relationship can be learned from limited student data rather than hand-tuned.
  • Assess whether selective deviation preserves default-policy behaviour on profile-irrelevant aspects (e.g., valid action generation, format adherence), reducing the capability degradation typically seen when lowering the KL penalty.
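
The gating mechanism described above can be sketched in a few lines. Everything here is a hypothetical stand-in: the default policy and deviation head would be a frozen LLM and a trained profile-specific head, and the gate a learned classifier over the current state; the sketch only shows the control flow of per-step selective deviation under a budget.

```python
# Toy sketch of selective deviation: at each step a gate decides whether to
# act from the frozen default policy or a profile-specific deviation head,
# subject to a deviation budget. All components are illustrative stand-ins.

def default_policy(step: int) -> str:
    # Stand-in for the frozen base model's action.
    return "explore_systematically"

def deviation_head(step: int, profile: dict) -> str:
    # Stand-in for a profile-specific head.
    return profile["preferred_action"]

def gate(step: int, profile: dict, budget_left: int) -> bool:
    # A learned gate would score the state; here we deviate only while the
    # budget allows and the profile is far from the default (threshold 0.5
    # chosen arbitrarily for the sketch).
    return budget_left > 0 and profile["distance"] > 0.5

def rollout(profile: dict, horizon: int = 10, budget: int = 3) -> list:
    actions, budget_left = [], budget
    for step in range(horizon):
        if gate(step, profile, budget_left):
            actions.append(deviation_head(step, profile))
            budget_left -= 1
        else:
            actions.append(default_policy(step))
    return actions

if __name__ == "__main__":
    distant = {"distance": 0.9, "preferred_action": "explore_erratically"}
    close = {"distance": 0.1, "preferred_action": "explore_erratically"}
    print(rollout(distant))  # deviates until the budget is spent
    print(rollout(close))    # stays on the default policy throughout
```

The third objective corresponds to making `budget` a learned function of `profile["distance"]` rather than the fixed constant used here.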

Requirements:

  • Interest in: LLM Alignment, Reinforcement Learning, Simulated Learners, Educational Technology
  • Skills: Python, basic RL, familiarity with Hugging Face TRL

Level: Master

Supervision: Bahar Radmehr (PhD student)

Project 3: Disentangling Prompt and Knowledge Uncertainty in LLM Reasoning for Interactive Feedback

Large Language Models (LLMs) are increasingly used to provide students with feedback and support follow-up interaction, but they can produce fluent responses that are incorrect, misleading, or based on flawed reasoning. This uncertainty may come from an ambiguous or underspecified prompt, also known as aleatoric uncertainty, or from missing knowledge, also known as epistemic uncertainty. However, current uncertainty-estimation methods rarely examine where uncertainty appears inside the reasoning traces of thinking models. The goal of this project is to adapt uncertainty-estimation methods to LLM reasoning traces in student-facing feedback systems, identify uncertain reasoning steps, distinguish prompt-related from knowledge-related uncertainty, and communicate this information to learners to support more informed use and appropriate trust in the model’s output.

Objectives:

  • Develop a controlled evaluation setup that uses variations of student questions, feedback prompts, and available course context to separate prompt-related uncertainty from knowledge-related uncertainty in reasoning traces.
  • Adapt existing uncertainty methods to reasoning steps, using signals such as token entropy, top-token margin, and hidden-state features.
  • Build lightweight predictors that estimate the uncertainty type and provide a test-time uncertainty score.
  • Evaluate whether targeted interventions, such as structured prompting or added evidence, improve feedback quality and interactive response correctness.
  • Explore how to communicate uncertainty to students through confidence levels, likely causes, and suggested next actions.
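
Two of the token-level signals mentioned above, token entropy and top-token margin, can be sketched directly. This is a simplified illustration with hand-picked probability vectors; in the actual project these distributions would come from the model's logits over its vocabulary, and the aggregation per reasoning step would be learned rather than fixed.

```python
# Sketch of step-level uncertainty signals: token entropy and top-token
# margin, aggregated over the tokens of one reasoning step.
import math

def token_entropy(probs):
    """Shannon entropy (nats) of one token's probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def top_token_margin(probs):
    """Gap between the two most likely tokens; small margin = uncertain."""
    top1, top2 = sorted(probs, reverse=True)[:2]
    return top1 - top2

def step_uncertainty(step_token_probs):
    """Aggregate token-level signals over one reasoning step."""
    ents = [token_entropy(p) for p in step_token_probs]
    margins = [top_token_margin(p) for p in step_token_probs]
    return {"mean_entropy": sum(ents) / len(ents), "min_margin": min(margins)}

if __name__ == "__main__":
    # Confident step: probability mass concentrated on one token per position.
    confident = [[0.9, 0.05, 0.05], [0.8, 0.1, 0.1]]
    # Uncertain step: mass spread across competing tokens.
    uncertain = [[0.4, 0.35, 0.25], [0.34, 0.33, 0.33]]
    print(step_uncertainty(confident))
    print(step_uncertainty(uncertain))
```

Separating prompt-related from knowledge-related uncertainty would then rest on how these scores shift under controlled variations of the prompt versus the available course context.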
Requirements:

  • Interest in: LLM reasoning, uncertainty estimation, educational technology
  • Skills: Python, machine learning, familiarity with Hugging Face and PyTorch

Level: Master

Supervision: Fares Fawzi (PhD Student)


Project 4: Fading AI Scaffolding for Reflective Writing Skill Transfer

AI-supported tools for reflective writing previously developed and evaluated in the lab, such as Pensée, MindMate, and Reflectium, have demonstrated clear benefits in improving the depth and structure of learners’ reflections during use. However, these gains often fail to transfer once the support is removed, with learners reverting to near-baseline performance. This limitation highlights a key design gap: current systems treat scaffolding as a static or binary feature rather than a temporary support that should gradually diminish as learners gain competence. While the concept of fading support has been discussed in adjacent research, it has not yet been systematically designed or evaluated in the context of reflective writing, where metacognitive skill development requires a careful transition to independence.

The goal of this project is to design and study a system that gradually decreases the level of AI support during reflective writing, with the aim of promoting durable skill transfer. Building on existing conversational agent–based frameworks (e.g., cognitive process theory–based scaffolding), the project will investigate how learners adapt to decreasing levels of support and what design features enable a successful transition to independent reflection.

Objectives:

  • Design a reflective writing system that implements staged reduction of AI support (e.g., from full conversational scaffolding to minimal prompts to no support).
  • Investigate how learners perceive and respond to the gradual withdrawal of scaffolding, including the strategies they develop to maintain reflection quality.
  • Examine whether fading support increases learners’ metacognitive awareness and sense of ownership over their reflections.
  • Develop and evaluate NLP models to be able to provide reflective writing support.
  • Conduct a user study (likely qualitative, e.g., multi-session use with think-aloud protocols and interviews) to inform the design of future large-scale classroom evaluations.

Requirements:

  • Interest in: LLMs, writing support with AI, human-computer interaction, educational technology
  • Skills: NLP and machine learning (front-end experience is a bonus)

Level: Master

Project 5: In-the-Moment Capture for Enhancing Reflective Writing

Reflective writing in vocational and experiential learning contexts typically relies on late retrospective recall, often hours or days after an event. This delay introduces well-documented limitations, including memory decay and reconstruction biases, which reduce the richness and accuracy of reflections. While prior work in vocational education emphasizes the importance of reflection-in-action alongside reflection-on-action, existing AI-supported reflection tools focus almost exclusively on the latter. As a result, learners miss the opportunity to capture immediate observations, emotions, and questions that could serve as valuable input for later reflection.

This project addresses this gap by designing a lightweight system for in-situ capture of learner experiences and integrating these captures into subsequent reflective writing. By bridging real-time experience capture with structured reflection, the project explores how grounding reflections in authentic, moment-level data affects both the writing process and its outcomes.
Objectives:

  • Design a tool that enables learners to capture brief in-situ inputs (e.g., text, voice, or other modalities) during learning experiences.
  • Integrate captured data into a structured reflective writing interface, augmented with AI-based support.
  • Analyze how learners engage with the capture tool, including the types of inputs they produce (e.g., observational, emotional, or question-based).
  • Develop and evaluate NLP models to be able to provide reflective writing support.
  • Conduct qualitative or mixed-method studies (depending on access and classroom availability) to inform future controlled experiments in authentic educational settings.

Requirements:

  • Interest in: LLMs, writing support with AI, human-computer interaction, educational technology
  • Skills: NLP and machine learning (front-end experience is a bonus)

Level: Master