Student Projects

On this page you can find our offerings for Master’s Projects and Master and Bachelor Research Projects in the realm of data mining and machine learning for (vocational) education ecosystems for the autumn semester 2026. Please note that this list of projects is not exhaustive and will be updated over the coming days with further exciting projects.

Last update: 11.05.2026

How to apply

Please apply via our student project application form. You will need to specify which project(s) you are interested in, why you are interested, and whether you have any relevant experience in this area. To access the form, you need to log in with your EPFL email address. If you would like to receive more information on our research, do not hesitate to contact us! Students who are interested in doing a project are encouraged to read the Thesis & Project Guidelines, which explain what you can expect from us and what we expect from students.

The student project application form will remain open for submissions until the late deadline of 15 August 2026. Applications will be reviewed on an ongoing basis, so we strongly encourage you to apply as soon as possible. Early applicants will be considered starting in calendar week 21, with the early deadline set for 14 May 2026:

  • Early deadline (first round): 14.05.2026
  • First contact with supervisors: 18.05.2026 – 24.05.2026
  • Late deadline: 15.08.2026

External students: Non-EPFL students are kindly requested to contact the project supervisor(s) by e-mail.

Please note that the list of topics below is not exhaustive. If you have other ideas or proposals, you are welcome to contact a senior member of the lab to discuss a tailor-made topic.

Project 1: Hierarchical Multi-Agent Systems for Simulated Learners in Inquiry-Based Environments

Large Language Models (LLMs) are increasingly used as simulated learners to support the development and evaluation of educational technologies. A central challenge is aligning agent behaviour with authentic student behaviour in inquiry-based environments such as Beer’s Law Lab, a virtual lab where students investigate Beer’s Law by varying solution characteristics and observing light absorbance. Here the action space is continuous, which amplifies the data problem, since the space of plausible behaviours is vast and only sparsely covered by available student traces. Prior work on LLM-based simulated learners has not addressed the design of architectures for handling these continuous action spaces.
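
To make the continuous action space concrete, the sketch below models a single learner step under the Beer-Lambert law, A = ε · l · c. The `LabAction` class, its field names, and the molar absorptivity value are illustrative assumptions and do not reflect the actual interface of Beer’s Law Lab.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LabAction:
    """One hypothetical learner action; both fields are real-valued."""
    concentration_mol_per_l: float  # solution concentration c
    path_length_cm: float           # cuvette width l


def absorbance(action: LabAction, molar_absorptivity: float) -> float:
    """Beer-Lambert law: A = epsilon * l * c."""
    return (
        molar_absorptivity
        * action.path_length_cm
        * action.concentration_mol_per_l
    )


# Illustrative value only; real absorptivities depend on solute and wavelength.
EPSILON = 100.0  # L / (mol * cm)

a1 = absorbance(LabAction(0.010, 1.0), EPSILON)
a2 = absorbance(LabAction(0.020, 1.0), EPSILON)
```

Because each field can take any value within its range, two student traces rarely visit exactly the same states, which is one way to see why real traces cover the space only sparsely.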

Objectives:

  • Design a hierarchical multi-agent system consisting of small specialised models for individual environment components (e.g., preparing solutions, setting path length, recording measurements), coordinated by a higher-level orchestrator agent.
  • Investigate whether decomposing student behaviour into component-level skills improves behavioural alignment with real student data compared to a single LLM agent baseline.
  • Determine whether fine-tuning is required for lower-level skill models or whether in-context learning suffices, and identify effective data sources for training.
  • Evaluate the full system against a single-agent baseline on alignment metrics (behavioural similarity to held-out student traces) and task-level outcomes.

Requirements:

  • Interest in: Multi-Agent Systems, Reinforcement Learning, NLP, Educational Technology
  • Skills: Python, basic Reinforcement Learning, familiarity with Hugging Face TRL

Level: Master

Supervision: Bahar Radmehr (PhD student)

Project 2: Are Students Lazy Experts? Reframing LLM Alignment as Learning When and How to Depart from the Default Policy

Large Language Models (LLMs) used as simulated learners must be aligned not just to a generic “student” but to specific behavioural profiles capturing the diversity of real student behaviour. Standard alignment approaches such as DPO and GRPO adjust the policy globally and are constrained by a KL penalty against a reference model. This works well when the target profile is close to the model’s default, but struggles with distant profiles, such as students who perform weak or cluttered explorations or who progress unusually slowly. These are exactly the profiles whose presence is essential for a realistic simulated-learner population. Despite this need, no alignment method has been designed with simulated learners in mind or offers explicit control over where and how the policy should deviate from its default behaviour.

Objectives:

  • Develop an alignment method that keeps the default policy intact and learns a lightweight gating policy that decides, at each step, whether to act from the default or from a profile-specific deviation head.
  • Investigate whether selective deviation improves profile recovery on distant profiles compared to standard alignment approaches under matched compute.
  • Design and evaluate a deviation budget parameterised as a function of profile distance, exploring whether this relationship can be learned from limited student data rather than hand-tuned.
  • Assess whether selective deviation preserves default-policy behaviour on profile-irrelevant aspects (e.g., valid action generation, format adherence), reducing the capability degradation typically seen when lowering the KL penalty.
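
One way to read the first objective is as a per-step mixture of two action distributions. The sketch below is a minimal illustration under that assumption; `gated_policy`, `gate_prob`, and the toy action names are hypothetical and do not describe the method the project will develop.

```python
import random


def gated_policy(default_probs, deviation_probs, gate_prob):
    """Mix two action distributions.

    gate_prob is the probability that the gate hands control to the
    deviation head at this step; the default policy itself is untouched.
    """
    if not 0.0 <= gate_prob <= 1.0:
        raise ValueError("gate_prob must be in [0, 1]")
    actions = set(default_probs) | set(deviation_probs)
    return {
        a: (1.0 - gate_prob) * default_probs.get(a, 0.0)
        + gate_prob * deviation_probs.get(a, 0.0)
        for a in actions
    }


def sample_action(probs, rng):
    """Draw one action from a dict mapping action -> probability."""
    actions = sorted(probs)
    return rng.choices(actions, weights=[probs[a] for a in actions], k=1)[0]


# Toy example: the default policy mostly records data, while the
# (hypothetical) profile head mostly keeps changing the solution.
default = {"record_measurement": 0.8, "change_concentration": 0.2}
deviation = {"record_measurement": 0.1, "change_concentration": 0.9}

mixed = gated_policy(default, deviation, gate_prob=0.25)
action = sample_action(mixed, random.Random(0))
```

With `gate_prob=0.0` the mixture reduces to the default policy, so behaviour on profile-irrelevant steps is preserved by construction. Averaging `gate_prob` over an episode is one possible, assumed way to quantify a deviation budget.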

Requirements:

  • Interest in: LLM Alignment, Reinforcement Learning, Simulated Learners, Educational Technology
  • Skills: Python, basic RL, familiarity with Hugging Face TRL

Level: Master

Supervision: Bahar Radmehr (PhD student)

Project 3: Disentangling Prompt and Knowledge Uncertainty in LLM Reasoning for Interactive Feedback

Large Language Models (LLMs) are increasingly used to provide students with feedback and support follow-up interaction, but they can produce fluent responses that are incorrect, misleading, or based on flawed reasoning. This uncertainty may come from an ambiguous or underspecified prompt, also known as aleatoric uncertainty, or from missing knowledge, also known as epistemic uncertainty. However, current uncertainty-estimation methods rarely examine where uncertainty appears inside the reasoning traces of thinking models. The goal of this project is to adapt uncertainty-estimation methods to LLM reasoning traces in student-facing feedback systems, identify uncertain reasoning steps, distinguish prompt-related from knowledge-related uncertainty, and communicate this information to learners to support more informed use and appropriate trust in the model’s output.

Objectives:

  • Develop a controlled evaluation setup that uses variations of student questions, feedback prompts, and available course context to separate prompt-related uncertainty from knowledge-related uncertainty in reasoning traces.
  • Adapt existing uncertainty methods to reasoning steps, using signals such as token entropy, top-token margin, and hidden-state features.
  • Build lightweight predictors that estimate the uncertainty type and provide a test-time uncertainty score.
  • Evaluate whether targeted interventions, such as structured prompting or added evidence, improve feedback quality and interactive response correctness.
  • Explore how to communicate uncertainty to students through confidence levels, likely causes, and suggested next actions.
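
The token-level signals named in the second objective can be sketched as follows. This is a minimal illustration on made-up logits, assuming per-token next-token logits are available for each reasoning step; the function names and the choice to average over a step are assumptions, not the project's prescribed method.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def token_entropy(probs):
    """Shannon entropy (in nats) of one next-token distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def top_token_margin(probs):
    """Gap between the two most likely tokens; a small gap suggests uncertainty."""
    top = sorted(probs, reverse=True)
    return top[0] - top[1]


def step_signals(step_logits):
    """Average both signals over the tokens of one reasoning step."""
    dists = [softmax(token_logits) for token_logits in step_logits]
    n = len(dists)
    return {
        "mean_entropy": sum(token_entropy(d) for d in dists) / n,
        "mean_margin": sum(top_token_margin(d) for d in dists) / n,
    }


# Toy logits: two tokens per step over a 3-word vocabulary (made-up numbers).
confident_step = [[8.0, 0.0, 0.0], [7.0, 0.5, 0.0]]
uncertain_step = [[1.0, 0.9, 0.8], [0.5, 0.5, 0.4]]

confident = step_signals(confident_step)
uncertain = step_signals(uncertain_step)
```

Step-level features like these could feed the lightweight predictors mentioned in the third objective. On their own they do not separate prompt-related from knowledge-related uncertainty.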

Requirements:

  • Interest in: LLM reasoning, uncertainty estimation, educational technology
  • Skills: Python, machine learning, familiarity with Hugging Face and PyTorch

Level: Master

Supervision: Fares Fawzi (PhD Student)

Project 4: Fading AI Scaffolding for Reflective Writing Skill Transfer

AI-supported tools for reflective writing previously developed and evaluated in the lab, such as Pensée, MindMate, and Reflectium, have demonstrated clear benefits in improving the depth and structure of learners’ reflections during use. However, these gains often fail to transfer once the support is removed, with learners reverting to near-baseline performance. This limitation highlights a key design gap: current systems treat scaffolding as a static or binary feature rather than a temporary support that should gradually diminish as learners gain competence. While the concept of fading support has been discussed in adjacent research, it has not yet been systematically designed or evaluated in the context of reflective writing, where metacognitive skill development requires careful transition to independence.

The goal of this project is to design and study a system that gradually decreases the level of AI support during reflective writing, with the aim of promoting durable skill transfer. Building on existing conversational agent–based frameworks (e.g., cognitive process theory–based scaffolding), the project will investigate how learners adapt to decreasing levels of support and what design features enable a successful transition to independent reflection.

Objectives:
  • Design a reflective writing system that implements staged reduction of AI support (e.g., from full conversational scaffolding to minimal prompts to no support).
  • Investigate how learners perceive and respond to the gradual withdrawal of scaffolding, including the strategies they develop to maintain reflection quality.
  • Examine whether fading support increases learners’ metacognitive awareness and sense of ownership over their reflections.
  • Develop and evaluate NLP models that provide reflective writing support.
  • Conduct a user study (likely qualitative, e.g., multi-session use with think-aloud protocols and interviews) to inform the design of future large-scale classroom evaluations.

Requirements:

Interest in: LLMs, writing support with AI, human-computer interaction, educational technology

Skills: NLP and machine learning (knowing front-end is a bonus)

Level: Master

Supervision: Seyed Parsa Neshaei (PhD Student)

Project 5: In-the-Moment Capture for Enhancing Reflective Writing

Reflective writing in vocational and experiential learning contexts typically relies on late retrospective recall, often hours or days after an event. This delay introduces well-documented limitations, including memory decay and reconstruction biases, which reduce the richness and accuracy of reflections. While prior work in vocational education emphasizes the importance of reflection-in-action alongside reflection-on-action, existing AI-supported reflection tools focus almost exclusively on the latter. As a result, learners miss the opportunity to capture immediate observations, emotions, and questions that could serve as valuable input for later reflection.

This project addresses this gap by designing a lightweight system for in-situ capture of learner experiences and integrating these captures into subsequent reflective writing. By bridging real-time experience capture with structured reflection, the project explores how grounding reflections in authentic, moment-level data affects both the writing process and its outcomes.

Objectives:
  • Design a tool that enables learners to capture brief in-situ inputs (e.g., text, voice, or other modalities) during learning experiences.
  • Integrate captured data into a structured reflective writing interface, augmented with AI-based support.
  • Analyze how learners engage with the capture tool, including the types of inputs they produce (e.g., observational, emotional, or question-based).
  • Develop and evaluate NLP models that provide reflective writing support.
  • Conduct qualitative or mixed-methods studies (depending on access and classroom availability) to inform future controlled experiments in authentic educational settings.

Requirements:

Interest in: LLMs, writing support with AI, human-computer interaction, educational technology

Skills: NLP and machine learning (knowing front-end is a bonus)

Level: Master

Supervision: Seyed Parsa Neshaei (PhD Student)

Project 6: Human-in-the-Loop Scenario Authoring for Diagnostic Reasoning in PharmaSim

Generative AI makes it easy to create scenario-based learning (SBL) experiences, but fully AI-generated scenarios often lack pedagogical control, transparency, and alignment with learning goals. This project explores how teachers can be supported in authoring diagnostic reasoning scenarios through a structured human-in-the-loop workflow rather than one-shot generation.

The project proposes a PharmaSim-specific scenario authoring tool that enables pharmacy teachers to create diagnostic client cases, including client profiles, symptoms, possible causes, pedagogical strategies, and transfer scenarios. The system guides educators through pedagogically meaningful authoring steps while exposing intermediate representations that can be inspected, refined, and validated.

Objectives:

  1. Design a scenario authoring workflow for creating diagnostic reasoning scenarios in vocational pharmacy education.
  2. Develop structured authoring components for:
    • client cases and contextual information
    • possible causes and likelihoods
    • pedagogical scaffolding strategies
    • transfer scenarios
    • interaction and information-release logic
  3. Implement AI-supported generation of intermediate scenario representations that teachers can iteratively refine.
  4. Support multiple pharmacist-agent pedagogical styles, including structuring, problematizing, hybrid, and no-scaffolding modes.
  5. Investigate whether structured human-in-the-loop authoring improves pedagogical alignment, controllability, and diversity of generated scenarios.

Requirements:

Interest in: Generative AI, Human-Computer Interaction, Educational Technology, Scenario-Based Learning

Skills: Unity and C#, familiarity with LLMs or prompt engineering, experience with UI/UX implementation or conversational systems is a bonus

Level: Master

Supervision: Fatma Betül Güreş (PhD student)

Project 7: Exploring Reasoning Strategies for LLM-Based Problem Generation

LLMs are increasingly used for educational content generation, yet the quality of generated exam problems strongly depends on the reasoning strategy used during prompting. Prior work has explored techniques such as chain-of-thought and reflective prompting for improving reasoning performance, but little attention has been given to how these strategies affect the quality of generated assessment items. In particular, it remains unclear whether approaches such as direct prompting, iterative refinement, or structured reasoning produce questions that are more diverse, creative, or appropriately difficult.
 
This project investigates the impact of different reasoning strategies on LLM-based exam problem generation. Multiple prompting approaches / agentic architectures will be compared across common educational tasks, and the generated questions will be evaluated using metrics such as diversity, correctness, creativity, and difficulty calibration. The goal is to identify which reasoning strategies are most effective for generating high-quality educational assessment content.
 
Objectives:

  • Compare reasoning strategies for LLM-based exam question generation.
  • Evaluate generated problems across quality dimensions (correctness, difficulty, etc.).
  • Analyze the suitability of reasoning approaches for different problem types.
  • Develop an evaluation framework for AI-generated problems.
 
Requirements:
 
Interest in: NLP, agents, evaluation, reasoning
 
Skills: ML/NLP libraries, LLM training and evaluation
 
Bonus: Research experience in NLP, agent-based architectures
 
Level: Master
 
Supervision: Marta Knežević (PhD student)

Project 8: Cross-Lingual On-Policy Self-Distillation for Educational Tasks

Large Language Models (LLMs) used in educational contexts—such as providing feedback, identifying misconceptions, and interactive tutoring—show significant performance gaps between high-resource (e.g., English) and low-resource languages. Existing cross-lingual transfer methods focus on reasoning or general NLU tasks, while educational tasks (tutoring, feedback generation, explanation) remain understudied. Educational tasks present unique challenges: multiple acceptable responses, cultural appropriateness, and pedagogical scaffolding—properties that standard methods for reasoning transfer do not address. Despite this need, no alignment or distillation method has been designed with educational tasks in low-resource languages in mind or offers explicit control over where and how the model should deviate from high-resource behavioral norms.

Objectives:

  • Implement two variants of on-policy self-distillation for pedagogical transfer:
    • Variant A (Language-as-Privilege): the student generates low-resource output, and the teacher is the same model conditioned on the English input. The token-level advantage is computed from log-probability differences.
    • Variant B (Feedback Self-Distillation): the student generates low-resource output, the model generates English feedback evaluating that output, and the teacher conditions on the input plus the English feedback.
  • Develop entropy-aware divergence weighting (forward KL for uncertain tokens, reverse KL for confident tokens) that respects pedagogical flexibility, recognizing that a good tutoring response in Hindi may not be a literal translation of the English response due to cultural norms for politeness, indirectness, and scaffolding.
  • Compare against multiple baselines (zero-shot, translate-test, supervised fine-tuning, and prior cross-lingual distillation methods) on two dimensions: (a) task performance and data efficiency, and (b) preservation of pedagogical quality (feedback appropriateness, cultural sensitivity, dialogue naturalness) in low-resource languages.
  • Assess whether the method avoids degrading high-resource language performance (preventing catastrophic forgetting of English pedagogical capabilities), a known risk when lowering KL constraints in alignment methods.
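
The token-level advantage and the entropy-aware divergence weighting can be sketched on toy distributions as follows. Several choices here are assumptions rather than the project's method: forward KL is taken to mean KL(teacher || student), uncertainty is measured by the teacher's normalized entropy, and the two divergences are blended softly instead of switched per token. All function names are hypothetical.

```python
import math

EPS = 1e-12  # avoids log(0) on toy distributions


def kl(p, q):
    """KL(p || q) for two distributions over the same vocabulary."""
    return sum(
        pi * math.log((pi + EPS) / (qi + EPS))
        for pi, qi in zip(p, q)
        if pi > 0.0
    )


def normalized_entropy(p):
    """Entropy divided by log(vocab size), so the result lies in [0, 1]."""
    h = -sum(pi * math.log(pi) for pi in p if pi > 0.0)
    return h / math.log(len(p))


def entropy_aware_divergence(teacher, student):
    """Blend forward and reverse KL by the teacher's uncertainty.

    Assumed convention: forward KL = KL(teacher || student),
    reverse KL = KL(student || teacher).
    """
    w = normalized_entropy(teacher)  # 1.0 means maximally uncertain
    return w * kl(teacher, student) + (1.0 - w) * kl(student, teacher)


def token_advantage(teacher, student, token_id):
    """Log-probability difference for the token the student sampled."""
    return math.log(teacher[token_id] + EPS) - math.log(student[token_id] + EPS)


# Made-up next-token distributions over a 3-token vocabulary.
teacher = [0.7, 0.2, 0.1]
student = [0.5, 0.3, 0.2]

loss = entropy_aware_divergence(teacher, student)
adv = token_advantage(teacher, student, token_id=0)
```

A positive advantage means the teacher assigns the sampled token a higher probability than the student does. In a real setup the distributions would come from model logits at each position of the student's own generation.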

Requirements:

Interest in: LLM Alignment, Cross-Lingual Transfer, Educational Technology, Pedagogical NLP

Skills: Python, PyTorch, Hugging Face Transformers, familiarity with RL concepts

Level: Master

Supervision: Jiaxu Zhao (postdoc)