On this page you can find our offerings for Master’s Projects and Master and Bachelor Research Projects in the realm of data mining and machine learning for (vocational) education ecosystems for the spring semester 2026. Please note that this list of projects is not exhaustive and will be updated over the coming days with further exciting projects.
Last update: 21.11.2025
How to apply
Please apply via our student project application form. You will need to specify which project(s) you are interested in, why you are interested, and whether you have any relevant experience in this area. To access the form, you need to log in with your EPFL email address. If you would like to receive more information on our research, do not hesitate to contact us! Students who are interested in doing a project are encouraged to have a look at the Thesis & Project Guidelines, which explain what you can expect from us and what we expect from students.
The student project application form will remain open for submissions until the late deadline of 31 January 2026. Applications will be reviewed on an ongoing basis, so we strongly encourage you to apply as soon as possible. Early applicants will be considered starting in calendar week 49, with the early deadline set for 02 December 2025:
- Early deadline (first round): 02.12.2025
- First contact with supervisors: 04.12.2025 – 12.12.2025
- Late deadline: 31.01.2026
External students: Non-EPFL students are kindly requested to get in touch with the project supervisor(s) by e-mail.
Please note that the list of topics below is not exhaustive. If you have other ideas or proposals, you are welcome to contact a senior member of the lab to discuss possibilities for a tailor-made topic.
Project 1: Behavioural and Confidence Profiles in Interactive Science Simulations
Project 2: Joint Semantic Modeling of Information Sources and Student Writing Processes
Producing a coherent written product from multiple information sources is cognitively demanding, particularly for younger learners. Understanding how students navigate sources, take notes, and iteratively revise their texts offers a window into their self-regulated learning (SRL) processes. Despite growing interest in SRL analytics, the computational modeling of relationships between learners' text edits, their navigation behavior, and the content of the materials they consult remains underdeveloped.
This ML4ED project investigates multimodal trace data from a project-based learning unit in primary school. The dataset includes over 600 students' evolving magazine articles, their notes, navigation logs, and the 100 text and video resources they consulted. Your task is to design and evaluate methods that connect what students wrote, when they wrote it, and which information sources they accessed. The goal is to predict external SRL ratings of students from this perspective on their learning process. You will work with techniques from Natural Language Processing, representation learning, and sequence analysis. Depending on your chosen direction, the project may also involve large vision-language models or multimodal embedding techniques.
The project will focus on:
- Semantic and structural alignment between sources and writings
- Modeling temporal writing behavior
- Predicting SRL labels from the joint modeling of text semantics and student learning actions
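As a rough illustration of what "semantic alignment between sources and writings" could mean at its simplest, the sketch below ranks sources by bag-of-words cosine similarity to a student draft. This is only an assumed baseline, not the method used in the project: the function names, the source ids, and the toy texts are all hypothetical, and a real approach would likely use learned text embeddings instead of word counts.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def align_to_sources(student_text, sources):
    """Rank sources by lexical similarity to a student text.

    `sources` maps a (hypothetical) source id to its text.
    Returns a list of (source_id, similarity), most similar first.
    """
    student_vec = Counter(tokenize(student_text))
    scores = [
        (source_id, cosine_similarity(student_vec, Counter(tokenize(text))))
        for source_id, text in sources.items()
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy, made-up example data (not taken from the project dataset).
sources = {
    "volcano_article": "Volcanoes erupt when magma rises through the crust.",
    "bee_video_transcript": "Bees collect nectar and pollinate flowers.",
}
draft = "My article explains how magma rises and volcanoes erupt."
ranking = align_to_sources(draft, sources)
print(ranking[0][0])  # id of the source most similar to the draft
```

Applying such a ranking to each revision of a draft would give a simple time series of which sources a text resembles, which is one possible starting point for the temporal modeling mentioned above.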
Requirements:
- Proficiency in: Python, NLP
- Interest in: Learning Sciences, Educational Data Mining
- Level: Master
Supervision: Dominik Glandorf (PhD student)
Project assigned:
Project 3: Aligning VLM behavior with human-like exploration patterns
Recent advances in vision-language models (VLMs) have enabled them to operate within interactive, open-ended simulation environments. However, little is known about how these models explore, learn, and adapt in such environments compared to humans. This project aims to systematically analyze VLM behavior in open-ended learning simulations.
We will evaluate how multimodal models interact with and navigate simulated environments, characterizing their decision-making patterns, exploration trajectories, and learning dynamics. These behavioral patterns will be compared to human exploration data to identify similarities between artificial and human learners.
Building on these insights, the project will develop methods to align VLM behavior with human-like exploration patterns. By leveraging human exploration sequences and behavioral priors, we aim to create VLMs that can mimic key aspects of human learning behavior in order to generate synthetic exploration traces of human learners.
Required Skills:
- Experience with VLMs
- Proficiency in: Python; ML/NLP libraries
- Bonus: Research experience in NLP
Further details: This project is intended for one MSc student (semester project or thesis) with a technical background in NLP. The student will be supervised by PhD student Marta Knežević.
Project 4: Understanding and Diagnosing Faithfulness in Feedback-Generating Language Models Using Sparse Autoencoders
Large language models (LLMs) are increasingly used to give students personalised, structured feedback. However, smaller open-source models are still prone to making mistakes: they often miss key errors in a student's solution, offer irrelevant next steps, or mix up different types of advice. As a result, the feedback they produce is not always faithful to what the student actually wrote.
Recent interpretability methods, especially Sparse Autoencoders (SAEs), let us look inside LLMs and examine which internal features activate when they generate different parts of feedback. This project investigates whether these tools can help explain why small LLMs produce unfaithful feedback, and which internal patterns are linked to helpful or unhelpful recommendations.
Objectives:
- Use pretrained SAEs and interpretability tools (e.g., Neuronpedia, circuit-tracer) to analyse how a small LLM processes student answers and generates feedback.
- Identify internal features that correlate with:
  - accurate vs inaccurate detection of student mistakes
  - faithful vs unfaithful next-step recommendations
- Build simple diagnostic tools (e.g., lightweight classifiers or clustering scripts) that automatically flag likely feedback issues based on model activations.
- Develop a reusable analysis toolkit to support future research on improving the faithfulness and transparency of LLM-generated feedback.
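To make the idea of a "lightweight classifier" on model activations more concrete, here is a minimal sketch of a nearest-centroid classifier that labels a feature-activation vector as faithful or unfaithful. Everything in it is an assumption for illustration: the three-dimensional vectors stand in for SAE feature activations, the labels and values are made up, and the project itself may well use a different diagnostic method.

```python
def centroid(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def squared_distance(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def fit_centroids(activations, labels):
    """Compute one centroid per label from labelled activation vectors."""
    grouped = {}
    for vector, label in zip(activations, labels):
        grouped.setdefault(label, []).append(vector)
    return {label: centroid(vectors) for label, vectors in grouped.items()}


def flag(vector, centroids):
    """Return the label whose centroid is closest to the given vector."""
    return min(centroids, key=lambda label: squared_distance(vector, centroids[label]))


# Toy, made-up activations (hypothetical stand-ins for SAE features).
train_activations = [
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.1],
    [0.1, 0.7, 0.9],
    [0.0, 0.9, 0.8],
]
train_labels = ["faithful", "faithful", "unfaithful", "unfaithful"]

centroids = fit_centroids(train_activations, train_labels)
prediction = flag([0.2, 0.8, 0.7], centroids)
print(prediction)
```

A classifier this simple is mainly useful as a sanity check: if even class centroids separate faithful from unfaithful feedback, the underlying features are probably worth inspecting more closely.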
Requirements:
- Interest in: Interpretability, NLP, Educational Technology, Model Behaviour Analysis
- Skills: Python, basic machine learning, familiarity with Hugging Face and PyTorch
Project 5: Soft-Skills and Diagnostic Reasoning in a Virtual Pharmacy (PharmaSim VR)
- Design a VR pharmacy environment with multiple virtual customers waiting in line.
- Implement varied game character personalities (e.g., impatient, anxious, annoyed, calm) and behaviors.
- Use LLMs to generate natural, adaptive client dialogue.
- Log interactions for later analysis.
Requirements:
- Interest in: Educational technology, VR, Games, Simulations
- Skills: Unity (C#), NLP, LLM
- Bonus: Experience with VR development, conversational agents
Project 6: Exploring models and interfaces to support reflection and metacognition
Reflection is a key component of deep learning and metacognitive growth: it helps learners identify their strengths and weaknesses and improve over time. Our lab has conducted research on several aspects of reflection strategies and reflective writing, and has prepared multiple datasets of reflective writing by vocational students. This project explores how personal, collaborative, or group-based reflection can be supported through digital and AI-powered tools, expanding the traditional view of reflection as solely a written, individual activity.
You will begin by reviewing recent research on reflection and its mediation through digital platforms. The project may involve one of two aspects: A) designing NLP pipelines to understand the attributes and characteristics of reflections, with the aim of providing relevant support to students, including analyzing log data from interventions previously conducted by the lab in Swiss classrooms, or B) prototyping or evaluating modalities beyond text, such as voice-based or multimodal reflection, on-the-fly concept mapping, and immersive reflection support using AR/VR technologies. The goal is to investigate how technology can enhance shared reflective processes in educational contexts. The direction of the project can include model evaluation/analysis, system design, developing necessary ML/NLP models, and/or technical prototyping.
The project will focus on several of the aspects below:
- Reviewing research on AI-mediated reflection
- Developing necessary ML/NLP models and conducting NLP evaluations
- Designing and/or evaluating interaction methods + investigating educational applications of reflection tools
… with the goal of a CHI or ACL/EMNLP submission.
Requirements: Proficiency in: basic ML/NLP (bonus: front-end development experience)
Interest in: Learning Sciences, HCI, interaction design, AI in Education, Text Retrieval and Feedback Generation
Level: Master
Supervision: Seyed Parsa Neshaei (PhD student)