On this page you can find our offerings for Master’s Projects and Master and Bachelor Research Projects in the realm of data mining and machine learning for (vocational) education ecosystems for the autumn semester 2025. Please note that this list of projects is not exhaustive and will be updated over the coming days with further exciting projects.
Last update: 02.07.2025
How to apply
Please apply via our student project application form. You will need to specify which project(s) you are interested in, why you are interested, and whether you have any relevant experience in this area. To access the form, you need to log in with your EPFL email address. If you would like to receive more information on our research, do not hesitate to contact us! Students who are interested in doing a project are encouraged to have a look at the Thesis & Project Guidelines, which explain what you can expect from us and what we expect from students.
The student project application form will remain open for submissions until the late deadline of 15 August 2025. Applications will be reviewed on an ongoing basis, so we strongly encourage you to apply as soon as possible. Early applicants will be considered starting in calendar week 21, with the early deadline set for 23 May 2025:
- Early deadline: 23.05.2025
- First contact with supervisor: between 26.05.2025 and 03.05.2025
- Late deadline: 15.08.2025
External students: Non-EPFL students are kindly requested to get in touch with the project supervisor(s) by e-mail.
Please note that the list of topics below is not exhaustive. If you have other ideas or proposals, you are welcome to contact a senior member of the lab to discuss possibilities for a tailor-made topic.
Project 1: Improving Tool Use and Reasoning in Small Language Models for Interactive Feedback via RLHF
While supervised fine-tuning improves the actionability and relevance of responses, gains in tool relevance, correctness, and multi-step reasoning remain limited. To address this, we aim to incorporate Reinforcement Learning from Human Feedback (RLHF) to optimize models for more faithful tool use, accurate reasoning, and higher-quality responses.
Objectives:
- Design a custom RLHF pipeline for tool-augmented models, including a reward model that scores tool relevance, factual accuracy, and reasoning quality.
- Improve performance on reasoning, tool selection, and correctness, particularly in smaller models (≤8B) where these capabilities are underdeveloped.
- Evaluate the impact of RLHF and compare it with supervised fine-tuning
- Interest in: Natural Language Processing, Human-AI Feedback, Educational Technology
- Skills: Python, Hugging Face Transformers, Machine Learning
Level: Master
Supervision: Fares Fawzi (PhD student)
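To make the reward-model objective of Project 1 more concrete, here is a minimal sketch of how three per-criterion scores (tool relevance, factual accuracy, reasoning quality) could be folded into one scalar reward. The weights, the names `RewardWeights` and `combined_reward`, and the assumption that each score lies in [0, 1] are purely illustrative; the actual reward design is part of the project and is not specified here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardWeights:
    # Hypothetical weights; the project description does not specify any.
    tool_relevance: float = 0.4
    factual_accuracy: float = 0.4
    reasoning_quality: float = 0.2


def combined_reward(scores: dict[str, float],
                    weights: RewardWeights = RewardWeights()) -> float:
    """Return the weighted mean of three criterion scores in [0, 1]."""
    parts = {
        "tool_relevance": weights.tool_relevance,
        "factual_accuracy": weights.factual_accuracy,
        "reasoning_quality": weights.reasoning_quality,
    }
    for name in parts:
        # Reject scores outside the assumed [0, 1] range.
        if not 0.0 <= scores[name] <= 1.0:
            raise ValueError(f"score for {name!r} must be in [0, 1]")
    total_weight = sum(parts.values())
    return sum(scores[name] * w for name, w in parts.items()) / total_weight


if __name__ == "__main__":
    example = {
        "tool_relevance": 1.0,
        "factual_accuracy": 0.5,
        "reasoning_quality": 0.0,
    }
    print(round(combined_reward(example), 3))
```

In a real pipeline the per-criterion scores would come from a learned reward model or from human annotations rather than being supplied by hand.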
Project assigned: Project 2: Learning Contextualized Vector Representations of Self-Regulated Learning Behavior
- generalizable tokenization of log stream events
- methods to pre-train and fine-tune contextualized vector representations
- prediction of student characteristics from the representations
- Proficiency in: Python, NLP (transformers), Machine Learning
- Interest in: Learning Sciences, Student Modeling and NLP for Education
Level: Master
Supervision: Dominik Glandorf (PhD Student)
Project 3: Exploring Retrieval and NLP Approaches for Reflective Writing Support
Reflective writing fosters metacognitive and self-regulated learning, yet students often struggle to articulate meaningful reflections. This project investigates how modern NLP techniques, such as retrieval-augmented generation (RAG), topic modeling, and large language models, can support metacognitive writing, e.g., reflective writing, in educational settings.
You will begin by reviewing relevant retrieval and generation approaches and apply selected models to a dataset of student reflections. Models will be evaluated based on how well they retrieve meaningful prior reflections and generate helpful suggestions, using a combination of evaluation methods. Promising approaches may be further developed into tools or prototypes for educational use.
The project will focus on:
- Reviewing NLP approaches for reflective and educational text
- Applying and comparing models (e.g., RAG, BERTopic, LLMs)
- Evaluating reflection retrieval and suggestion quality
- Exploring potential educational applications or interfaces
… with the possible goal of a submission to an NLP venue (e.g., ACL, EMNLP).
Requirements:
- Proficiency in: Python, NLP (e.g., Transformers, topic modeling, retrieval)
- Interest in: Learning Sciences, Reflective Writing, NLP for Education
Level: Master
Supervision: Seyed Parsa Neshaei (PhD student)
Project assigned: Project 4: Exploring AI-Supported Collaborative Reflection in Learning Environments
Reflection is a key component of deep learning and metacognitive growth, but it doesn't always occur in isolation or through writing alone. This project explores how collaborative and group-based reflection can be supported through digital and AI-powered tools, expanding the traditional view of reflection as solely a written, individual activity.
You will begin by reviewing recent research on collaborative reflection and its mediation through digital platforms. The project may involve prototyping or evaluating modalities beyond text, such as voice-based or multimodal reflection, on-the-fly concept mapping, or even immersive reflection support using AR/VR technologies. The goal is to investigate how technology can enhance shared reflective processes in educational contexts. The direction of the project can include system design, empirical analysis, developing necessary ML/NLP models, and technical prototyping.
The project will focus on:
- Reviewing research on collaborative and multimodal reflection + AI-mediated reflection beyond written modalities
- Developing necessary ML/NLP models for the backbone of the system
- Designing and/or evaluating interaction methods + investigating educational applications of shared reflection tools
… with the possible goal of an HCI (e.g., CHI) or an AI in education (e.g., AIED) submission.
Requirements:
- Proficiency in: basic ML/NLP (bonus: front-end development experience)
- Interest in: Learning Sciences, HCI, interaction design, AI in Education
Level: Master
Supervision: Seyed Parsa Neshaei (PhD student)
Project 5: Federated RLHF for LLM Training on an AI Learning Platform
This project lies at the intersection of privacy-preserving machine learning, LLM training, and educational technology. Large Language Models (LLMs), such as GPT-4 and LLaMA, have transformed AI-driven education, but tailoring these models to align with user preferences while ensuring data privacy remains a critical challenge. Federated Learning (FL) combined with Reinforcement Learning from Human Feedback (RLHF) offers a unique solution to this problem, enabling distributed, privacy-conscious model training.
This project focuses on extending a current Federated RLHF training implementation and integrating it in Scholé, a spinoff AI learning platform from the ML4ED lab designed for context-driven, job-relevant education. Building on the foundation of FL frameworks, the project will explore how iterative feedback loops can be integrated across diverse user groups to refine the LLM's capabilities. By leveraging user interactions from Scholé, the aim is to develop a system that learns collaboratively without sharing sensitive data, fostering trust and personalization in educational AI systems.
We will also evaluate Federated RLHF's impact on alignment, privacy preservation, and model performance using both qualitative and quantitative metrics.
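As a rough illustration of the federated idea described above, the sketch below shows sample-weighted parameter averaging in the style of FedAvg: each client trains locally and only sends parameters, never raw interaction data. This is an assumption chosen for illustration, not the lab's existing implementation; the function name `federated_average` and the flat-list parameter format are hypothetical.

```python
def federated_average(
    client_updates: list[tuple[list[float], int]],
) -> list[float]:
    """Average client parameters, weighted by each client's sample count.

    Each entry is (parameters, num_samples). Only parameters leave the
    client; the underlying user data stays local.
    """
    total_samples = sum(n for _, n in client_updates)
    if total_samples <= 0:
        raise ValueError("at least one client must contribute samples")

    dim = len(client_updates[0][0])
    averaged = [0.0] * dim
    for params, n in client_updates:
        if len(params) != dim:
            raise ValueError("all clients must send same-sized parameters")
        for i, value in enumerate(params):
            # Clients with more samples contribute proportionally more.
            averaged[i] += value * n / total_samples
    return averaged


if __name__ == "__main__":
    updates = [([1.0, 2.0], 1), ([3.0, 4.0], 3)]
    print(federated_average(updates))
```

In practice a framework such as Flower or FedML would handle communication and aggregation, and the parameters would be model tensors rather than plain lists.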
Requirements:
- Interest in: Federated Learning, Reinforcement Learning, Large Language Models, Privacy-preserving AI
- Proficiency in: Python; Machine Learning frameworks (e.g., PyTorch, TensorFlow); Hugging Face Transformers
- Bonus: Experience with Federated Learning frameworks (e.g., Flower, FedML); knowledge of RLHF
Further details
This semester project is aimed at one MSc student (semester project or thesis) with strong technical skills. The student will be supervised by Vinitra Swamy (PostDoc), Paola Mejia Domenzain (PostDoc), and Maxime Perrot (Engineer). This project is aligned with a spinoff initiative from the ML4ED Lab called Scholé AI.
Project assigned: Project 6: Usability Studies for Enhancing Scholé's User Experience
This project sits at the intersection of Human-Computer Interaction (HCI), educational technology, and digital humanities. Scholé, an AI learning platform, strives to improve its usability and accessibility to better engage learners and educators. By integrating insights from digital humanities and HCI, this project seeks to design user studies and implement enhancements that align with human-centric design principles.
The focus will be on conducting structured usability studies, analyzing platform pain points, and iterating on design modifications to foster inclusivity and user engagement. This project also emphasizes interdisciplinary methods by incorporating theories and practices from digital humanities, ensuring the design process reflects diverse learner needs.
We will evaluate usability improvements through mixed-methods research, combining qualitative user feedback with quantitative usability metrics, and propose actionable recommendations for platform enhancements.
Requirements
- Interest in: Human-Computer Interaction and Usability Testing
- Bonus: Experience in interaction design, front-end development, or conducting user studies
Further details
This semester project is tailored for a digital humanities student, learning science student, or HCI enthusiast. The student will be supervised by Paola Mejia (PostDoc), Vinitra Swamy (PostDoc), and Maxime Perrot (Engineer). This project is aligned with a spinoff initiative from the ML4ED Lab called Scholé AI.
Project 7: Empower TAs with Pedagogical Chatbot
This project aims to investigate various scenarios for implementing a hybrid human-AI approach based on state-of-the-art large language models and generative AI techniques, develop pilot conversational AI systems, integrate them into Ed Discussion, and evaluate them on real EPFL student interactions.
- An interest in Educational Technology, Natural Language Processing, Conversational AI
- Proficiency in Python; Hugging Face Transformers; Machine Learning
- Optional: Experience with fine-tuning LLMs
Level: Master
Supervision: Tanya Nazaretsky (Postdoc)
Project 8: Evaluating the Robustness of LLM-Based In-Context Guidance
As Large Language Models (LLMs) are increasingly deployed in interactive settings, a key question arises: can LLMs provide effective guidance without leaking task-critical information or solutions? This project investigates the robustness and alignment of LLM-generated instructional content in agent-based simulations, focusing on the teacher-student paradigm as a testbed for probing the limits of in-context learning and instruction-following behavior.
The primary goal is to design and implement a simulation framework in which two LLM agents, a teacher and a student, interact iteratively. The teacher agent is tasked with providing contextually helpful guidance without revealing the full solution.
This project will explore (1) the robustness of teacher models, (2) the effectiveness of in-context guidance in task progression without direct answer disclosure, (3) the design space of agentic LLM interactions, including role conditioning and memory management, and (4) the emergent behaviors of student agents under varying instruction-following constraints.
The resulting framework will serve as a platform to systematically evaluate alignment, leakage risks, and pedagogical robustness across model scales and prompting strategies.
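The teacher-student interaction described above could be prototyped along the following lines. This is only a sketch under stated assumptions: the agents are plain Python callables standing in for LLM calls, the names `run_episode` and `leaks_solution` are hypothetical, and leakage is detected by naive substring matching, which a real framework would replace with a more robust check.

```python
from typing import Callable

# Stand-in for an LLM call: takes a prompt string, returns a reply string.
Agent = Callable[[str], str]


def leaks_solution(hint: str, solution: str) -> bool:
    """Naive leakage check: does the hint contain the solution verbatim?"""
    return solution.strip().lower() in hint.lower()


def run_episode(teacher: Agent, student: Agent, task: str,
                solution: str, max_turns: int = 3) -> dict:
    """Alternate student attempts and teacher hints until solved or out of turns."""
    leaked = False
    hints_given = 0
    answer = student(f"Task: {task}")
    for _ in range(max_turns):
        if answer.strip() == solution.strip():
            break
        hint = teacher(f"Task: {task}\nStudent answer: {answer}")
        hints_given += 1
        leaked = leaked or leaks_solution(hint, solution)
        answer = student(f"Task: {task}\nHint: {hint}")
    return {
        "solved": answer.strip() == solution.strip(),
        "leaked": leaked,
        "hints_given": hints_given,
    }


if __name__ == "__main__":
    # Toy agents: the student only succeeds once it has received a hint.
    def toy_student(prompt: str) -> str:
        return "42" if "Hint:" in prompt else "41"

    def careful_teacher(prompt: str) -> str:
        return "Recheck the product of six and seven."

    print(run_episode(toy_student and careful_teacher, toy_student,
                      "What is 6 * 7?", "42"))
```

Swapping the toy callables for real model calls leaves the loop unchanged, which makes it straightforward to compare teacher models or prompting strategies on the same tasks.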
Requirements:
- Interest in: Large Language Models, Robustness & Alignment, Multi-agent Simulation, In-context Learning
- Proficiency in: Python; ML/NLP libraries
- Bonus: Research experience in NLP, agent-based architectures
Further details: This semester project is intended for one MSc student (semester project or thesis) with a technical background in NLP. The project will involve designing and building the framework from scratch. The student will be supervised by PhD student Marta Knežević.
Project assigned: Project 9: Analyzing student experimentation behaviors in inquiry-based science simulations
The goal of the project is to identify similarities and differences across the datasets in terms of the experimentation strategies applied and their influence on students' conceptual learning outcomes.
- Interest in: Learning Sciences, Data Science, Education
- Proficiency in: Python; Machine Learning (mainly, clustering)
Project assigned: Project 10: Video Lecture Content Analysis with Vision-Language Models for Learner Modeling
- Identification of a suitable taxonomy for the pedagogical description of educational video content
- Assessing methods to employ VLMs for the automated coding of video content
- Prediction of user interactions based on video content
- Proficiency in: Python, LLMs and NLP
- Interest in: Learning Sciences, Educational Data Mining
Level: Master