Semester Projects and Master Theses

Check this page regularly; we add new projects all the time.

Any of the projects listed below can be taken as either a semester project or a master thesis; the depth and workload can be adjusted by discussion. If nothing on the list interests you, we always encourage students to bring us their own ideas.

Projects (Master/Bachelor)

Speech recognition algorithms have reached ever higher accuracies over the last decades. However, because their performance depends largely on the audio signal, it drops markedly when only low-quality audio is available. This undermines most autonomous activities that rely on verbal communication. In noisy environments, such as buildings close to main avenues or classrooms during breaks, autonomous verbal interaction with robots is therefore drastically affected. Recent studies have shown that complementing the audio with visual analysis of speech through lip reading is a promising way to overcome this issue. In this project, Audio-Visual Speech Recognition (AVSR) algorithms will be improved and validated by users interacting verbally with social robots in noisy environments. The algorithm uses Convolutional Neural Networks (CNNs) to visually recognize visemes and combines them with the phonemes recognized from audio. ROS nodes distribute the processing so that the high-cost computations run on a powerful server, keeping the system real-time from the user's perspective. In the experiments of this thesis, the robot will interact with users in noisy environments to test both the accuracy of the algorithm and the users' acceptance of the system's performance.
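
As a flavor of the fusion step, here is a minimal sketch of SNR-weighted late fusion of audio and visual posteriors; the weighting scheme and all names are illustrative assumptions, not the project's actual algorithm:

```python
import numpy as np

def fuse_posteriors(p_audio, p_visual, snr_db, snr_low=0.0, snr_high=20.0):
    """Late fusion of phoneme posteriors (audio) and viseme-derived
    posteriors (video). The visual stream is weighted more heavily
    as the estimated SNR drops. All names here are illustrative."""
    # Map SNR to an audio weight in [0, 1]: clean audio -> trust audio more.
    alpha = np.clip((snr_db - snr_low) / (snr_high - snr_low), 0.0, 1.0)
    # Log-linear (product-of-experts) combination of the two streams.
    log_p = alpha * np.log(p_audio + 1e-9) + (1 - alpha) * np.log(p_visual + 1e-9)
    p = np.exp(log_p - log_p.max())
    return p / p.sum()

# Example: toy 3-class posteriors at 5 dB SNR.
p_a = np.array([0.5, 0.3, 0.2])   # from the acoustic model
p_v = np.array([0.2, 0.7, 0.1])   # from the CNN viseme classifier
print(fuse_posteriors(p_a, p_v, snr_db=5.0))
```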

Keywords: ROS, Machine Learning, Deep Learning, OpenCV, Distributed Systems.

Contact: [email protected]

Jupyter notebooks have become an essential tool in data science, scientific computing, and machine learning, both in industry and academia. Cloud-based services like Google Colab, Noto, and JupyterHub bring the power of Jupyter notebooks into the cloud and make it easier to share and collaborate. At EPFL and other universities, these cloud-based Jupyter notebooks are used as interactive textbooks, platforms for distributing and grading homework, and as simulation environments.

These notebooks produce rich logs of interaction data, but there is currently no easy way for teachers and students to view and make sense of this data. This data could provide a valuable source of feedback that both teachers and students could use to improve their teaching and learning. This way of using data is called learning analytics, and we have recently begun designing a software extension that will bring the power of learning analytics directly into cloud-based Jupyter notebooks.
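
As an illustration of what such interaction logs afford, the sketch below aggregates hypothetical cell-execution events with pandas; the file name and event schema are assumptions for illustration, not the actual log format of any of the platforms above:

```python
import pandas as pd

# Hypothetical JSON-lines export of notebook interaction events, e.g.
# {"user": "s1", "cell_id": "c3", "event": "execute", "timestamp": "..."}
events = pd.read_json("notebook_events.jsonl", lines=True)
events["timestamp"] = pd.to_datetime(events["timestamp"])

# Executions per cell per student: a first look at where learners spend effort.
activity = (events[events["event"] == "execute"]
            .groupby(["user", "cell_id"])
            .size()
            .unstack(fill_value=0))
print(activity)
```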

We are looking for students with any of the following interests to join the development of this learning analytics tool: data visualization, full-stack web development, UX research, learning analytics, education.

Contact: [email protected] or [email protected]

The goal of this project is to apply recent advances in deep generative modeling (e.g., Stable Diffusion, GigaGAN) to the development of creativity support tools. Previously, our lab implemented a creativity support tool by customizing and training a model based on the StyleGAN2-ada architecture, then building a web-based interface to the model. This work was published in NeurIPS [1] and CHI [2], and a video demo of the existing interface can be seen at https://youtu.be/dcC7G2zBuL8.

This project will expand the expressiveness and power of our tool by incorporating more powerful models. Additionally, this project will involve iterating on the user interface, exploring new ways of visualizing and navigating the vast space of possible designs. Ideally, we will also test this interface with stakeholders to better understand its strengths and weaknesses and to collect evidence of its effectiveness.
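
To give a concrete sense of design-space exploration with a modern generative model, here is a minimal sketch that walks the latent space of Stable Diffusion using the Hugging Face diffusers library; the checkpoint, prompt, and plain linear interpolation (slerp is the usual refinement) are illustrative choices only:

```python
import torch
from diffusers import StableDiffusionPipeline

# Assumes diffusers and a Stable Diffusion checkpoint are available.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16).to("cuda")

shape = (1, pipe.unet.config.in_channels, 64, 64)  # latents for 512x512 output
z0 = torch.randn(shape, device="cuda", dtype=torch.float16)
z1 = torch.randn(shape, device="cuda", dtype=torch.float16)

# Walk the latent space between two seeds: each step is a design variation.
for i in range(5):
    t = i / 4.0
    z = (1 - t) * z0 + t * z1
    image = pipe("a fashion sketch of a jacket", latents=z).images[0]
    image.save(f"design_{i}.png")
```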

Students interested in deep generative models for image synthesis, GANs, diffusion models, HCI, full-stack web development, UX research, and human-centered artificial intelligence should contact [email protected].

[1] Jiang, W., Davis, R. L., Kim, K. G., & Dillenbourg, P. (2022). GANs for All: Supporting Fun and Intuitive Exploration of GAN Latent Spaces. NeurIPS 2021 Competitions and Demonstrations Track, 292–296.
[2] Davis, R. L., Wambsganss, T., Jiang, W., Kim, K. G., Käser, T., & Dillenbourg, P. (2023). Fashioning the Future: Unlocking the Creative Potential of Deep Generative Models for Design Space Exploration. Extended Abstracts of the 2023 CHI Conference on Human Factors in Computing Systems, 1–9.

This project aims to leverage LLMs to improve feedback generation on student work in the fablab. Feedback is crucial for effective learning, but providing insightful and timely feedback is challenging. Using the LangChain framework, we plan to develop sophisticated prompting systems that iteratively assess and enhance feedback based on learning sciences principles. The process will also explore recent developments in LLMs, including chain-of-thought approaches that generate intermediate reasoning steps to enrich feedback. We will also explore how to keep the human in the loop, so that teacher input maintains the authenticity and reliability of the feedback. The ultimate goal is to create a versatile platform for feedback generation across various academic domains.
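
As a rough illustration (not the planned system), a draft-then-revise prompting chain could look like the sketch below, written against the classic pre-1.0 LangChain API; the prompts and rubric are placeholders:

```python
from langchain.chat_models import ChatOpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

# Illustrative two-step chain: draft feedback, then critique-and-revise it
# against a rubric, a simple stand-in for chain-of-thought style refinement.
llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0.3)

draft = LLMChain(llm=llm, prompt=PromptTemplate.from_template(
    "Give formative feedback on this fablab project report:\n{report}"))
revise = LLMChain(llm=llm, prompt=PromptTemplate.from_template(
    "Reasoning step by step, check this feedback against the rubric "
    "'specific, actionable, kind' and rewrite it:\n{feedback}"))

feedback = draft.run(report="A laser-cut enclosure with a loose press fit...")
print(revise.run(feedback=feedback))
```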

Contact: Bertrand Schneider


Learning how to grip a pen properly is a fundamental part of handwriting training for children, and it requires constant monitoring of their pen grip posture and timely intervention from teachers. Various sensing technologies have been explored to automate pen grip posture estimation, such as camera-based systems or EMG armbands. In the context of digital writing, i.e., writing on tablets, these solutions lack portability because of the additional sensors they require. In this project, we aim to tackle this challenge by combining a reflector with the front camera of a tablet for on-device pen grip posture prediction. Since the pen tip location and orientation reported by the tablet are strongly coupled with the hand pose, we postulate that incorporating them can further improve the performance of pen grip posture prediction.
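
A minimal sketch of the kind of two-branch model this suggests is shown below, assuming PyTorch; the architecture, input sizes, and the four-dimensional pen pose (x, y, azimuth, altitude) are illustrative assumptions, not the project's final design:

```python
import torch
import torch.nn as nn

class GripPostureNet(nn.Module):
    """Illustrative two-branch model: image features from the mirrored
    front-camera view, fused with the pen tip pose reported by the
    tablet. The architecture is a sketch only."""
    def __init__(self, n_classes=4):
        super().__init__()
        self.cnn = nn.Sequential(                     # image branch
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.pen = nn.Sequential(nn.Linear(4, 16), nn.ReLU())  # pen pose branch
        self.head = nn.Linear(32 + 16, n_classes)     # fused classifier

    def forward(self, image, pen_pose):
        return self.head(torch.cat([self.cnn(image), self.pen(pen_pose)], dim=1))

model = GripPostureNet()
logits = model(torch.randn(2, 3, 96, 96), torch.randn(2, 4))
print(logits.shape)  # torch.Size([2, 4])
```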

To this end, in this research project, we will work with an iPad and a customized reflector, build an iOS application, and develop new ML algorithms for grip posture prediction. We will have weekly meetings to address questions, discuss progress, and think about future ideas.

We are looking for students with any of the following interests: Machine Learning, Mobile Computing, and Human-Computer Interaction. Relevant IT skills include Python and the basics of Swift. Experience with iOS development can be beneficial. If you are interested, do not hesitate to contact me.

Contact: [email protected]

Various human-robot interaction systems have been designed to support children's learning of language and literacy. They commonly foster learning by using social robots to engage learners in designed educational activities, which rely heavily on the human-robot social relationship. The rapid recent development of Large Language Models (LLMs) shows the potential to empower social robots with language understanding and generation, so as to elicit and maintain a social and emotional bond between humans and robots. In this project, we will build the QT Cowriter, an intelligent conversational handwriting companion robot powered by ChatGPT that can talk with children like a friend and hold a pen to write on tablets.

To this end, in this research project, we will work with QTrobot and Wacom tablets. We will explore different LLMs and OpenAI APIs. We will develop state-of-the-art algorithms and implement innovative human-robot interaction systems. We will have weekly meetings to address questions, discuss progress, and think about future ideas.
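
As a flavor of the conversational core, the sketch below wires one chat turn through the 2023-era openai-python API; the system prompt is illustrative, and the speech-to-text and robot speech interfaces are left as placeholders since they are project-specific:

```python
import openai  # assumes the pre-1.0 openai-python API and an API key in the env

history = [{"role": "system",
            "content": "You are QT, a friendly handwriting companion for children."}]

def reply(child_utterance):
    """One conversational turn. Speech-to-text and the robot's text-to-speech
    are left out here; in the project they would be bridged via ROS."""
    history.append({"role": "user", "content": child_utterance})
    resp = openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=history)
    answer = resp.choices[0].message["content"]
    history.append({"role": "assistant", "content": answer})
    return answer  # would be sent to the robot's speech interface

print(reply("Can you show me how to write the letter Q?"))
```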

We are looking for students with any of the following interests: HRI/HCI, Machine Learning, and Large Language Models. Relevant IT skills include Python and ROS basics. If you are interested, do not hesitate to contact me.

Contact: [email protected]

Makerspaces are collaborative workspaces that provide students with hands-on learning opportunities to explore various concepts through prototyping and making. By utilizing 3D pose data collected 24/7 through sensors and cameras during a semester-long course, this project seeks to develop insights into how students interact with the makerspace environment, their tools, and their projects. The objective is to identify meaningful patterns, correlations, and insights regarding student behavior, engagement levels, and learning progress within the makerspace. The project will also involve visualizing the analyzed data in an informative and user-friendly manner for further analysis and interpretation. This project is an opportunity to contribute to the emerging field of multimodal learning analytics and to explore innovative ways to enhance student learning in makerspaces. Ultimately, the outcomes of this project will produce recommendations for educational institutions and makerspaces seeking to optimize learning and engagement among students.
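
One plausible first analysis step, sketched below under an assumed data layout (one row per frame of flattened 3D joint coordinates), is to cluster pose frames into coarse activity prototypes with scikit-learn; file name, joint count, and cluster count are all illustrative:

```python
import numpy as np
from sklearn.cluster import KMeans

# Assumed layout: one row per frame, columns are flattened 3D joint
# coordinates (e.g., 17 joints x 3 = 51 values).
poses = np.load("makerspace_poses.npy")          # shape: (n_frames, 51)
poses = (poses - poses.mean(axis=0)) / poses.std(axis=0)

# Cluster frames into coarse activity prototypes (standing at a machine,
# leaning over a table, ...) as a first step toward behavioral patterns.
labels = KMeans(n_clusters=6, n_init=10, random_state=0).fit_predict(poses)
occupancy = np.bincount(labels) / len(labels)
print("fraction of time per pose cluster:", occupancy.round(2))
```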

Contact: Bertrand Schneider

The COVID-19 pandemic has highlighted the limitations of video conferencing tools like Zoom when it comes to facilitating effective online learning and teaching, especially for hands-on, project-based approaches. This project aims to address these limitations by exploring the potential of new devices, such as augmented and virtual reality (AR/VR) headsets, to promote more immersive and interactive exchanges in online spaces. By combining physical and virtual representations, we envision creating learning environments that bridge the gap between face-to-face and remote education. A proof-of-concept video (https://youtu.be/nfCsT74ixdE) showcases the preliminary work done in this area and demonstrates the potential of AR/VR technology for enhancing online learning experiences. The project will utilize Unity and C# to create mixed reality experiences. This project has the potential to improve online education by offering new opportunities for hands-on learning, regardless of students' location.

Contact: Bertrand Schneider

This project aims to create a chat-based agent for analyzing and interpreting data. Initial prototypes will create ReAct agents to answer questions about data (using LangChain and the OpenAI API). For example, rather than writing the code to run a regression in R or Python, a user could simply ask “What is the relationship between variable y and X?” The agent will be responsible for deciding which type of analysis to run, running it, and reporting the results back to the user.

Part of this project will involve creating a set of “tools” that the ReAct agent can use to perform data analysis. This is similar to creating a well-defined API for different operations that the ReAct agent can access. More information about the creation of custom tools can be found at https://python.langchain.com/en/latest/modules/agents/tools/custom_tools.html.
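
For illustration, a custom tool and ReAct agent might be wired up as in the sketch below (classic pre-1.0 LangChain API); the dataset, column names, and the single correlation tool are placeholders:

```python
import pandas as pd
from langchain.agents import initialize_agent, AgentType
from langchain.chat_models import ChatOpenAI
from langchain.tools import tool

df = pd.read_csv("study_data.csv")  # placeholder dataset

@tool
def correlation(columns: str) -> str:
    """Compute the Pearson correlation between two comma-separated columns."""
    x, y = [c.strip() for c in columns.split(",")]
    return f"r = {df[x].corr(df[y]):.3f}"

# A ReAct agent that can decide when to call the correlation tool.
agent = initialize_agent(
    tools=[correlation],
    llm=ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0),
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True)

agent.run("What is the relationship between time_on_task and quiz_score?")
```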

As the project matures, there will be a need to begin working with a local large language model such as vicuna-13b for data privacy reasons. Part of the project may involve implementing and potentially fine-tuning such a model, and then comparing its performance to OpenAI models such as gpt-3.5-turbo.

Students interested in applications of large language models, data science, and human-centered artificial intelligence should contact [email protected].

Educational activities in Human-Robot Interaction (HRI) are commonly designed and evaluated without taking teachers into consideration. This is an important missing piece, since teachers know their students best and are the ones who most need to know how their students perform in such activities. This gap persists largely because the tools for programming such robots are opaque to non-expert programmers. This project aims to develop Graphical User Interface features that let teachers design and evaluate such activities in an intuitive way. The interface will connect to existing algorithms that control the robot and automate the execution of these interactions. After the activities have run, the interface will present the collected data to teachers through easily readable visualizations. The system should also be able to generate reports of the interactions autonomously. The GUI is expected to be validated with teachers through hands-on use and interviews.

Keywords: Python, Interface Design, React, UX.

Contact: [email protected]

Dyslexia is a specific and long-lasting learning disorder characterised by reading performances well below those expected for a certain age. Eye-tracking technology has been successfully used to study the reading behaviour of children. This semester project aims to explore its application in more ecologically valid scenarios and expand its usage. We have started to develop an iOS application that uses the eye-tracking abilities of ARKit, Apple’s augmented reality framework, to study children’s gaze behaviour while they interact with the iPad. The goals of this semester project are to: 1) improve the eye tracking; 2) conduct experiments comparing the results of the ARKit application with eye-tracking glasses, as well as across different iPad models; and 3) develop educational games that leverage eye-tracking technology for an engaging learning experience.
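
For goal 2, a comparison analysis could look like the following Python sketch; the CSV layout, the millimetre screen coordinates, and the 350 mm viewing distance are all assumptions for illustration:

```python
import numpy as np
import pandas as pd

# Assumed: both systems resampled to a common timeline, gaze points in
# screen millimetres, with the eye-tracking glasses taken as reference.
arkit = pd.read_csv("arkit_gaze.csv")      # columns: t, x_mm, y_mm
glasses = pd.read_csv("glasses_gaze.csv")  # same layout, reference device

err = np.hypot(arkit.x_mm - glasses.x_mm, arkit.y_mm - glasses.y_mm)
# Convert on-screen offset to visual angle at an assumed 350 mm viewing distance.
err_deg = np.degrees(np.arctan2(err, 350.0))
print(f"median error: {err.median():.1f} mm ({np.median(err_deg):.2f} deg)")
```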

Since the application runs on the iPad, you will learn to program in Swift. We seek students interested in iOS development, experiment design, and game design to join this project.

Contact: [email protected]

Dyslexia is a specific and long-lasting learning disorder characterised by reading performances well below those expected for a certain age. Given the importance of reading throughout a person’s life, it is not surprising that dyslexia has been extensively studied. Yet there is no consensus on a theory explaining its origin. A promising recent research trend focuses on abnormalities of the “internal clock” used to sample information as one of the main underlying deficits. We are developing several digital activities to explore that view.

In order to test that hypothesis, we need to measure children’s performance in those activities against some baseline cognitive skills. Thus, this project aims to implement several standard cognitive performance tests in a playful and engaging way. Since the application runs on the iPad, you will learn to program in Swift. We seek students interested in iOS development and game design to join this project.

Contact: [email protected]