Student Projects

If you are interested in doing a Master's thesis or semester project in data science, you can apply for one of the projects below. Your primary supervisors will be Francisco Pinto and Patrick Jermann, from the Center for Digital Education (CEDE).

List of Projects


Detect EPFL Activities related to Sustainability

Level: Master

Subject area(s): Discrete mathematics, Graph theory.

Description: The EPFL knowledge graph (graphsearch.epfl.ch) maps all courses and publications to a network of scientific concepts extracted from Wikipedia, which we call the concepts graph. The goal of this project is to use tools and algorithms from graph theory to detect and analyse EPFL’s teaching and research activities related to sustainability (as required by recent policies from the Swiss federal government).
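To give a flavour of the kind of graph reasoning involved, here is a minimal sketch (not the project's actual method) that collects every concept within a fixed number of hops of a seed concept using breadth-first search. The function name `concepts_within`, the hop threshold, and the tiny concept graph are all made up for illustration; the real concepts graph is far larger and would likely call for weighted or ranking-based approaches.

```python
from collections import deque


def concepts_within(graph, seed, max_hops):
    """Return {concept: hop distance} for concepts within max_hops of seed.

    `graph` is an adjacency mapping: concept -> iterable of neighbouring concepts.
    """
    distances = {seed: 0}
    queue = deque([seed])
    while queue:
        node = queue.popleft()
        # Do not expand nodes that already sit at the hop limit.
        if distances[node] == max_hops:
            continue
        for neighbour in graph.get(node, ()):
            if neighbour not in distances:
                distances[neighbour] = distances[node] + 1
                queue.append(neighbour)
    return distances


# Toy, hypothetical concept graph (undirected, stored as adjacency lists).
toy_graph = {
    "Sustainability": ["Renewable energy", "Climate change"],
    "Renewable energy": ["Sustainability", "Solar cell"],
    "Climate change": ["Sustainability", "Carbon cycle"],
    "Solar cell": ["Renewable energy", "Semiconductor"],
    "Carbon cycle": ["Climate change"],
    "Semiconductor": ["Solar cell", "Transistor"],
    "Transistor": ["Semiconductor"],
}

related = concepts_within(toy_graph, "Sustainability", max_hops=2)
for concept, hops in sorted(related.items(), key=lambda item: item[1]):
    print(f"{hops} hop(s): {concept}")
```

In this toy graph, "Semiconductor" lies three hops from the seed and is therefore excluded. A course or publication could then be flagged as sustainability-related if it maps to any of the retained concepts, although choosing the threshold well is part of the research question.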

Pre-requisites: Proficiency in Python; solid knowledge of discrete mathematics and/or graph theory.

Useful tools: Python notebooks, NetworkX, Apache GraphX.

Contact: [email protected]
Please attach the grade transcripts from both your Bachelor and Master studies.


Implement Text Analysis Algorithm on a Distributed Platform

Level: Master

Subject area(s): Data engineering, distributed graph processing.

Description: One of the main pillars in the construction of EPFL’s AI-powered graph engine (graphsearch.epfl.ch) is a multithreaded text analysis algorithm that relates textual content to Wikipedia pages (Wikipedia is the basis of EPFL’s database of scientific concepts). The goal of this project is to implement the algorithm on a distributed platform using parallel processing techniques and tools, such as MapReduce and distributed graph processing.

Pre-requisites: Proficiency in Python or Scala; experience with distributed computing tools such as Hadoop and Spark; experience with the MapReduce programming model.

Useful tools: Python, Scala, Apache Hadoop, Spark, Giraph, GraphX.

Contact: [email protected]
Please attach the grade transcripts from both your Bachelor and Master studies.


Predict Research Collaborations based on Historical Paper Co-Authorships

Level: Master

Subject area(s): Machine learning, graph theory.

Description: The EPFL knowledge graph (graphsearch.epfl.ch) contains a historical network of paper co-authorships dating back 15+ years. The goal of this project is to apply machine learning techniques to that historical data in order to predict future collaborations and recommend them to our doctoral students and postdocs. NOTE: We receive many applications for our machine learning projects; keep in mind that we’ll give priority to students who have a solid mathematical knowledge of graph theory.
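As a rough illustration of why graph theory matters here, the sketch below ranks pairs of authors who have not yet co-authored a paper by the Jaccard coefficient of their co-author sets. This is a classic non-learned link-prediction heuristic, shown only as the kind of baseline a machine learning model would be compared against; it is not the method used in the project. The function name `jaccard_scores` and the author network are hypothetical.

```python
from itertools import combinations


def jaccard_scores(coauthors):
    """Score every not-yet-connected author pair by Jaccard similarity.

    `coauthors` maps each author to the set of authors they have published with.
    Returns a list of (author_a, author_b, score), highest score first.
    """
    scores = []
    for a, b in combinations(sorted(coauthors), 2):
        if b in coauthors[a]:
            continue  # already collaborators, nothing to predict
        union = coauthors[a] | coauthors[b]
        if not union:
            continue  # both authors are isolated; score is undefined
        shared = coauthors[a] & coauthors[b]
        scores.append((a, b, len(shared) / len(union)))
    return sorted(scores, key=lambda item: item[2], reverse=True)


# Toy, hypothetical co-authorship network (symmetric by construction).
toy_network = {
    "A": {"B", "C"},
    "B": {"A", "C", "D"},
    "C": {"A", "B", "D"},
    "D": {"B", "C", "E"},
    "E": {"D"},
}

ranking = jaccard_scores(toy_network)
for a, b, score in ranking:
    print(f"{a} and {b}: {score:.2f}")
```

Here the pair A and D ranks first because they share two co-authors (B and C) out of three distinct ones. Scores like this are often used as input features for a learned model rather than as the final prediction.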

Pre-requisites: Proficiency in Python; experience with machine learning algorithms, including artificial neural networks (ANNs) and random forests (RFs); mathematical knowledge of graph theory. Bonus if you’ve taken course(s) on discrete mathematics.

Useful tools: Python notebooks, NetworkX, Apache GraphX.

Contact: [email protected]
Please attach the grade transcripts from both your Bachelor and Master studies.


Implement a 3D Graph Navigation App in JavaScript

Level: Bachelor or Master

Subject area(s): Programming, 3D graphics.

Description: The goal of this project is to implement a web app that fetches data from EPFL’s knowledge graph database (graphsearch.epfl.ch) and displays a 3D rendering of the graph on the frontend. The user will be able to navigate the graph in three dimensions using either a tablet or a VR headset (see, for example, the WikiGalaxy project).

Pre-requisites: Excellent programming skills, preferably with experience in JavaScript; basic understanding of graphs.

Useful tools: Three.js, Node.js, GraphQL, ArangoDB.

Contact: [email protected]
Please attach the grade transcripts from both your Bachelor and Master studies.


Build Graph of Skills from Job Offers

Level: Bachelor or Master

Subject area(s): Data mining, text processing, web development.

Description: This project aims to improve the data mining and text processing algorithms behind our skills navigation demo. There is a hacking component, where you’ll scrape PDF documents from a job offers portal, and a text processing component, where you’ll be able to play with natural language processing algorithms. The output is a graph that connects skills required by the job market to skills taught at EPFL.

Pre-requisites: Proficiency in Python; experience with text processing in Python; basic knowledge of REST APIs.

Useful tools: Python, Selenium, PhantomJS, SigmaJS.

Contact: [email protected]
Please attach the grade transcripts from both your Bachelor and Master studies.