Student Projects

  • We offer a wide variety of projects in the areas of Machine Learning, Optimization and NLP. The list below is not complete but serves as an overview.
  • Students who are interested in doing a project at the MLO lab are encouraged to have a look at our guidelines, where we describe what you can expect from us (and what we expect from you).

  • If you have not (yet) taken our courses, we might ask for your grade sheet to better assess your background in the topic area.
  • We are only able to supervise a limited number of projects each semester. Priority will be given to Master Thesis Projects (full time).

Available MSc, BSc and PhD Semester Projects

  • Distributed & Decentralized Machine Learning using PyTorch, TensorFlow and other frameworks
    (Practical project)
    The goal of this project is to build and evaluate robust, open-source, system-adaptive and correct implementations of distributed optimization schemes for machine learning, as part of our MLbench open source project. Required skills are basic experience with open source / Python, and either knowledge of writing efficient numerical code or a theoretical interest in optimization methods for DL. Related directions are extensions to decentralized and federated learning, or autoTrain. Contact: Martin Jaggi
  • Learning Multilingual Text Representations
    (Algorithm design and experiments)
    Building on top of sent2vec and other models (PyTorch code is also available), we investigate the quality and efficiency of multilingual representations of words, sentences and documents. Contact: Prakhar Gupta
  • Efficient Collaborative Deep Learning Training with non-iid data
    In current distributed SGD algorithms (centralized or decentralized), when the data is distributed between workers in a non-iid fashion (e.g. every worker has training examples from only one class), the generalization of the final solution is often very poor. This project aims to experimentally understand the reasons for this problem and possible solutions. Contact: Tao Lin and Anastasia Koloskova
  • Adaptive Importance Sampling to Speed Up SGD and Related ML Algorithms
    (Theory and/or practical project)
    Using more important training examples more often during training is a promising direction for speeding up SGD. As the importances of the samples change during training, finding good adaptive distributions is the main challenge. Importance sampling can be explored for coordinate descent (CD) and stochastic gradient descent (SGD) type algorithms; a minimal sketch follows this list. (Related papers: Safe Bounds, Adaptive Sampling, DL application.) Contact: Sebastian Stich
  • Computational Aspects of Learning Text Sequence Representations
    (Algorithm design and analysis), building on top of sent2vec (PyTorch code is also available). Contact: Prakhar Gupta
  • Distributed Methods in Machine Learning
    (Theory and implementation)
    Study the convergence of optimization algorithms (like CD, SGD, SVRG) in a distributed or decentralized setting. Contact: Sebastian Stich
    • Distributed SVRG: design of a new algorithm & convergence analysis. (Related paper: k-SVRG).
    • Decentralized Local SGD: convergence theory (Related paper: local SGD)
    • Decentralized SGD with Communication Compression: can we extend the scheme to asynchronous communication? (Related paper: Choco SGD)
    • Decentralized SGD on many nodes: how do decentralized methods scale on many (> 10k) nodes? (Related paper: Choco SGD)
    • Adaptive schemes for communication compression (Related papers: sparsified SGD, Normalized Gradients)
    • Basis Pursuit: explore new ideas for communication-efficient SGD
  • Acceleration and Momentum for Deep Learning
    (Theory)
    “Momentum” is indispensable for the fast convergence of SGD. In this project, we take a fresh look at different momentum schemes. Contact: Sebastian Stich
  • Robust Training Methods
    (Implementation and Theory)
    As adversarial examples and robust training methods become better understood (Robust training, Adversarial examples are features), efficient training methods become increasingly important. Contact: Sebastian Stich for more details
    • Training methods for adversarial robustness
    • More realistic adversarial examples
    • Improving Adversarial Robustness
  • Model-Parallel Training of Deep Learning Models
    • (Practical project) Please contact Tao Lin for more details.
    • (Theory project) Efficient training methods such as PipeDream rely on ‘asynchronous back-propagation’. Can we understand the effect of this asynchrony from a theoretical point of view? See also this paper. Contact: Sebastian Stich
  • Low Precision Arithmetic
    (mostly theory)
    Most hardware DL accelerators rely on limited-precision arithmetic. We start by compiling a survey of existing techniques (with corresponding error bounds) and will explore the possibilities of error feedback (as in sparsified SGD); a small sketch of error feedback for compressed SGD follows below. Contact: Sebastian Stich
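
As promised in the importance-sampling item above, here is a minimal sketch of adaptive importance sampling for SGD. This is our own illustrative code, not the method of any of the related papers: sampling probabilities follow running estimates of per-example gradient norms, and each gradient is reweighted so the update stays unbiased.

```python
import numpy as np

def importance_sgd(grad_fn, x0, n, lr=0.1, steps=1000, smoothing=1e-3):
    """SGD with adaptive importance sampling (illustrative sketch).

    grad_fn(x, i) returns the gradient of the i-th loss term at x.
    Sampling probabilities are kept proportional to running estimates
    of per-example gradient norms; each sampled gradient is scaled by
    1 / (n * p_i) so the step remains an unbiased estimate of the
    full-gradient step.
    """
    x = x0.copy()
    norms = np.ones(n)  # running per-example gradient-norm estimates
    for _ in range(steps):
        p = (norms + smoothing) / (norms + smoothing).sum()
        i = np.random.choice(n, p=p)
        g = grad_fn(x, i)
        norms[i] = np.linalg.norm(g)  # refresh the estimate for example i
        x -= lr * g / (n * p[i])      # importance-weighted, unbiased step
    return x
```

For a least-squares toy problem, grad_fn(x, i) would simply return a_i * (a_i @ x - b_i); the challenge studied in the project is maintaining good probabilities p without computing all gradients.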
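
For the communication-compression and low-precision directions, the error-feedback idea referenced above (sparsified SGD, Error Feedback Fixes SignSGD) fits in a few lines. The top-k compressor and function names below are illustrative assumptions, not a prescribed design:

```python
import numpy as np

def top_k(v, k):
    """Keep the k largest-magnitude entries of v, zero out the rest."""
    out = np.zeros_like(v)
    idx = np.argpartition(np.abs(v), -k)[-k:]
    out[idx] = v[idx]
    return out

def ef_compressed_step(x, grad, memory, lr=0.1, k=10):
    """One SGD step with top-k compression and error feedback.

    The compression error is accumulated in `memory` and added back
    before the next compression, so dropped coordinates are only
    delayed, never permanently lost.
    """
    corrected = lr * grad + memory  # add back the previous residual
    update = top_k(corrected, k)    # only k coordinates are transmitted
    memory = corrected - update     # remember what was dropped
    return x - update, memory
```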

Contact us for more details!

Interdisciplinary Projects

  • ML for Advanced Manufacturing

In the context of manufacturing nano-scale sensors consisting of individual carbon nanotubes, we use a machine learning approach to classify Raman spectroscopy data. Based on a large labelled dataset of 1-dimensional Raman spectroscopy measurements, we detect for each measurement position the presence of a healthy nanotube as well as its orientation. In addition to standard ML techniques, we also employ unsupervised signal representation learning based on CNNs in a semi-supervised setting.
Contact: Martin Jaggi

  • Dream Prediction from EEG data

The goal of this project is to apply machine learning to predict when (and what!) a sleeper is dreaming based on electroencephalogram (EEG) time series. We have access to a large dataset consisting of 500 Hz signals from 256 electrodes over the 2 minutes before awakening, labeled by whether the sleeper remembered a dream or not. We propose to use a convolutional neural network (CNN) to classify these EEG time series.
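
A minimal sketch of the kind of 1D CNN classifier we have in mind; the architecture, layer sizes and the 100 Hz downsampling below are illustrative assumptions, not a fixed design:

```python
import torch
import torch.nn as nn

class EEGDreamClassifier(nn.Module):
    """Illustrative 1D CNN over 256-channel EEG windows (dream vs. no dream)."""

    def __init__(self, n_channels=256, n_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(n_channels, 64, kernel_size=7, stride=2), nn.ReLU(),
            nn.Conv1d(64, 64, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv1d(64, 128, kernel_size=3, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),  # collapse the time axis
        )
        self.classifier = nn.Linear(128, n_classes)

    def forward(self, x):  # x: (batch, channels, time)
        h = self.features(x).squeeze(-1)
        return self.classifier(h)

# e.g. the 2-minute window downsampled to 100 Hz -> 12000 time steps
model = EEGDreamClassifier()
logits = model(torch.randn(4, 256, 12000))  # shape (4, 2)
```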

Contacts: Francesca Siclari, CHUV and Martin Jaggi, EPFL

  • Creating dynamic diagnostic algorithms for resource-limited settings in Africa

(in collaboration with Unisanté and the Swiss TPH)
The WHO has developed a set of algorithms to guide clinicians through structured consultations in resource-limited settings. However, these algorithms are static, generic, rule-based decision trees printed on paper that are derived from outdated data or extrapolated from non-representative populations.

iPOCT (intelligent point of care test) is an electronic version of these paper guidelines, which aims to better adapt decision trees to changing trends in local data in a way that improves health outcomes and reduces resource consumption. The static version of iPOCT (without data-driven algorithms) is currently fully functional and will be deployed in Tanzania and Rwanda in 2020 to guide over a million consultations.

You will work with the clinical and IT teams of a large interdisciplinary project to build a novel pipeline that allows clinicians to generate, explore and implement interpretable machine learning algorithms derived from the data collected with the iPOCT tool. A prototype of the platform is already functional. During this project, you will be closely supported and co-supervised by a medical doctor and guided by the requirements of the clinical and IT teams.

As part of this project, you will travel to Tanzania to participate in and advise an academic exchange program promoting machine learning in low-resource settings.

Contact: Mary-Anne Hartley

  • Detecting and visualising patterns in medical data to guide targeted emergency interventions and medical training

(in collaboration with humanitarian aid organisations)
Several large humanitarian organisations such as Terre des Hommes (TdH) and Médecins Sans Frontières (MSF) have developed mobile applications to guide clinicians through consultations in order to improve and standardise the clinical management of children affected by emergencies and disasters. So far, these tools have collected systematic medical data during hundreds of thousands of consultations. We propose to build a dedicated data analysis platform that will automate the early detection of anomalies and patterns of urgency that could better guide and scale their interventions, alert them to evolving trends and provide informative feedback for supervision and training in remote areas.

You will build a data analysis platform, exploring various metrics and approaches of semi-supervised and unsupervised anomaly detection. Specific attention will be needed to translate the findings into informative, interactive visualisations. During this project, you will be closely supported and co-supervised by a medical doctor and guided by the requirements of the participating humanitarian aid team.

Contact: Mary-Anne Hartley

  • Collaborative privacy: Empowering clinical research in Africa through secure, incentivised crowdsourcing 

(in collaboration with the Broad Institute, MIT)
Sub-Saharan Africa suffers over a quarter of the global burden of disease, but despite this concentration of critical health information, the region currently produces less than 1% of global medical publications. In many clinics, data collection is still paper-based and clinicians are reluctant to collaborate with other parties due to important privacy concerns for their patients and the ownership of their intellectual research property.

You will explore approaches in distributed and federated learning to build a platform able to crowdsource models from multiple parties in a way that incentivises fair collaboration and high-quality interoperable data collection. Your models will be integrated into a prototype mobile data-entry application (already developed).
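
As one possible starting point for this federated component, here is a minimal federated averaging (FedAvg-style) sketch. It is purely illustrative; the actual platform would add secure aggregation, incentive and data-quality mechanisms on top:

```python
import copy
import torch

def local_update(global_model, data_loader, epochs=1, lr=0.01):
    """Train a copy of the global model on one party's local data."""
    local = copy.deepcopy(global_model)
    opt = torch.optim.SGD(local.parameters(), lr=lr)
    loss_fn = torch.nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in data_loader:
            opt.zero_grad()
            loss_fn(local(x), y).backward()
            opt.step()
    return local.state_dict()

def federated_round(global_model, party_loaders):
    """One communication round: average the locally trained weights."""
    states = [local_update(global_model, dl) for dl in party_loaders]
    avg = {key: torch.stack([s[key].float() for s in states]).mean(dim=0)
           for key in states[0]}
    global_model.load_state_dict(avg)
    return global_model
```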

You will be supported by an interdisciplinary team from EPFL (clinical and ML) and the Broad Institute (computational science and IT).

Contact: Mary-Anne Hartley

Completed Thesis Projects

Master Theses (internal at EPFL):

Ahmad Ajalloeian, 2019, Stochastic Zeroth-Order Optimisation Algorithms with Variance Reduction
Akhilesh Gotmare, 2019, Layerwise Model Parallel Training of Deep Neural Networks
Andreas Hug, 2018, Unsupervised Learning of Embeddings for Detecting Lexical Entailment
Jean-Baptiste Cordonnier, 2018, Convex Optimization using Sparsified Stochastic Gradient Descent with Memory
Lie He, 2018, COLA: Communication-Efficient Decentralized Linear Learning
Wang Zhengchao, 2017, Network Optimization for Smart Grids
Marina Mannari, 2017, Faster Coordinate Descent for Machine Learning through Limited Precision Operations
Matilde Gargiani, 2017, Hessian-CoCoA: a general parallel and distributed framework for non-strongly convex regularizers

Internships:
Polina Kirichenko, 2018, Zero-order Optimization for Deep Learning
R S Nikhil Krishna, 2018, Importance Sampling and LSH
Prashant Rangarajan, 2018, Multilingual matrix factorizations
Jeenu Grover, 2018, Learning 2 Learn
Anastasia Koloskova, 2017, Coordinate Descent using LSH
Vasu Sharma, 2017, CNNs for Unsupervised Text Representation Learning
Pooja Kulkarni, 2017, Variable metric Coordinate Descent
Tao Lin, 2017, Adversarial Training for Text
Tina Fang, 2017, Generating Steganographic Text with LSTMs
Valentin Thomas, 2017, Model-parallel back-propagation
Anahita Talebi Amiri, 2017, Lasso – Distributed and Pair-Wise Features

Semester Projects:
Jan Benzing, 2019, A Machine-Learning approach for imputation of missing values in a biomedical dataset of febrile patients in Tanzania
Hongyu Luo, 2019, Dream Detection from EEG Time Series
Nicola Ischia, 2019, Dream Detection from EEG Time Series
Sidakpal Singh, 2019, Structure-aware model averaging with Optimal Transport
Brock Grassy, 2019, Nano Manufacturing with ML and Raman Spectroscopy
Atul Kumar Sinha, 2019, Unsupervised Sentence Embeddings Using Transformers
Hajili Mammad, 2019, Unsupervised Sentence Embeddings Using Transformers
Claire Capelo, 2019, Adaptive schemes for communication compression: Trajectory Normalized Gradients for Distributed Optimization
Aakash Sunil Lahoti, 2019, Theoretical Analysis of Minimum of Sum of Functions
Jelena Banjac, 2019, Software Tools for Handling Magnetically Collected Ultra-thin Sections for Microscopy
Nikita Filippov, 2019, Differentially Private Decentralized Batch SGD Under Varied Conditions
Devavrat Tomar, 2019, Neural Voice Conversion
Lingjing Kong, 2019, Adaptive Methods for Large Batch Training
Peilin Kang, 2019, Implementation of Model Parallelism Training
Nicolas Lesimple, 2019, Automated Machine Learning
Ezgi Yuceturk, 2018, Dream Detection from EEG Time Series
Delisle Maxime, 2018, Twitter Demographics
Bojana Rankovic, 2018, Handwritten Text Recognition on Student Essays
Cem Musluoglu, 2018, Quantization and Compression for Distributed Optimization
Quentin Rebjock, 2018, Error Feedback Fixes SignSGD and other Gradient Compression Schemes
Jimi Vaubien, 2018, Derivative-Free Empirical Risk Minimization
Ali Hosseiny, 2018, Human Query – From natural language to SQL
Servan Grüninger, 2018, Location prediction from tweets
Marie-Jeanne Lagarde, 2018, Steganographic LSTM
Sidakpal Singh, 2018, Context Mover’s Distance & Barycenters: Optimal transport of contexts for building representations
Arthur Deschamps, 2018, Simulating Asynchronous SGD + numerical results
Chia-An Yu, 2018, Feedback & quantization in SGD
Kshitij Kumar Patel, 2018, Communication trade-offs for synchronized distributed SGD with large step size
Martin Josifoski, 2018, Cross-lingual word embeddings
William Borgeaud, 2017, Adaptive Sampling in Stochastic Coordinate Descent
Castellón Arevalo Joel, 2017, Complexity analysis for AdaRBFGS: a primitive for methods between first and second order
Alberto Chiappa, 2017, Asynchronous updates for Stochastic Gradient Descent
Ahmed Kulovic, 2017, Mortality Prediction from Twitter
Arno Schneuwly, 2017, Correlating Twitter Language with Community-Level Health Outcomes
Sina Fakheri, 2017, A Machine-learning Mobile App to support prognosis of Ebola Virus Diseases in an evolving environment
Remy Sun, 2017, A Convolutional Dictionary Method to Acquire Sentence Embeddings
Hakan Gökcesu, 2016, Distributed SGD with Fault Tolerance
He Lie, 2017, Distributed TensorFlow implementation of sparse CoCoA
Oberle Jeremia, 2016, A Machine-Learning Prediction Tool for the Triage and Clinical Management of Ebola Virus disease
Akhilesh Gotmare, 2016, ADMM for Model-Parallel Training of Neural Networks

Francesco Locatello: Greedy Optimization and Applications to Structured Tensor Factorizations,
Master thesis, ETH, September 2016

Dmytro Perekrestenko: Faster Optimization through Adaptive Importance Sampling,
Master thesis (jointly supervised with Volkan Cevher), EPFL, August 2016

Elias Sprengel: Audio Based Bird Species Identification using Deep Learning Techniques,
Master thesis (jointly supervised with Yannic Kilcher), ETH, August 2016

Jonathan Rosenthal: Deep Learning for Go,
Bachelor thesis (jointly supervised with Yannic Kilcher and Thomas Hofmann), ETH, June 2016

Maurice Gonzenbach: Sentiment Classification and Medical Health Record Analysis using Convolutional Neural Networks,
Master thesis (jointly supervised with Valeria De Luca), ETH, May 2016

Jan Deriu: Sentiment Analysis using Deep Convolutional Neural Networks with Distant Supervision,
Master thesis (jointly supervised with Aurelien Lucchi), ETH, April 2016

Pascal Kaiser: Learning city structures from online maps,
Master thesis (jointly supervised with Aurelien Lucchi and Jan Dirk Wegner), ETH, March 2016

Adrian Kündig: Prediction of Cerebral Autoregulation in Intensive Care Patients,
Master thesis (jointly supervised with Valeria De Luca), ETH, January 2016

Bettina Messmer: Automatic Analysis of Large Text Corpora,
Master thesis (jointly supervised with Aurelien Lucchi), ETH, January 2016

Tribhuvanesh Orekondy: HADES: Hierarchical Approximate Decoding for Structured Prediction,
Master thesis (jointly supervised with Aurelien Lucchi), ETH, September 2015

Jakob Olbrich: Screening Rules for Convex Problems,
Master thesis (jointly supervised with Bernd Gärtner), ETH, September 2015

Sandro Felicioni: Latent Multi-Cause Model for User Profile Inference,
Master thesis (jointly supervised with Thomas Hofmann, and 1plusX), ETH, September 2015

Ruben Wolff: Distributed Structured Prediction for 3D Image Segmentation,
Master thesis (jointly supervised with Aurelien Lucchi), ETH, September 2015

Simone Forte: Distributed Optimization for Non-Strongly Convex Regularizers,
Master thesis (jointly supervised with Matthias Seeger, Amazon Berlin, and Virginia Smith, UC Berkeley), ETH, September 2015

Xiaoran Chen: Classification of stroke types with SNP and phenotype datasets,
Semester project (jointly supervised with Roqueiro Damian and Xiao He), ETH, June 2015

Yannic Kilcher: Towards efficient second-order optimization for big data,
Master thesis (jointly supervised with Aurelien Lucchi and Brian McWilliams), ETH, May 2015

Matthias Hüser: Forecasting intracranial hypertension using time series and waveform features,
Master thesis (jointly supervised with Valeria De Luca), ETH, April 2015

Lei Zhong: Adaptive Probabilities in Stochastic Optimization Algorithms,
Master thesis, ETH, April 2015

Maurice Gonzenbach: Prediction of Epileptic Seizures using EEG Data,
Semester project (jointly supervised with Valeria De Luca), ETH, Feb 2015

Julia Wysling: Screening Rules for the Support Vector Machine and the Minimum Enclosing Ball,
Bachelor thesis (jointly supervised with Bernd Gärtner), ETH, Feb 2015

Tribhuvanesh Orekondy: dissolvestruct – A distributed implementation of Structured SVMs using Spark,
Semester project, ETH, August 2014

Michel Verlinden: Sublinear time algorithms for Support Vector Machines,
Semester project, ETH, July 2011

Clément Maria: An Exponential Lower Bound on the Complexity of Regularization Paths,
Internship project (jointly supervised with Bernd Gärtner), ETH, August 2010

Dave Meyer: Implementierung von geometrischen Algorithmen für Support-Vektor-Maschinen (Implementation of Geometric Algorithms for Support Vector Machines),
Diploma thesis, ETH, August 2009

Gabriel Katz: Tropical Convexity, Halfspace Arrangements and Optimization,
Master thesis (jointly supervised with Uli Wagner), ETH, September 2008