Swiss Data Science Center

This page lists the Swiss Data Science Center (SDSC) projects available to EPFL students. The SDSC is a joint venture between EPFL and ETH Zurich. Its mission is to accelerate the adoption of data science and machine learning techniques within academic disciplines of the ETH Domain, the Swiss academic community at large, and the industrial sector. In particular, it addresses the gap between those who create data, those who develop data analytics and systems, and those who could potentially extract value from the data. The center is composed of a large multi-disciplinary team of data and computer scientists, and experts in select domains, with offices in Lausanne and Zurich. Website: datascience.ch


These projects are closed for applications

Laboratory: Swiss Data Science Center

Type: Semester Project

Description:

Dynamical systems such as the climate are highly nonlinear. Although the observations are high-dimensional, most of the dynamics are captured by a small number of physically meaningful patterns.

In the first part of the project we will use unsupervised dimension reduction techniques for feature extraction, and compare linear techniques (e.g., Principal Component Analysis) with nonlinear kernel-based techniques (e.g., Laplacian Eigenmaps [1] and Diffusion Maps [2]). In the second part of the project we will forecast future values of the Laplacian eigenvectors using nonlinear regression techniques, such as Gaussian processes [3].
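To make the comparison between linear and kernel-based techniques concrete, here is a minimal NumPy sketch of PCA next to a dense Laplacian Eigenmaps embedding in the spirit of [1]. It is an illustration under stated assumptions, not the project's method: the function names, the Gaussian kernel width `sigma`, and the synthetic data (a noisy circle lifted to 20 dimensions) are all hypothetical, and the dense affinity matrix would not scale to full global grids.

```python
import numpy as np


def pca_embedding(X, n_components=2):
    """Project the rows of X onto the leading principal components."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the principal directions of the centered data.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T


def laplacian_eigenmaps(X, n_components=2, sigma=1.0):
    """Embed the rows of X using a dense Gaussian-kernel graph Laplacian."""
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))  # pairwise affinities
    np.fill_diagonal(W, 0.0)
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    # Symmetric normalized Laplacian: I - D^{-1/2} W D^{-1/2}.
    L = np.eye(len(X)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(L)  # eigenvalues in ascending order
    # Drop the trivial first eigenvector and map back to solutions of the
    # generalized problem (D - W) y = lambda * D y.
    embedding = eigvecs[:, 1:n_components + 1] * d_inv_sqrt[:, None]
    return eigvals[1:n_components + 1], embedding


# Synthetic stand-in for real observations (assumption, not project data).
rng = np.random.default_rng(0)
t = np.linspace(0.0, 2.0 * np.pi, 60, endpoint=False)
latent = np.column_stack([np.cos(t), np.sin(t)])
X = latent @ rng.normal(size=(2, 20)) + 0.01 * rng.normal(size=(60, 20))

pca_coords = pca_embedding(X)
eigvals, le_coords = laplacian_eigenmaps(X, sigma=2.0)
print(pca_coords.shape, le_coords.shape)
```

For real data, scikit-learn's `PCA` and `SpectralEmbedding` classes provide tested implementations of the same two ideas.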

The data are three-dimensional (latitude × longitude × time) real-world global observations of temperature (and possibly rainfall), available for over 100 years. For example, we are interested in extracting the climate change trend in temperature and predicting its future values.
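As a rough illustration of the forecasting step, the sketch below implements Gaussian process regression with the Cholesky-based formulation described in [3], using NumPy only. The squared-exponential kernel, its hyperparameters, and the synthetic series are assumptions chosen for the example; in the project, the inputs would be the extracted eigenvector time series.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=8.0, variance=1.0):
    """Squared-exponential covariance between two 1-D arrays of times."""
    diff = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (diff / length_scale) ** 2)


def gp_predict(t_train, y_train, t_test, noise=0.05, **kernel_args):
    """Return the posterior mean and standard deviation at t_test."""
    K = rbf_kernel(t_train, t_train, **kernel_args)
    K += noise ** 2 * np.eye(len(t_train))  # observation noise on the diagonal
    K_s = rbf_kernel(t_test, t_train, **kernel_args)
    K_ss = rbf_kernel(t_test, t_test, **kernel_args)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s @ alpha
    v = np.linalg.solve(L, K_s.T)
    var = np.clip(np.diag(K_ss - v.T @ v), 0.0, None)
    return mean, np.sqrt(var)


# Synthetic series standing in for one extracted component (assumption).
t_obs = np.arange(50, dtype=float)
y_obs = np.sin(t_obs / 8.0)
t_new = np.array([10.0, 55.0, 200.0])
mean, std = gp_predict(t_obs, y_obs, t_new)
print(mean, std)
```

Note that with a zero-mean prior and this kernel alone, predictions far from the data revert to zero with large uncertainty, so extrapolating a long-term trend would need a different kernel or mean function.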

Goals/benefits:

– Working with machine learning techniques and time series analysis

– Working with machine learning libraries in Python (pandas, scikit-learn)

– Working with real-world observations

– Advancing research on an interdisciplinary problem

– Possibility to publish a research paper

Prerequisites:

– Linear algebra

– Machine learning (intermediate skills)

– Python (intermediate skills)

– Interest in interdisciplinary applications

References:

[1] M. Belkin and P. Niyogi, “Laplacian Eigenmaps and Spectral Techniques for Embedding and Clustering”, NIPS, 2001

[2] R.R. Coifman and S. Lafon, “Diffusion Maps”, Applied and Computational Harmonic Analysis, 2006

[3] C.E. Rasmussen and C.K.I. Williams, “Gaussian Processes for Machine Learning”, MIT Press, 2006

Contact: Eniko Szekely [email protected]

Laboratory: Swiss Data Science Center

Type: Master Thesis Project

Description:

The goal of this project is to classify images into more than 1000 classes. To achieve high scalability, the idea is to generate short descriptors from images using a contrastive loss and to use k-nearest neighbours for the eventual classification.
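For readers unfamiliar with the term, the following is a minimal NumPy sketch of one common pairwise form of the contrastive loss. The exact variant, the margin value, and the label convention (1 for same-class pairs, 0 otherwise) are assumptions; the project description does not specify them. In practice the loss would be written in PyTorch or TensorFlow so that it can be differentiated.

```python
import numpy as np


def contrastive_loss(desc_a, desc_b, same_class, margin=1.0):
    """Mean pairwise contrastive loss over a batch of descriptor pairs.

    desc_a, desc_b: arrays of shape (n_pairs, descriptor_dim)
    same_class:     array of shape (n_pairs,), 1 if the pair shares a class
    """
    dist = np.linalg.norm(desc_a - desc_b, axis=1)
    # Same-class pairs are pulled together ...
    pull = same_class * dist ** 2
    # ... while different-class pairs are pushed at least `margin` apart.
    push = (1 - same_class) * np.maximum(0.0, margin - dist) ** 2
    return 0.5 * float(np.mean(pull + push))


a = np.array([[0.0, 0.0], [0.0, 0.0]])
b = np.array([[0.0, 0.0], [3.0, 4.0]])
labels = np.array([1, 0])
print(contrastive_loss(a, b, labels))
```

In this toy batch the loss is zero: the same-class pair coincides and the different-class pair is already farther apart than the margin.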

Goals/benefits:

It is hard to classify more than 1000 categories using CNNs because of memory and computation limitations. Overcoming this limitation with a simple approach is therefore an interesting goal. An added benefit of using descriptors is that new classes may be learned without needing any prior training (zero-shot learning).
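The point about adding classes without retraining can be sketched with a small k-nearest-neighbour index over descriptors. The `DescriptorIndex` class, the two-dimensional descriptors, and the class names are hypothetical; real descriptors would come from the trained network.

```python
from collections import Counter

import numpy as np


class DescriptorIndex:
    """Stores labeled descriptors and classifies queries by k-NN vote."""

    def __init__(self):
        self.descriptors = []
        self.labels = []

    def add(self, descriptor, label):
        # Adding a new class only requires storing its descriptors.
        self.descriptors.append(np.asarray(descriptor, dtype=float))
        self.labels.append(label)

    def classify(self, query, k=3):
        stored = np.stack(self.descriptors)
        dists = np.linalg.norm(stored - np.asarray(query, dtype=float), axis=1)
        nearest = np.argsort(dists)[:k]
        votes = Counter(self.labels[i] for i in nearest)
        return votes.most_common(1)[0][0]


index = DescriptorIndex()
for point in [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3)]:
    index.add(point, "cat")
for point in [(5.0, 5.0), (5.1, 4.8), (4.9, 5.2)]:
    index.add(point, "dog")

# A class unseen so far is added by storing descriptors, with no retraining.
index.add((10.0, 0.0), "fox")
index.add((10.2, 0.1), "fox")
print(index.classify((9.8, 0.1)))
```

A brute-force search like this is linear in the number of stored descriptors; at the scale the project targets, an approximate nearest-neighbour structure would likely be needed.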

Prerequisites:

– Coding in Python using PyTorch and/or TensorFlow

– Interest in solving practical problems

Contact: Dorina Thanou, [email protected] & Radhakrishna Achanta, [email protected]

Laboratory: Swiss Data Science Center

Type: Master Thesis Project

Description:

Given a full-body picture of a person, the goal of this project is to segment the person from the background and to locate the positions of the joints. These results will in turn be used to estimate the height and weight of people from available pictures. At present, such detections can be done using the well-known networks RCNN (for human contours) and OpenPose (for joints). This project aims to perform the two tasks using a single, simpler deep network.
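As a sketch of what the outputs of such a single network might look like, the snippet below decodes two hypothetical output heads: a per-pixel foreground probability map and one heatmap per joint. The output format, the threshold, and the function name `decode_outputs` are assumptions made for illustration; the project description does not fix an architecture.

```python
import numpy as np


def decode_outputs(mask_prob, joint_heatmaps, threshold=0.5):
    """Turn raw network outputs into a binary mask and joint coordinates.

    mask_prob:      array (height, width) of foreground probabilities
    joint_heatmaps: array (n_joints, height, width), one heatmap per joint
    """
    mask = mask_prob > threshold
    n_joints, height, width = joint_heatmaps.shape
    # Each joint is placed at the peak of its heatmap.
    flat_peaks = joint_heatmaps.reshape(n_joints, -1).argmax(axis=1)
    rows, cols = np.unravel_index(flat_peaks, (height, width))
    return mask, np.stack([rows, cols], axis=1)


# Toy outputs standing in for a real forward pass (assumption).
mask_prob = np.zeros((6, 6))
mask_prob[1:5, 2:4] = 0.9
heatmaps = np.zeros((2, 6, 6))
heatmaps[0, 1, 2] = 1.0  # e.g. a head joint
heatmaps[1, 4, 3] = 1.0  # e.g. a foot joint

mask, joints = decode_outputs(mask_prob, heatmaps)
print(mask.sum(), joints.tolist())
```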

Goals/benefits:

– Knowledge of deep learning for semantic segmentation and region proposal

– Creating a computationally efficient network to perform contour detection and joint localization of humans

Prerequisites:

– Coding in Python using PyTorch and/or TensorFlow

– Interest in solving practical problems

Contact: Dorina Thanou, [email protected] & Radhakrishna Achanta, [email protected]

Laboratory: Swiss Data Science Center

Type: Master Thesis Project

Description:

The goal of this project is to take arbitrary building images and “parse” the facades into doors, windows and walls. Facade parsing is useful for analysis of building structure by civil engineers.

The parsing will be done using two approaches (which can be combined): pixel-wise image segmentation and rectangular region-of-interest proposal.
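One simple way the two approaches could be combined is to label each proposed rectangle by the segmentation pixels it covers. The sketch below assumes a hypothetical integer label map (0 = wall, 1 = window, 2 = door) and boxes given as (top, left, bottom, right); neither convention comes from the project description.

```python
import numpy as np

WALL, WINDOW, DOOR = 0, 1, 2  # hypothetical class ids


def label_box(label_map, box):
    """Return the majority class inside a box and the fraction supporting it."""
    top, left, bottom, right = box
    region = label_map[top:bottom, left:right]
    if region.size == 0:
        raise ValueError("box does not cover any pixels")
    counts = np.bincount(region.ravel(), minlength=3)
    best = int(counts.argmax())
    return best, counts[best] / region.size


# Toy segmentation output: one window and one door on a wall.
label_map = np.zeros((8, 8), dtype=int)
label_map[1:4, 1:3] = WINDOW
label_map[5:8, 4:6] = DOOR

print(label_box(label_map, (1, 1, 4, 3)))
print(label_box(label_map, (0, 0, 4, 4)))
```

A tight box around the window is fully supported by the segmentation, while a loose box is dominated by wall pixels, which suggests the proposal should be refined or rejected.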

Goals/benefits:

– Knowledge of deep learning for semantic segmentation and region proposal

– Help civil engineers analyse building structure automatically

Prerequisites:

– Coding in Python using PyTorch and/or TensorFlow

– Interest in solving practical problems

Contact: Radhakrishna Achanta, [email protected]