Swiss Data Science Center

Projects – Spring 2021
It may be possible to convert a thesis project into a semester project, or to extend a semester project so that it is suitable for a thesis. If any of the present or past projects interests you, please feel free to contact us. We always look forward to meeting motivated and talented students who want to work on exciting projects.
Laboratory:
Swiss Data Science Center
Type:
Master Project
Description:
Variational autoencoders [1,2] are unsupervised deep learning techniques that learn low-dimensional latent representations of the input data. Previous work has shown that these low-dimensional latent representations capture the most relevant features in the data, which can be used directly for physical understanding or as input to other machine learning algorithms, such as clustering, forecasting, or extreme event detection. Here, we will focus mainly on understanding the latent representations of physical systems, such as the Lorenz attractor or data representing climate systems. The goal is to disentangle the representations so that, ideally, each latent feature captures one driver of the dynamics.
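As a rough illustration, the sketch below shows a minimal variational autoencoder in PyTorch with a β-weighted KL term, one common way to encourage disentangled latent factors; the architecture, dimensions (3-D states, 2-D latent space), and β value are assumptions made for illustration, not choices fixed by the project.

```python
# Minimal beta-VAE sketch in PyTorch (hypothetical setup: 3-D system states,
# 2-D latent space); names and dimensions are illustrative only.
import torch
import torch.nn as nn


class BetaVAE(nn.Module):
    def __init__(self, input_dim=3, latent_dim=2, hidden_dim=64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, 2 * latent_dim),  # outputs mean and log-variance
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, input_dim),
        )

    def forward(self, x):
        mu, logvar = self.encoder(x).chunk(2, dim=-1)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # reparameterization trick
        return self.decoder(z), mu, logvar


def loss_fn(x, x_hat, mu, logvar, beta=4.0):
    # Reconstruction term plus beta-weighted KL divergence to a standard normal prior;
    # beta > 1 puts extra pressure on the KL term to encourage disentangled latents.
    recon = ((x - x_hat) ** 2).sum(dim=-1).mean()
    kl = (-0.5 * (1 + logvar - mu ** 2 - logvar.exp()).sum(dim=-1)).mean()
    return recon + beta * kl


# Example training step on a batch of system states (e.g. Lorenz trajectories)
model = BetaVAE()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.randn(128, 3)  # placeholder batch; replace with actual trajectory data
x_hat, mu, logvar = model(x)
loss = loss_fn(x, x_hat, mu, logvar)
opt.zero_grad()
loss.backward()
opt.step()
```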
Goals/benefits:
- Working with machine learning and deep learning libraries in Python (pandas, scikit-learn, PyTorch)
- Becoming familiar with the analysis of time series (power spectra)
- Advancing research on an interdisciplinary problem
- Possibility to publish a research paper
Prerequisites:
- Machine learning and deep learning (advanced or intermediate skills)
- Python (advanced skills)
- Interest in interdisciplinary applications
Deliverables:
- Well-documented code
- Written report and oral presentation
References:
[1] D. Kingma, M. Welling, “Auto-encoding variational Bayes”, 2013
[2] I. Higgins, D. Amos, D. Pfau, S. Racaniere, L. Matthey, D. Rezende, A. Lerchner, “Towards a definition of disentangled representations”, 2018
Contact:
Eniko Székely ([email protected])
Natasha Tagasovska ([email protected])
Description:
Dynamical systems such as the climate are highly nonlinear, and although the observations are high-dimensional, most of the dynamics is captured by a small number of physically meaningful patterns. In this project we will apply unsupervised dimension reduction techniques for feature extraction, more specifically the Nonlinear Laplacian Spectral Analysis (NLSA) [1] approach, to extract features from potential vorticity at stratospheric levels. NLSA uses information about the past trajectory of the data and thus allows us to extract representative temporal and spatial patterns. We will compare the results with linear techniques such as Principal Component Analysis.
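As a simplified illustration of the general idea, the sketch below contrasts a Laplacian-eigenmaps embedding (via scikit-learn's SpectralEmbedding) of delay-embedded data with PCA; the placeholder data, lag length, and neighborhood size are assumptions, and the full NLSA algorithm additionally recovers spatial patterns through an SVD in the eigenfunction basis.

```python
# Simplified sketch: Laplacian eigenmaps on delay-embedded snapshots vs. PCA.
# This illustrates the general idea behind NLSA, not the complete algorithm.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import SpectralEmbedding


def delay_embed(X, q):
    """Stack q consecutive snapshots so each sample carries its recent history.

    X has shape (n_samples, n_features); the result has shape
    (n_samples - q + 1, q * n_features).
    """
    return np.hstack([X[i:len(X) - q + 1 + i] for i in range(q)])


# Placeholder data standing in for gridded potential vorticity snapshots
rng = np.random.default_rng(0)
X = rng.standard_normal((2000, 50))   # (time, grid points)

X_lag = delay_embed(X, q=30)          # lagged embedding injects temporal information

# Nonlinear temporal patterns: Laplacian eigenmaps on the lagged snapshots
emb = SpectralEmbedding(n_components=5, affinity="nearest_neighbors", n_neighbors=30)
nonlinear_patterns = emb.fit_transform(X_lag)

# Linear baseline: principal components of the same lagged data
linear_patterns = PCA(n_components=5).fit_transform(X_lag)

print(nonlinear_patterns.shape, linear_patterns.shape)  # (1971, 5) (1971, 5)
```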
Goals/benefits:
- Working with machine learning libraries in Python (pandas, scikit-learn)
- Advancing research on an interdisciplinary problem
- Possibility to publish a research paper
Prerequisites:
- Linear algebra
- Machine learning (intermediate skills)
- Python (intermediate skills)
- Interest in interdisciplinary applications
Deliverables:
- Well-documented code
- Written report and oral presentation
References:
[1] D. Giannakis, A.J. Majda, “Nonlinear Laplacian Spectral Analysis: Capturing intermittent and low-frequency spatiotemporal patterns in high-dimensional data”, Statistical Analysis and Data Mining, 2012
[2] E. Székely, D. Giannakis, A.J. Majda, “Extraction and predictability of coherent intraseasonal signals in infrared brightness temperature data”, Climate Dynamics, 2016
[3] M. Belkin, P. Niyogi, “Laplacian eigenmaps and spectral techniques for embedding and clustering”, NeurIPS, 2001
Contact:
Eniko Székely ([email protected])
Raphaël de Fondeville ([email protected])