Student projects

At LIONS, we are concerned with optimized information extraction from signals or data volumes. We therefore develop mathematical theory and computational methods for information recovery from highly incomplete data.

We divide our research into two synergistic theory thrusts: information scalable optimization and data acquisition, and learning theory and methods for low-dimensional signal models.

Our team of Postdocs and Phd students work on different aspects of these interesting and challenging problems, from theoretical analysis of methods and algorithms to efficient implementations to solve real-world problems.

Choosing our laboratory for your project will enable you to work in an exciting and stimulating environment, allow you to quickly learn state-of-the-art mathemathical and algorithmic tools and deliver an outstanding project.

Below, you can find some project that our team is currently working on and would appreciate your collaboration. You are also welcome to come with your own project topic that matches with our expertise and interests.

Please contact Prof. Volkan Cevher for further information on the projects.

Below projects are mainly reserved for Master students. In some  cases they could also apply to Bachelor students.  Please contact directly each of the projects’ supervisor for details.  PhD students should consult LIONS open position page.

Open Projects

The collection and annotation of massive datasets (e.g. Cityscapes, ImageNet) has enabled vast progress in the challenging tasks (e.g. semantic segmentation, classification). The collection of such datasets is a gargantuan task, which is even impossible in some cases. For instance, in autonomous driving it is impossible to collect a dataset with every scene that the underlying model will face. Instead, we can rely on (unsupervised) domain adaptation [1]. The idea is to find a `related’ dataset that can act as the source domain. Then, we can use the annotated source dataset to learn the labels for the target domain we are interested in. Then, we can use the labels provided by the source domain to label the data in the target domain and thus learn the downstream task (e.g. semantic segmentation). However, the downstream task is prone to noise [2] in both the source domain and more importantly the target domain. The goal of this project is to develop techniques for making the downstream task robust to different types of noise [3, 4]. 

[1] Wang, M., & Deng, W. (2018). “Deep visual domain adaptation”: A survey. Neurocomputing312, 135-153.

[2]  Szegedy, C., Zaremba, W., Sutskever, I., Bruna, J., Erhan, D., Goodfellow, I., & Fergus, R. (2014). “Intriguing properties of neural networks”. ICLR.

[3] Chrysos, G. G., Kossaifi, J., & Zafeiriou, S. (2020). “RoCGAN: Robust Conditional GAN”. International Journal of Computer Vision128(10), 2665-2683.

[4] Laidlaw, C., Singla, S., & Feizi, S. (2021). “Perceptual adversarial robustness: Defense against unseen threat models”. ICLR.

The project can be  for both a bachelor or master thesis or for a semester project. Max 1-2 people. A student should specialise in Computer science and/or Data science.

If interested please contact Dr Grigorios Chrysos for details.

Synthesizing high dimensional signals, such as image or audio, has been a long standing problem in machine learning. On one hand, tremendous progress has been observed on the task of image synthesis [1-3]. On the other hand, synthesizing different types of signals, e.g. audio, has received little attention in the last few years. Generative Adversarial Nets (GANs) [1], which have exhibited stellar performance on image generation [2, 3], have also been partially used for audio synthesis [4]. However, this work considers the time-frequency representation of the signal (e.g. Short Time Fourier Transform) collapses the inherently complex information into a two-channel image. Our goal in this work is instead to develop a method that works in the complex domain. The project will involve a fair amount of modelling; familiarity with both complex analysis and neural networks is required. This model is expected to generalise beyond audio synthesis in tasks that have inherently complex computations, e.g. tasks that rely on Fourier analysis. 

[1] Goodfellow, I., et al. “Generative adversarial nets.”, NIPS 2014.

[2] Karras, T., et al. “A style-based generator architecture for generative adversarial networks.”, CVPR 2019.

[3] Chrysos, G., et al. “Deep Polynomial Neural Networks.”, T-PAMI 2021.

[4] Engel, Jesse, et al. “Gansynth: Adversarial neural audio synthesis.”, ICLR 2019.

The project can be  for both a bachelor or master thesis or for a semester project. It is an individual project.  A student should specialise in Computer science and/or Data science.

If interested please contact Dr Grigorios Chrysos for details.
Even though multiplicative and higher-order correlations are ubiquitous in machine learning and related fields, their inductive biases are not well understood yet. Higher-order polynomials have demonstrated success when trained on specific tasks, however what is the inductive bias of the polynomial structure when untrained? The seminal study of ‘Deep Image Prior’ [1] sheds light on feed-forward neural networks, however a related study is yet to be conducted for polynomial neural networks. The study of ‘Multiplicative Interactions and Where to Find Them’ [2] proves that second-order interactions enlarge the hypothesis space of feed-forward networks, hence the untrained polynomial networks should substantially improve the results of untrained feed-forward networks. An additional focus of the project is to illustrate the different inductive biases of polynomial network variants [3]. 
 
[1] ‘Deep Image Prior’, Dmitry Ulyanov, Andrea Vedaldi, Victor Lempitsky, CVPR 2018.
[2] ‘Multiplicative Interactions and Where to Find Them’, Siddhant M. Jayakumar, Wojciech M. Czarnecki, Jacob Menick, Jonathan Schwarz, Jack Rae, Simon Osindero, Yee Whye Teh, Tim Harley, Razvan Pascanu, ICLR 2020.
[3] ‘Π−nets: Deep Polynomial Neural Networks’, Grigorios G. Chrysos, Stylianos Moschoglou, Giorgos Bouritsas, Yannis Panagakis, Jiankang Deng, Stefanos Zafeiriou, CVPR 2020.
 
If interested please contact Dr Grigorios Chrysos for details.
Using polynomial expansion, such Taylor series, is a well-established tool in functional analysis for approximating functions. Even though in machine learning, higher-order correlations (and even polynomial expansions) have emerged for approximating functions, their properties have not been studied yet. Specifically: 
a) the initialization of the network is a crucial component for learning (deep) feed-forward neural networks [1], [2]. There has been no theoretical analysis to date on the initialization of polynomial neural networks.
b) terms of high-degree might be unstable outside the [-1, 1] value range. A study of the stability in such cases will be significant. New normalization schemes [3] might be beneficial for polynomial neural networks. 
 
[1] ‘Understanding the difficulty of training deep feedforward neural networks’, Xavier Glorot, Yoshua Bengio, AISTATS 2010.
[2] ‘Delving deep into rectifiers: Surpassing human-level performance on imagenet classification’, Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun, ICCV 2015.

[3] ‘Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift’, Sergey Ioffe, Christian Szegedy, ICML 2015.

If interested please contact Dr Grigorios Chrysos for details.

In a distributed setting, gradients are compressed to reduce communication costs. Examples of compression methods are signSGD, QSGD, and NUQSGD. Over the network, transmitted bits (e.g., sign of each gradient coordinate sent by a compute node) can be flipped either due to network unreliability or due to manipulation by an adversary. The most basic scenario is a parameter-server setting where the uplink is noisy (the adversary changes some bits randomly). A fundamental question is how many extra bits (redundancy as a price for robustness) we should add to have a bounded variance in terms of aggregated updates considering two sources of randomness (mini-batch and uncertainty over networks).

This semester  project is available for master students only. It is an individual project.  A student should mainly specialise in Cyber security and Communication systems.

If interested please contact Dr Ali Ramezani for details.

Minimax settings have attracted significant attention due to the success of generative adversarial networks (GANs) and adversarial training. While the underlying optimization algorithms have been studied extensively in terms of their convergence (optimization error), their generalization to an unseen example, remained an open problem. Generalization error depends on the design of the learning algorithm and the characteristics of the loss function. Unlike generalization analyses which are based on the complexity of the hypothesis class, the stability analysis focuses on how the learning algorithm explores the hypothesis class. Two main stability notions in the literature are uniform and on-average stability. Developing stability bounds for minimax settings is an important problem in machine learning.

This semester  project is available for master students only. It is an individual project.  A student should mainly specialise in Computer Science but it could  also be avaialble for other fields (Cyber security, Communication systems and/or Data Science).

If interested please contact Dr Ali Ramezani for details.

Recently the better generalisation capabilities of ensembles of independently trained networks were explained by the networks learning distinct high level features [1]. However, it remains an open problem whether training a single network can achieve the same accuracy without distillation. Dropout provides one method of implicitly training an ensemble but is insufficient in its vanilla stochastic variant. This project intends to investigate whether an adversarial dropout scheme, where adversarially picked features are dropped during training, can provide a stronger ensemble. This follows the growing line of work employing minmax dynamics to train more robust models. Familiarity with PyTorch and training neural networks is recommended.

[1] Towards Understanding Ensemble, Knowledge Distillation and Self-Distillation in Deep Learning, Allen-Zhu and Li 2020

This semester  project is available for master students only. It is an individual project.  A student should specialise in Computer science, Cyber security, Communication systems and/or Data Science.

If interested please contact Thomas Pethick for details.

The double descent phenomenon is well studied in classification and is largely explained away by the so called interpolation regime [1,2]. However, the role of the interpolation regime is less lucid in Generative Adversarial Networks (GANs) both because of the interaction due to the minmax structure but also the stochasticity of the latent variable. This project aims to empirically investigate the effect of the interpolation capacity of both the discriminator and generator. Familiarity with PyTorch and training neural networks is recommended.

[1] Reconciling modern machine learning practice and the bias-variance trade-off, Belkin et al. 2019

[2] Deep Double Descent: Where Bigger Models and More Data Hurt, Nakkiran et al. 2019
 

This semester  project is available for master students only. It is an individual project.  Student should specialise in Computer science, Cyber security, Communication systems and/or Data Science.

If interested please contact Thomas Pethick for details.

Langevin dynamics is a first-order sampling algorithm that has gained a lot of attention in the last years. Its similarity to gradient descent in optimisation makes us wonder whether we can transpose existing technique in optimization to sampling. In particular, we would like to design an adaptive version of Langevin dynamics, in the similar fashion as AdaGrad, and prove convergence guarantees with similar improvement as in optimization.

If interested please contact Paul Rolland for details.

This project concerns  the implementation of a differentiable spice
compatible circuit simulator in order to enable direct optimisation of
circuit parameters through gradient descent.
 
Required skills are a solid background in modern C++ programming and a willingness to learn Rust.
 

The project can be  for both a bachelor or master thesis or for a semester project.  A student should specialise in Computer science, Cyber security, Communication systems and/or Data Science.

If interested please contact Igor Krawczuk for details.
 
Adversarial training has had great success to achieve robustness in supervised neural networks as well as achieving sim2real transfer for deep RL. However, constraining the adversary in a way that yields robust and useful models is challenging, since the standard Lp or divergence based adversary constrains are difficult to interpret in real world dynamics. This project focuses on constraining the adversary directly via an interpretable surrogate model in order to yield robustness in real world settings.
 
Familiarity with pytorch, tensorflow, jax or julia required.
 

The project can be  for both a bachelor or master thesis or for a semester project.  A student should specialise in Computer science, Cyber security, Communication systems and/or Data Science.

If interested please contact Igor Krawczuk for details.
Invertible neural networks offer a way to study the latent variables and improve interpretability by ensuring all information found in the input is preserved. This project will study invertible dimensionality reduction methods as a means to gain deeper insights into the invariances and symmetries learned by deep neural networks.
 
Familiarity with pytorch, jax or julia required.
 

The project can be  for both a bachelor or master thesis or for a semester project.  A student should specialise in Computer science, Cyber security, Communication systems and/or Data Science.

If interested please contact Igor Krawczuk for details.
Sample efficiency and exploration are two key challenges in the use of RL for real world problems. Projects in this research direction will focus on leveraging deep neural networks to improve sample efficiency by planning trajectories for exploration and exploitation.
 
Familiarity with RL, and pytorch, tensorflow, jax or julia required,
Rust/C++ and GNN familiarity are nice-to-haves..
 

The project can be  for both a bachelor or master thesis or for a semester project.  A student should specialise in Computer science, Cyber security, Communication systems and/or Data Science.

If interested please contact Igor Krawczuk for details.
The training dynamics of deep neural networks are an open field of research. Projects in this direction will use techniques from data compression, causal inference, dynamical systems and optimisation theory to design benchmarks for , collect datasets on and study the evolution of parameters, gradients and other relevant information during the optimisation process with the goal to enable a more in depth understanding.
 
Familiarity with pytorch, tensorflow, jax or julia required, experiences with Rust, dynamical systems/optimisation analysis,compression are nice-to-haves.
 

The project can be  for both a bachelor or master thesis or for a semester project.  A student should specialise in Computer science, Cyber security, Communication systems and/or Data Science.

If interested please contact Igor Krawczuk for details.
Combinatorial and discrete optimisation sadly combine the properties of:  I) often being NP or NP-hard II) being ubiquitous in real world engineering and decision making problems like satisfiability, travelling salesmen, knapsack etc. This line of research explores the use of learned data driven heuristics and algorithms on these problems, with applications focused on circuit design, experimental design and multi-agent games.
 
Familiarity with (pytorch,jax and/or Julia) and (Rust and/or C++) required,as well as a basic knowledge of optimisation and graph theory.
 

The project can be  for both a bachelor or master thesis or for a semester project.  A student should specialise in Computer science, Cyber security, Communication systems and/or Data Science.

If interested please contact Igor Krawczuk for details.

Machine learning classifiers working on high dimensional data exhibit susceptibility to so called “adversarial examples”, wildly misidentifying inputs presented to them when modified with specifically crafted  small perturbations. While part of this behavior has been explained by the concentration of measure phenomenon and types of features learned by the classifiers, there seem to be additional drivers of adversarial risk not yet captured by current frameworks. This project – which can be adapted to either Masters or Semester projects – investigates these drivers, aiming at mathematical analysis, the creation of toy problems and benchmarks which can be used to gauge the sources of adversarial susceptibility a given model contains, as well as their severity. This is a challenging project and for both master or semester projects, you should be comfortable with Julia, pytorch or tensorflow, as well as have general background on the field.

If interested please contact Igor Krawczuk for details

Characterising novel circuit elements and developing compact behavioural models is an important step before they can be used in circuit design. However creating and tuning these models can be a tedious process. This project is a first exploration of using a SPIRAL inspired algorithm for the synthesis of compact models by synthesising SPICE circuit models.

If interested please contact Igor Krawczuk for details

This is part of a large scale project concerned with studying way to fully automate the design and layouting of ASICS. Possible projects touch on reinforcement learning, generative models, representation learning and theoretical work on how to represent and decompose action and state spaces.

If interested please contact Igor Krawczuk for details

Neuromorphic  machine  learning  hardware  uses device physics  like  the  switching  dynamics  of memristive devices or the charge dynamics of a capacitor to perform computation. To estimate the efficacy of neuromorphic machine learning hardware, one needs to marry together two computationally intensive tasks: we need to run machine learning algorithms on problems  representative of those faced in reality, while at the same time modeling the device physics used to perform the computations. This is somewhat unique workload since the total  training  time  will  be  orders  of  magnitude  longer  than  what  is  easily  feasible  in  circuit simulations, while possibly depending on the interactions of varying subsets of individual devices. This means that we will need to mix both abstract and detailed models, since global abstraction into behavioral or probabilistic models is risky.  To accommodate this special need, we  are developing a multi-model simulation framework.

This framework is is intended to minimize the computational load for the physical simulations by properly adapting the model to the desired accuracy and scale well onto GPU and cluster environments. In this project you will contribute to this framework by collecting existing device  models  for memristive devices from the literature and implement them as compute kernels. The project complexity can be scaled to both Master and Semester projects. Estimated workload breakdown: 

30 % literature survey

50 % model selection and implementation in Julia 

20 % Integration into framework and verification

If interested please contact Igor Krawczuk for details

Neural network models are increasingly deployed in safety critical and data sensitive  settings, for example in medical diagnostics. This project concerns the study of the interactions of various strategies to learn disentangled latent representations with interpretability and techniques meant to help user understand and reason about the features used by the network to give a certain output, their sensitivity to various forms of adversarial perturbations and distributional shifts as well as their susceptibility to attacks like the model inversion attack which is a concern when private data is used to train widely distributed models.

If interested please contact Igor Krawczuk for details

The project can be for a master thesis or for a semester project.

Details of the project can be found here 

If interested please contact Thomas Sanchez for details

The project can be for a master thesis or for a semester project.

Details of the project can be found here 

If interested please contact Thomas Sanchez for details


Closed projects

This projects aims to create a Python library that collects many real world variational inequality (saddle point problem) instances and some popular solvers.

If interested please contact Yurii Malitskyi for details

It has been empirically shown that carefully ordering the training examples improve the learning speed of SGD based algorithms. There very few works that attempt to provide some theoretical insights into this phenomenon. In this project, we aim to investigate (both theoretically and empirically) the curriculum strategies for adaptive optimization methods. 

If interested please contact Parameswaran Kamalaruban for details.

This projects aims to find a robust optimization method that will perform well in many scenarious for deep learning. A student should be already familiar how to train neural networks.

If interested please contact Yurii Malitskyi for details

Bundle methods for non-smooth optimisation were widely used in the literature [1,2,3]. The main idea of this method is to construct a piecewise linear approximation to the objective function. This construction is based on the sub-gradient of the previous iteration.

The objective of this project is to develop stochastic bundle methods and investigate their convergence, for solving non-smooth optimization of minimizing the sum of two functions f+g, where f is proximable and g can be decomposition into the finite sum of g_i.

References:

[1] Yu Du, Andrzej Ruszczynski, Rate of Convergence of the Bundle Method

[2] Alexander J. Smola, S.V. N. Vishwanathan, Quoc V. Le, Bundle Methods for Machine Learning.

[3] C. H. Teo, S.V. N. Vishwanathanm A. Smola, Quoc. V.le, Bundle Methods for Regularized Risk Minimization, Journal of Machine Learning Research 11 (2010) , 311-365

The high-throughput screening of large databases of novel materials candidates constitutes a central goal of the recently awarded MARVEL NCCR grant. Given a training set of compounds with pre-calculated quantum mechanical properties, we seek to construct supervised machine learning models that accurately infer the corresponding properties for similar materials with correctness guarantees [1-4].

References

[1] A. P. Bartók, R. Kondor and G. Csányi , On representing chemical environments, Phys. Rev.., vol. 87, pp. 184115-1–184115-16, 2013.

[2] S. De, A. P. Bartók, G. Csányi and M. Ceriotti, Comparing molecules and solids across structural and alchemical space, Phys. Chem. Chem. Phys., vol. 20, pp. 13754–13769, 2016.

[3] Y. Nesterov, Introductory lectures on convex optimization. A basic course. Kluwer Academic Publishers,Boston, MA, 2004.

[4] M. Rupp, Machine Learning for Quantum Mechanics in a Nutshell, Quantum Chemistry, vol. 115(16), pp. 1058–1073, 2015.

We consider the problem of finding a global maximizer (or minimizer) of an unknown objective function from point evaluations. Bayesian optimization prescribes a prior belief over the possible objective functions and then sequentially refines this model as data are observed.

The student will develop new methods for Bayesian optimization that are both theoretically well founded and practically applicable.The methods will be tested on various real-world problems such as finding a good set of hyperparameters for deep neural networks or discovering molecules with good chemical properties.

Keywords: Bayesian optimization, Gaussian process, Multi-armed bandits

Requirements: Python, machine learning, familiarity with multi-armed bandits and submodular optimization is a plus.

References: [4], [1], [3], [2]

References

[1] Ilija Bogunovic, Jonathan Scarlett, and Volkan Cevher. Time-Varying Gaussian Process Bandit Optimization. In International Conference on Artifcial Intelligence and Statistics (AISTATS), 2016.

[2] Ilija Bogunovic, Jonathan Scarlett, Andreas Krause, and Volkan Cevher. Truncated Variance Reduction: A Uni_ed Approach to Bayesian Optimization and Level-Set Estimation. In Conference on Neural Information Processing Systems (NIPS), 2016.

[3] Josip Djolonga, Andreas Krause, and Volkan Cevher. High-dimensional gaussian process bandits. In C. J. C. Burges, L. Bottou, M. Welling, Z. Ghahramani, and K. Q. Weinberger, editors, Advances in Neural Information Processing Systems 26, pages 1025-1033. Curran Associates, Inc.,2013.

[4] Bobak Shahriari, Kevin Swersky, Ziyu Wang, Ryan P Adams, and Nando de Freitas. Taking the human out of the loop: A review of bayesian optimization. Proceedings of the IEEE, 104(1):148-175, 2016.

Submodular functions are set functions, i.e., functions defined on the power set 2V of a given ground set V,, that satisfy a diminishing return property. They occur naturally in several applications in computer vision, machine learning, economics and theoretical computer science. Submodular functions have very interesting links to convex analysis and can be minimized exactly and maximized approximately by efficient algorithms with provable guarantees.

The objective of this project is to develop methods to improve on the theoretical and/or practical efficiency of submodular minimization/maximization algorithms in offline or streaming settings.

Requirements: MATLAB, familiarity with Submodular and Convex Optimization, and Machine Learning.

References:

[1] Fujishige, Satoru. \Submodular functions and optimization”. Vol. 58. El-sevier, 2005.

[2] Bach, Francis. \Learning with submodular functions: A convex optimization perspective.” arXiv preprint arXiv:1111.6453 (2011).

[3] Iyer, Rishabh, and Je_ Bilmes. \Polyhedral aspects of Submodularity, Convexity and Concavity.” arXiv preprint arXiv:1506.07329 (2015).

  1. Low-power analog-to-digital conversion for multichannel neural recording
  2. Hardware-friendly selective feature extractors for neural classification tasks
  3. Ultra-wideband (UWB) wireless data transfer for neural recording implants

If interested, please submit your applications to [email protected] with your CV, statement of results and a short paragraph describing your motivation and background to undertake the project.

Details of the above projects can be found here

We are interested in reducing the scan times in Magnetic Resonance Imaging (MRI) using compressive sensing techniques. For this purpose, we will design sampling mechanisms via a new learning-based approach, which unifies learning theory and the sampling theory. We also consider extensions to multi-coil MRI and provide demonstrations with real scans.

Keywords: Compressive Sensing, Fourier Transform, Image Reconstruction.

Requirements: MATLAB, Signal/Image Processing knowledge, convex and combinatorial optimization

In the inverse reinforcement learning (IRL) setting, the reward of the environment is not defined. Instead, the agent is provided with a set of expert demonstrations, and the agent tries to recover a reward function such that the expert behavior is optimal. Typically in IRL, the transition dynamics of the environment are assumed to be known. And only a few recent works attempt to learn both the reward and the transition dynamics via little interaction with the environment or in a truly-batch manner. In this project, we revisit this problem more systematically in light of new advancements in model-based RL. 

Person of  contact was Parameswaran Kamalaruban

In the inverse reinforcement learning (IRL) setting, the reward of the environment is not defined. Instead, the agent is provided with a set of expert demonstrations, and the agent tries to recover a reward function such that the expert behavior is optimal. Consider a setting where expert demonstrations are provided for only a few environments. Based on these demonstrations, we aim to learn a policy that is robust against a larger class of environments (MDPs).

Person of contact was  Parameswaran Kamalaruban

With the massive increase of available data for machine learning research, designing optimization algorithms that could handle large volumes of data is getting more and more important. For that purpose, we are interested in developing convex optimization algorithms that can be used in distributed environment.

In this project, the algorithms will be implemented and tested using Google’s machine learning library TensorFlow.

Keywords: convex optimization, distributed optimization, tensorflow

Requirements: Python, C++, familiarity with optimization is a plus

Details of the project can be found here: Neural Network Based Hardware Decoder Design

“Although this may seem a paradox, all exact science is dominated by the idea of approximation.”

–Bertrand Russell (1872–1970)

We are interested in developing approximation algorithms with provable estimation guarantees for structured sparse representations of spatial images and temporal data streams.

Students with experience in Hadoop MapReduce and/or Cuda programming are given priority.

Keywords: approximation guarantee, dynamic programming, greedy algorithm, submodularity, union-of-subspace models, randomized algorithms

Requirements: Matlab, C/C++

We are interested in developing algorithms that enable robust decompositions of large-scale data matrices for learning applications.

Keywords: randomized algorithms, fixed-rank approximation problem, approximation guarantees

Requirements: Matlab or C/C++

We are interested in studying non-iid, parametric and non-parametric prior distributions with almost sure compressibility properties. This project pays particular attention to exchangeable distributions defined over binary and quad-trees.

Keywords: compressible priors, order statistics, almost sure convergence, De Finetti’s theorem

Requirements: Matlab, C/C++

We are interested in the application of extremal combinatorial objects in compressive sensing problems, such as randomness extractors and expanders. This project pays particular attention to a recent development in information theory: polar codes.

Keywords: randomness extractors, expanders, polar codes, compressive sensing

Requirements: Matlab or C/C++

We are interested in characterizing the (set) restricted singular values of pseudo-random matrices.

Keywords: matrix martingales, tail bounds, concentration-of-measures

Requirements: Matlab, C/C++

We are interested in making inferences from biological data in order to understand the processes that cause a disease, potentially leading to the identification of therapies.

Within the machine learning framework of supervised linear regression, we want to exploit structured sparse models to both improve prediction performance and interpretability of the results. Structured sparse model encompass models such as hierarchical trees, group structures or union of subspaces model, which offer the possibility of estimating a linear model that can be more easily interpreted by clinicians.

Our aim is to develop and characterize new algorithms and structures to be applied to cancer data and neuroimaging data. This project had both theoretical and computational aspects.

In the classical Markowitz portfolio optimization, one is interested in allocating a fixed amount of capital among a set of assets so as to minimize the risk/variance of the portfolio return, subject to a target empirical return. Assuming that the true mean and covariance of the set of assets are known, the Markowitz portfolio problem has a closed form solution. In practice though, this assumption is unrealistic and we are usually restricted to some empirical estimates of the mean and the covariance from recorded data. Unfortunately, such compromise introduces several problems: in general, financial data are well-known to be non-stationary (which limits the amount of meaningful data in the estimates) while classic ML techniques are robust only in the asymptotic sense where a vast amount of data is required for a good estimate. We aim to improve on these approaches by exploiting other properties of the data, such as the sparsity in the covariance matrix or other low-dimensional assumptions.

During this project, the student learn modern techniques in sparse covariance matrix estimation as well as new scalable optimization algorithms. Our aim is to develop and characterize covariance estimators from a limited amount of data, leveraging sparse promoting and other low-dimensional models in the estimation. As part of the project, the student applies this knowledge to real world data and track progress on the performance of the designed portfolios. This project has both theoretical and computational aspects.

Recovering a large matrix from a small subset of its entries is a challenging problem arising in many real applications, such as recommender systems. Many existing approaches formulate this problem as a general low-rank matrix approximation problem, which may be used to impute unobserved entries. This approach leads to good performance even when very few are known.

We are interested in bandit optimization in this setting which introduces the exploration/exploitation tradeoff, where we actively decide on the sampling positions to optimize a given convex criteria.

We propose a dimensionality reducing matrix design based on training data with constraints on its Frobenius norm and number of rows. Our design criteria is aimed at preserving the distances between the data points in the dimensionality reduced space as much as possible relative to their distances in original data space. This approach can be considered as a deterministic Bi-Lipschitz embedding of the data points.

We introduce a scalable learning algorithm, dubbed AMUSE, and provide a rigorous estimation guarantee by leveraging game theoretic tools. We also provide a generalization characterization of our matrix based on our sample data. We use compressive sensing problems as an example application of our problem, where the Frobenius norm design constraint translates into the sensing energy.

Link to publication.

We consider the problem of solving large-scale optimization problems with matrix decision variables. For such problems, classical convex optimization algorithms scale poorly even at moderate data sizes. On the other hand, through cleverly employing the structures of real-world problems, non-convex alternating minimization methods succeed at providing efficient and accurate solutions, albeit sometimes without theoretical guarantees.

In this project, the student will develop novel alternating minimization algorithms, and examine their theoretical convergence. The student will test the applicability of the new methods on various real-world problems including phase retrieval and matrix completion.

Keywords: Alternating minimization, Phase retrieval, Matrix completion.

Requirements: Matlab or Python, basic knowledge in optimization such as gradient descent and its convergence rate, familiarity with phase retrieval or matrix completion is a plus.

References: [1], [2], [3], [4].

References

[1] Emmanuel J Candes, Xiaodong Li, and Mahdi Soltanolkotabi. Phase retrieval via wirtinger ow: Theory and algorithms. IEEE Transactions on Information Theory, 61(4):1985-2007, 2015.

[2] David E Carlson, Volkan Cevher, and Lawrence Carin. Stochastic spectral descent for restricted boltzmann machines. In AISTATS, 2015.

[3] Moritz Hardt. Understanding alternating minimization for matrix completion. In Foundations of Computer Science (FOCS), 2014 IEEE 55th Annual Symposium on, pages 651-660. IEEE, 2014.

[4] Prateek Jain, Praneeth Netrapalli, and Sujay Sanghavi. Low-rank matrix completion using alternating minimization. In Proceedings of the forty-fifth annual ACM symposium on Theory of computing, pages 665-674. ACM, 2013.