Ongoing Student Projects

The following projects are currently being pursued by students in our lab and are therefore no longer available. They are listed here for reference and inspiration.

GeL in centralized setting

Contact: Dhasade Akash Balasaheb <[email protected]>

The GeL algorithm leverages the local momentum of clients in federated learning (FL) to perform additional learning through guessed steps, speeding up convergence. While GeL was primarily designed for FL, centralized settings can also benefit from a guessing mechanism. In these settings, the mathematics of GeL reveal an interesting connection between the update rule and Nesterov momentum: it resembles Nesterov momentum applied once over multiple steps (see the sketch after the goal list below). An empirical assessment of this connection and a successful application to centralized settings would also enable us to apply GeL on the server side of FL. In summary, this project has the following goals:

  1. Explore the connection to Nesterov momentum in centralized setups
  2. Server side guessing in FL
  3. Study GeL under IID data
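
A minimal sketch of the guessing idea in a centralized setup is shown below. It assumes the guessed steps simply keep moving along the decayed momentum buffer without computing new gradients; the function name and the exact update rule are illustrative and may differ from GeL.

```python
import torch

def momentum_step_with_guessing(params, grads, velocity, lr=0.1, mu=0.9, num_guesses=2):
    """One regular momentum step followed by gradient-free 'guessed' steps.

    Illustrative only: the exact GeL update rule may differ.
    params, grads, velocity: lists of plain torch tensors of matching shapes.
    """
    # Regular heavy-ball momentum step using the fresh gradient.
    for p, g, v in zip(params, grads, velocity):
        v.mul_(mu).add_(g)        # v <- mu * v + g
        p.add_(v, alpha=-lr)      # p <- p - lr * v

    # Guessed steps: keep moving along the momentum direction without new
    # gradients; the buffer is decayed further after every guess.
    guesses = [v.clone().mul_(mu) for v in velocity]
    for _ in range(num_guesses):
        for p, d in zip(params, guesses):
            p.add_(d, alpha=-lr)  # p <- p - lr * mu^k * v
            d.mul_(mu)
```

Summing the guessed displacements gives roughly lr * mu * (1 - mu^K) / (1 - mu) times the momentum buffer, i.e. an extra move along the momentum direction that is reminiscent of the look-ahead in Nesterov momentum; quantifying this connection empirically is part of goal 1.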

Rex – Other ML problems

Contact: Rafael Pires <[email protected]>

Rex is a decentralized recommender system that leverages Trusted Execution Environments (TEEs) to enable data sharing and achieve faster convergence. Thanks to the privacy guarantees of TEEs, the idea of data sharing is more general and can be applied beyond recommender systems. This project envisions a Rex++ system that can train ML models for a variety of learning tasks, including image classification, text prediction, and generative modeling. However, for these tasks, data cannot be blindly shared, since the size of a data sample (an image or a text) is much larger than that of a recommender data sample (a <user, item, rating> 3-tuple). This demands additional techniques, for example using an auto-encoder to compress images. Finally, we would also like to study whether data sharing and model sharing can be used together, adaptively switching from one to the other based on the underlying resource constraints.
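
To make the compression idea concrete, the sketch below encodes images with a small auto-encoder and shares only the latent codes. The architecture, dimensions, and function names are hypothetical placeholders, not part of Rex.

```python
import torch
import torch.nn as nn

class TinyAutoEncoder(nn.Module):
    """Compresses flattened 28x28 images into a small latent code (hypothetical architecture)."""

    def __init__(self, latent_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Flatten(),
            nn.Linear(28 * 28, 128), nn.ReLU(),
            nn.Linear(128, latent_dim),
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 128), nn.ReLU(),
            nn.Linear(128, 28 * 28),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

def compress_for_sharing(autoencoder, images):
    """Return compact latent codes that could be shared instead of raw images."""
    with torch.no_grad():
        return autoencoder.encoder(images)
```

The decoder lets the receiving peer reconstruct approximate images for training, trading some fidelity for a much smaller payload.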

Data sharing for tackling non-IIDness

Contact: Rafael Pires <[email protected]>

Sharing data can save substantial network resources, and it can be implemented in a privacy-preserving manner [Rex]. However, it can do more than that. A recurrent problem in federated and decentralized learning is data heterogeneity. This project aims at identifying strategic data-sharing techniques that quickly attenuate the effects of data heterogeneity, ideally achieving faster convergence.
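
As a naive baseline for such a strategy, the sketch below has every client contribute a small random fraction of its samples to a shared pool and mixes the pool into every local dataset. The dataset interface and the strategy itself are placeholders; finding smarter, more targeted strategies is precisely the goal of the project.

```python
import random
from collections import Counter

def share_fraction(client_datasets, fraction=0.05, seed=0):
    """Naive strategy: every client contributes a small random fraction of its
    samples to a shared pool, which is then mixed into every local dataset.

    client_datasets: list of lists of (x, label) pairs (hypothetical interface).
    """
    rng = random.Random(seed)
    pool = []
    for data in client_datasets:
        k = max(1, int(fraction * len(data)))
        pool.extend(rng.sample(data, k))
    # Every client augments its local data with the shared pool.
    return [data + pool for data in client_datasets]

def label_histogram(dataset):
    """Label distribution of a dataset; useful to quantify heterogeneity before and after sharing."""
    counts = Counter(label for _, label in dataset)
    total = sum(counts.values())
    return {label: c / total for label, c in counts.items()}
```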

Decentralized learning – pipelining computation and communication

Contact: Rafael Pires <[email protected]>

Data sharing can boost system performance in decentralized learning. Moreover, data sharing is independent of the training task, so the two are trivially parallelizable. This project aims at investigating efficient ways to pipeline data sharing with training in order to speed up decentralized learning.
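
A minimal sketch of such a pipeline is shown below: a background thread exchanges data with peers while the main thread keeps training, and received samples are absorbed between rounds. All callables are placeholders supplied by the caller, not an existing decentralizepy or Rex API.

```python
import threading

def train_with_background_sharing(train_step, exchange_data, absorb_data, rounds=10):
    """Overlap data sharing with training (sketch; all callables are placeholders).

    exchange_data(): returns a list of samples received from peers since the last call.
    absorb_data(samples): merges received samples into the local dataset.
    train_step(): performs one round of local training.
    """
    buffer, lock = [], threading.Lock()
    stop = threading.Event()

    def sharing_loop():
        while not stop.is_set():
            received = exchange_data()
            with lock:
                buffer.extend(received)

    sharer = threading.Thread(target=sharing_loop, daemon=True)
    sharer.start()

    for _ in range(rounds):
        with lock:                 # absorb whatever arrived so far
            pending, buffer[:] = list(buffer), []
        if pending:
            absorb_data(pending)
        train_step()               # training overlaps with communication

    stop.set()
    sharer.join(timeout=1.0)
```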

Federated and Decentralized learning – layerwise partial model sharing

Contact: Sharma Rishi <[email protected]>

Sparsification algorithms for partial model sharing have been shown to reduce communication costs in federated and decentralized settings. Generally, these algorithms operate at the scale of the entire model, without discriminating between kinds of parameters. However, the parameters in different layer types of deep neural networks, such as convolutional, recurrent, and fully-connected layers, behave very differently from one another. Sparsification algorithms can therefore be specialized to these layer types. This project aims at improving the performance of sparsification algorithms in these settings by specializing them to commonly used neural network layers. The project also includes studying the sensitivity of the model output to changes in parameters, and how this sensitivity impacts partial model sharing.
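
The sketch below contrasts a whole-model top-k mask with a layer-wise variant in which each layer type keeps its own fraction of parameters. The name-matching scheme and the fractions are illustrative assumptions, not an established algorithm.

```python
import torch

def global_topk(model, fraction=0.01):
    """Keep the `fraction` largest-magnitude parameters across the whole model."""
    flat = torch.cat([p.detach().abs().flatten() for p in model.parameters()])
    k = max(1, int(fraction * flat.numel()))
    threshold = torch.topk(flat, k).values.min()
    return {name: (p.detach().abs() >= threshold)
            for name, p in model.named_parameters()}

def layerwise_topk(model, fraction_per_layer):
    """Keep a (possibly different) fraction of parameters in each layer.

    `fraction_per_layer` maps a substring of the parameter name (e.g. 'conv',
    'fc') to a fraction; unmatched layers fall back to 1% (all hypothetical).
    """
    masks = {}
    for name, p in model.named_parameters():
        frac = next((f for key, f in fraction_per_layer.items() if key in name), 0.01)
        flat = p.detach().abs().flatten()
        k = max(1, int(frac * flat.numel()))
        threshold = torch.topk(flat, k).values.min()
        masks[name] = p.detach().abs() >= threshold
    return masks
```

Only the parameters selected by the mask would be exchanged with neighbors; studying which fractions suit which layer types is part of the project.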

Software Engineering – decentralizepy framework optimization

Contact: Sharma Rishi <[email protected]>

We have a multi-machine runtime framework for decentralized learning, decentralizepy, written in Python using PyTorch and NumPy. It offers multiple optimization and refactoring opportunities. In this project, we aim to make decentralizepy faster and more customizable.

Pointers:

  • Evaluation on the test set can be performed in parallel with the training steps (see the sketch after this list).
  • Different modules of the framework can be further decoupled to improve customizability.
  • A coordinator script can schedule processes on machines with low utilization.
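
As an example of the first pointer, the sketch below snapshots the model after each round and evaluates it in a separate process so that training does not block on the test pass. The function names are illustrative and do not correspond to the existing decentralizepy API.

```python
import copy
import multiprocessing as mp

def evaluate_snapshot(state_dict, result_queue):
    """Placeholder for the real test pass: rebuild the model from the snapshot,
    run it on the test set, and report the accuracy."""
    # model = build_model(); model.load_state_dict(state_dict)
    # accuracy = run_test_epoch(model, test_loader)
    accuracy = None  # stands in for the real test accuracy
    result_queue.put(accuracy)

def train_with_async_eval(model, train_one_round, rounds=10):
    """Snapshot the model after every round and evaluate it in a separate
    process so the next training round does not wait for the test pass."""
    results = mp.Queue()
    workers = []
    for _ in range(rounds):
        train_one_round(model)
        snapshot = copy.deepcopy(model.state_dict())  # decouple snapshot from subsequent updates
        worker = mp.Process(target=evaluate_snapshot, args=(snapshot, results))
        worker.start()
        workers.append(worker)
    for worker in workers:
        worker.join()
    return [results.get() for _ in workers]
```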

Necessary Background:

  • Machine Learning
  • Python
  • PyTorch
  • Computer Networks
  • Threads and Processes