Ongoing Student Projects

The following projects are currently being pursued by students in our lab and are therefore no longer available. They are listed for reference and inspiration.

Privacy-preserving KNN construction in decentralized recommender systems

Contact: Rafael Pires <[email protected]>

A k-nearest-neighbors (KNN) graph is a widely used data structure in machine learning for recommender systems. In a decentralized collaborative filtering system, every node computes its distance to other nodes using a similarity metric between their profiles (i.e., their preferences regarding the items with which they have interacted). Since these profiles reflect users’ tastes, exposing them to other nodes poses a serious privacy threat. In this project, we will compare the cost of building the KNN graph in decentralized systems when the Jaccard similarity between user profiles is computed with homomorphic encryption (HE) and with private set intersection cardinality (PSI-CA). In addition, we will propose improvements and explore other alternatives that are both efficient and privacy-preserving.
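
As a point of reference, the plaintext (non-private) baseline that HE and PSI-CA would have to replace can be sketched as follows. The profiles, node names, and function names are hypothetical and chosen only for illustration. The second function shows why knowing the intersection cardinality is enough: given the two profile sizes, the Jaccard similarity follows without revealing the items themselves.

```python
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity: intersection size over union size (0.0 if both are empty)."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def jaccard_from_cardinalities(size_a: int, size_b: int, intersection: int) -> float:
    """Same value, computed only from |A|, |B| and the intersection cardinality."""
    union = size_a + size_b - intersection
    return intersection / union if union else 0.0


def knn(node: str, profiles: Dict[str, Set[str]], k: int) -> List[str]:
    """Return the k nodes most similar to `node`; ties are broken by node name."""
    others = [n for n in profiles if n != node]
    others.sort(key=lambda n: (-jaccard(profiles[node], profiles[n]), n))
    return others[:k]


# Hypothetical profiles: the set of items each user interacted with.
profiles = {
    "alice": {"i1", "i2", "i3"},
    "bob": {"i2", "i3", "i4"},
    "carol": {"i7", "i8"},
    "dave": {"i1", "i2", "i3", "i4"},
}

neighbours = knn("alice", profiles, k=2)
```

In this toy example, alice's nearest neighbours are dave (similarity 0.75) and bob (similarity 0.5). A privacy-preserving variant would replace the call to `jaccard` with a protocol that never exposes the raw sets.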

Data sharing for tackling non-iidness

Contact: Rafael Pires <[email protected]>

Sharing data can save substantial network resources, and it can be implemented in a privacy-preserving manner [Rex]. However, it could do more than that. A recurrent problem in federated and decentralized learning is data heterogeneity (non-IID data). This project aims to identify strategic data-sharing techniques that can quickly attenuate the effects of data heterogeneity, ideally leading to faster convergence.

Federated and Decentralized learning – layerwise partial model sharing

Contact: Sharma Rishi <[email protected]>

Sparsification algorithms for partial model sharing have been shown to reduce communication costs in federated and decentralized settings. Generally, these algorithms operate on the model as a whole, without distinguishing between kinds of parameters. However, the parameters of different layer types in deep neural networks, such as convolutional, recurrent (RNN), and fully connected layers, behave very differently from one another. Sparsification algorithms can therefore be specialized to each kind of layer. This project aims to improve the performance of sparsification algorithms in these settings by specializing them to commonly used neural network layers. The project also includes studying how sensitive the model output is to changes in parameters, and how this sensitivity affects partial model sharing.
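
To make the idea concrete, here is a minimal sketch of layer-aware sparsification, assuming magnitude-based top-k selection with a separate budget per layer. Top-k is only one possible sparsifier, and the layer names, budgets, and function names are hypothetical. Plain Python lists stand in for flattened parameter tensors.

```python
from typing import Dict, List, Tuple

Update = Dict[str, List[float]]              # layer name -> flattened parameter deltas
Sparse = Dict[str, List[Tuple[int, float]]]  # layer name -> (index, value) pairs kept


def top_k(values: List[float], fraction: float) -> List[Tuple[int, float]]:
    """Keep the `fraction` of entries with the largest magnitude (at least one)."""
    k = max(1, int(len(values) * fraction))
    ranked = sorted(range(len(values)), key=lambda i: -abs(values[i]))
    return [(i, values[i]) for i in sorted(ranked[:k])]


def sparsify_layerwise(update: Update, budgets: Dict[str, float], default: float) -> Sparse:
    """Apply top-k per layer, using a layer-specific budget when one is given."""
    return {name: top_k(vals, budgets.get(name, default)) for name, vals in update.items()}


# Hypothetical update: a small convolutional layer and a larger fully connected one.
update = {
    "conv1": [0.5, -0.1, 0.05, -0.9],
    "fc": [0.01, 0.3, -0.02, 0.2, -0.4, 0.0, 0.1, 0.05],
}
# Assumed budgets: share half of the conv parameters but only a quarter of the fc ones.
shared = sparsify_layerwise(update, budgets={"conv1": 0.5, "fc": 0.25}, default=0.1)
```

A model-wide top-k would instead rank all twelve values together, so a layer with systematically smaller updates could end up sharing nothing. Per-layer budgets are one simple way to avoid that.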

Software Engineering – decentralizepy framework optimization

Contact: Sharma Rishi <[email protected]>

We have a multi-machine runtime framework for decentralized learning, decentralizepy, written in Python using PyTorch and NumPy. It offers multiple optimization and refactoring opportunities. In this project, we aim to make decentralizepy faster and more customizable.

Pointers:

  • Evaluation on the test set can be performed in parallel with the training steps.
  • Different modules of the framework can be further decoupled to improve customizability.
  • A coordinator script can schedule processes on machines with low utilization.
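
The first pointer can be illustrated with a toy sketch. It does not use the decentralizepy API: `train_step`, `evaluate`, and the dictionary-based model are hypothetical stand-ins. The key point is that evaluation runs on a frozen copy of the model while training continues.

```python
import copy
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Dict, List, Tuple

Model = Dict[str, float]


def train_step(model: Model) -> None:
    """Toy stand-in for one training step: moves the weight halfway towards 1.0."""
    model["w"] += 0.5 * (1.0 - model["w"])


def evaluate(snapshot: Model, test_set: List[float]) -> float:
    """Toy stand-in for test evaluation: mean squared error of w * x against x."""
    return sum((snapshot["w"] * x - x) ** 2 for x in test_set) / len(test_set)


def train(rounds: int, eval_every: int, test_set: List[float]) -> List[Tuple[int, float]]:
    """Train for `rounds` steps and evaluate every `eval_every` steps in the background."""
    model: Model = {"w": 0.0}
    pending: List[Tuple[int, Future]] = []
    with ThreadPoolExecutor(max_workers=1) as pool:
        for r in range(1, rounds + 1):
            train_step(model)
            if r % eval_every == 0:
                # Evaluate a frozen copy so later training steps cannot race with it.
                snapshot = copy.deepcopy(model)
                pending.append((r, pool.submit(evaluate, snapshot, test_set)))
        # Collect the losses once training is done.
        return [(r, f.result()) for r, f in pending]


results = train(rounds=4, eval_every=2, test_set=[1.0, 2.0])
```

A thread is used here for brevity. For CPU-bound evaluation in Python, a separate process may be preferable because of the global interpreter lock.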

Necessary Background:

  • Machine Learning
  • Python
  • PyTorch
  • Computer Networks
  • Threads and Processes

Collaborative Inference

Master’s thesis or Master’s semester project: 12 credits
Contact: Akash Dhasade ([email protected])

Recently, collaborative learning, in the form of Federated Learning (FL) and Decentralized Learning (DL), has emerged as an attractive alternative for addressing the need for scalability and data privacy. While attractive, these approaches incur large communication costs, and reducing these costs has been the primary focus of research. The goal of learning, however, has remained the same: to achieve good performance on test samples. In this project, we pursue this goal directly. We consider a setting where nodes are decentralized and possess data, but instead of collaborating to train, they collaborate to infer. This shift in perspective brings new benefits and significantly reduces the system costs involved in training. The project involves investigating techniques and algorithms for performing collaborative inference.
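
One simple baseline for "collaborating to infer" is to average the class probabilities returned by each node's local model. The sketch below assumes exactly that. It is an illustration only, not the method the project will develop, and the peer names and fixed predictions are hypothetical.

```python
from typing import Callable, Dict, List, Sequence

# A local model maps a feature vector to a list of class probabilities.
Predictor = Callable[[Sequence[float]], List[float]]


def collaborative_predict(x: Sequence[float], peers: Dict[str, Predictor]) -> int:
    """Query every peer's local model and return the class with the highest mean probability."""
    votes = [predict(x) for predict in peers.values()]
    n_classes = len(votes[0])
    mean = [sum(v[c] for v in votes) / len(votes) for c in range(n_classes)]
    return max(range(n_classes), key=lambda c: mean[c])


# Hypothetical peers whose local models disagree on a two-class problem.
peers: Dict[str, Predictor] = {
    "node_a": lambda x: [0.6, 0.4],
    "node_b": lambda x: [0.2, 0.8],
    "node_c": lambda x: [0.45, 0.55],
}

label = collaborative_predict([0.1, 0.7], peers)
```

Here node_a alone would predict class 0, while the averaged prediction is class 1. No model parameters are exchanged, only per-query outputs.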