Automatic Optimization Flow for Facebook’s Deep Learning Recommendation Model


Automatic hyperparameter optimization and architecture search of DLRM through Reinforcement Learning to minimize human effort

Team

  Atienza Alonso David
  Zapater Sancho Marina

Research Partners

Facebook Research


Recommendation systems play a significant role in internet services. With the success of Machine Learning and, in particular, Deep Learning (DL) in many application domains, recommendation systems are moving towards DL-based solutions. One of the most recent DL-based recommendation systems is Facebook's open-sourced Deep Learning Recommendation Model (DLRM). Like other DL models, such as Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs), DLRM consists of several layers, each with multiple hyperparameters that must be tuned to achieve the desired accuracy and inference time. However, neural architecture search and hyperparameter optimization are traditionally performed manually by experts and are very time-consuming.

Fig. 1: DLRM architecture
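
To make the tunable hyperparameters concrete, below is a minimal, hypothetical PyTorch sketch of a DLRM-style model: embedding tables for sparse (categorical) features, a bottom MLP for dense features, pairwise feature interactions, and a top MLP. The class name TinyDLRM, the layer widths, the embedding dimension, and the table sizes are illustrative assumptions, not the configuration used in this project.

```python
# Hypothetical sketch of a DLRM-style model; all sizes below are illustrative.
import torch
import torch.nn as nn

def mlp(sizes, final_relu=False):
    """Stack Linear (+ReLU) layers from a list of widths."""
    layers = []
    for i in range(len(sizes) - 1):
        layers.append(nn.Linear(sizes[i], sizes[i + 1]))
        if final_relu or i < len(sizes) - 2:
            layers.append(nn.ReLU())
    return nn.Sequential(*layers)

class TinyDLRM(nn.Module):
    def __init__(self, table_sizes, emb_dim, bottom_sizes, top_sizes):
        super().__init__()
        # One embedding table per sparse (categorical) feature.
        self.tables = nn.ModuleList(
            nn.EmbeddingBag(n, emb_dim, mode="sum") for n in table_sizes
        )
        self.bottom = mlp(bottom_sizes, final_relu=True)  # dense features -> emb_dim
        self.top = mlp(top_sizes)                         # interactions -> 1 logit

    def forward(self, dense, sparse):
        d = self.bottom(dense)                                       # (B, emb_dim)
        embs = [t(idx) for t, idx in zip(self.tables, sparse)]
        feats = torch.stack([d] + embs, dim=1)                       # (B, F, emb_dim)
        inter = torch.bmm(feats, feats.transpose(1, 2)).flatten(1)   # pairwise dot products
        return torch.sigmoid(self.top(torch.cat([d, inter], dim=1)))

# Illustrative hyperparameters: 13 dense features, 3 sparse features, emb_dim 16.
model = TinyDLRM(
    table_sizes=[1000, 500, 200],
    emb_dim=16,
    bottom_sizes=[13, 64, 16],
    top_sizes=[16 + 4 * 4, 64, 1],  # emb_dim + F*F interaction terms
)
```

Each of these choices (embedding dimension, the widths and depths of the bottom and top MLPs) is a knob in the search space explored by the RL agent described next.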

In this project, ESL addresses automatic hyperparameter optimization and architecture search of DLRM through Reinforcement Learning (RL) to minimize human effort. For this purpose, the RL agent requires an optimized implementation of the code so that the impact of different hyperparameters and architectures can be clearly distinguished.
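
As an illustration of how such a search can be driven, the following is a minimal REINFORCE-style sketch in which an agent samples DLRM hyperparameters and is rewarded on accuracy and inference time. The SEARCH_SPACE, the Controller class, the reward weighting, the latency budget, and the train_and_evaluate helper are all hypothetical placeholders for illustration, not the agent used in this project.

```python
# Hypothetical REINFORCE-style search over DLRM hyperparameters.
import torch
import torch.nn as nn

SEARCH_SPACE = {
    "emb_dim":    [8, 16, 32, 64],
    "bottom_mlp": ["64-16", "128-64-16", "256-64-16"],
    "top_mlp":    ["64-1", "128-64-1", "256-128-1"],
}

class Controller(nn.Module):
    """One categorical distribution per hyperparameter (stateless policy)."""
    def __init__(self, space):
        super().__init__()
        self.space = space
        self.logits = nn.ParameterDict(
            {k: nn.Parameter(torch.zeros(len(v))) for k, v in space.items()}
        )

    def sample(self):
        config, log_prob = {}, 0.0
        for k, options in self.space.items():
            dist = torch.distributions.Categorical(logits=self.logits[k])
            idx = dist.sample()
            config[k] = options[idx.item()]
            log_prob = log_prob + dist.log_prob(idx)
        return config, log_prob

def train_and_evaluate(config):
    # Placeholder: in practice, train a DLRM with `config` and measure
    # validation accuracy and per-batch inference latency (ms).
    return 0.88, 8.0

def reward(accuracy, latency_ms, latency_budget_ms=10.0):
    # Reward accuracy, penalise configurations slower than the budget.
    return accuracy - max(0.0, latency_ms - latency_budget_ms) * 0.01

controller = Controller(SEARCH_SPACE)
optim = torch.optim.Adam(controller.parameters(), lr=1e-2)
baseline = 0.0

for step in range(100):
    config, log_prob = controller.sample()
    acc, lat = train_and_evaluate(config)
    r = reward(acc, lat)
    baseline = 0.9 * baseline + 0.1 * r    # moving-average baseline
    loss = -(r - baseline) * log_prob      # REINFORCE update
    optim.zero_grad()
    loss.backward()
    optim.step()
```

Because every sampled configuration must be trained and timed, a fast DLRM implementation is exactly what keeps such a loop tractable on a single GPU.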

Therefore, we present several modifications that optimize dataset reading, the dataloader, and CPU-GPU interaction. Our experimental results show that, owing to this set of modifications, training time decreases by up to 4.38x compared to the original implementation of DLRM. Moreover, our RL-based approach automatically finds DLRM architectures that achieve the same accuracy (i.e., 88.4%) as the original, manually optimized DLRM without any degradation in inference time. Using only one Nvidia V100 GPU and requiring only 6 days for the design space exploration, our work shows a promising approach to save manual tuning effort, time, and overall money spent on the optimization process.

Fig. 2: Overall view of proposed optimizations
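
The sketch below illustrates, under assumptions, the kind of PyTorch input-pipeline optimizations that dataloader and CPU-GPU interaction changes typically involve: parallel worker processes, pinned host memory, and non-blocking host-to-GPU copies. The synthetic dataset, batch size, and worker count are illustrative and do not reproduce the project's actual modifications.

```python
# Illustrative input-pipeline optimizations: parallel loading, pinned memory,
# and asynchronous (non-blocking) host-to-GPU transfers.
import torch
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-in for the dense features and labels of a recommendation dataset.
dataset = TensorDataset(
    torch.randn(10_000, 13),
    torch.randint(0, 2, (10_000, 1)).float(),
)

loader = DataLoader(
    dataset,
    batch_size=2048,
    num_workers=4,     # prepare batches in parallel worker processes
    pin_memory=True,   # page-locked host buffers enable asynchronous GPU copies
)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

for dense, label in loader:
    # non_blocking=True overlaps the copy with computation when memory is pinned.
    dense = dense.to(device, non_blocking=True)
    label = label.to(device, non_blocking=True)
    # ... forward/backward pass would go here ...
```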



Codebase: