| Type | Semester project |
| Split | 30% theory, 60% implementation, 10% experimentation |
| Knowledge | Machine learning, Model predictive control; Programming skills: Python |
| Subjects | Machine Learning, Optimal Control |
| Supervision | Baiyu Peng |
| Published | 12.01.2026 |

Unlike traditional Learning from Demonstration (LfD) methods that directly mimic expert trajectories, Constraint Learning from Demonstrations (CLfD) focuses on identifying from demonstrations the underlying constraints on what is and is not allowed, and then learning a controller that explicitly adheres to the learned constraints.
Previously, we developed the Positive-Unlabeled Constraint Learning (PUCL) algorithm, which recovers the constraint function through a two-step process (a sketch of the second step follows this list):
- Generate potentially unsafe trajectories using a reinforcement learning (RL) policy.
- Identify unsafe states within these trajectories by distinguishing them from the safe expert demonstrations, treating the generated data as unlabeled.
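To make the second step concrete, below is a minimal sketch of how such a positive-unlabeled classifier could look, using a standard non-negative PU risk estimator from the PU-learning literature: expert states serve as positive (safe) examples and states from generated rollouts as unlabeled data. The architecture, the class prior `prior`, and all names (`ConstraintNet`, `nnpu_loss`) are illustrative assumptions, not PUCL's actual implementation.

```python
import torch
import torch.nn as nn

class ConstraintNet(nn.Module):
    """Maps a state to the probability that it is safe (hypothetical architecture)."""
    def __init__(self, state_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1), nn.Sigmoid(),
        )

    def forward(self, states: torch.Tensor) -> torch.Tensor:
        return self.net(states).squeeze(-1)

def nnpu_loss(p_expert: torch.Tensor, p_unlabeled: torch.Tensor,
              prior: float = 0.5) -> torch.Tensor:
    """Non-negative PU risk: expert states are labeled safe (positive);
    generated rollout states are unlabeled. `prior` is the assumed fraction
    of safe states in the unlabeled data (a hyperparameter)."""
    eps = 1e-8
    risk_pos = -torch.log(p_expert + eps).mean()               # expert treated as safe
    risk_unl_as_neg = -torch.log(1 - p_unlabeled + eps).mean() # unlabeled treated as unsafe
    risk_pos_as_neg = -torch.log(1 - p_expert + eps).mean()    # expert treated as unsafe
    # Subtract the prior-weighted positive contribution from the "all unlabeled
    # is unsafe" risk, clamping at zero so the negative risk cannot go negative.
    neg_risk = torch.clamp(risk_unl_as_neg - prior * risk_pos_as_neg, min=0.0)
    return prior * risk_pos + neg_risk
```

Training would alternate this loss with fresh rollouts from the policy; `prior` would have to be tuned or estimated, and a thresholded output of `ConstraintNet` then marks states as unsafe.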
Our current framework employs an RL policy. Despite its flexibility, RL often requires substantial training time and struggles to obtain a feasible and high-performing policy. In contrast, Model Predictive Control (MPC) is naturally suited for constraint handling, requires no training, and is therefore a promising alternative for trajectory generation in constraint learning.
In this project, the student will modify our existing code and implement an MPC policy to replace the RL policy for constraint learning (a rough sketch of such a planner follows). The learned constraints will then be used for constraint-aware motion planning in the original task and transferred to similar tasks. The performance of the MPC-based approach will be evaluated against the RL-based and Dynamical System (DS)-based methods we have studied previously.
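As a rough illustration of where the learned constraint would enter an MPC loop, the random-shooting planner below discards rollouts containing states the classifier flags as unsafe. The dynamics model `f`, stage cost, and threshold are placeholder assumptions, and `safe_prob` stands in for a learned classifier such as the one sketched above; the project's actual planner would likely use a more structured optimizer.

```python
import numpy as np

def shooting_mpc(x0, f, stage_cost, safe_prob, action_dim,
                 horizon=20, n_samples=256, u_low=-1.0, u_high=1.0,
                 threshold=0.5, rng=None):
    """Random-shooting MPC: sample open-loop action sequences, roll out the
    dynamics model f(x, u) -> x_next, discard rollouts with states the
    learned classifier deems unsafe, and return the first action of the
    cheapest surviving rollout (applied in receding-horizon fashion)."""
    rng = np.random.default_rng() if rng is None else rng
    best_cost, best_action = np.inf, np.zeros(action_dim)
    for _ in range(n_samples):
        u_seq = rng.uniform(u_low, u_high, size=(horizon, action_dim))
        x, cost, safe = x0, 0.0, True
        for u in u_seq:
            x = f(x, u)
            if safe_prob(x) < threshold:  # learned constraint check
                safe = False
                break
            cost += stage_cost(x, u)
        if safe and cost < best_cost:
            best_cost, best_action = cost, u_seq[0]
    return best_action  # falls back to a zero action if no safe rollout is found
```

The sampling version is only meant to show how a learned constraint can replace training-time constraint enforcement: safety is checked at planning time on each candidate rollout, so no RL training is needed.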
Expectations
- Proficiency in Python coding and common libraries (e.g., numpy, torch, matplotlib).
- Understanding of optimal control and model predictive control.
- Understanding of classification with neural networks.