Motivation
Modern approaches to robot learning, such as Action Chunking with Transformers (ACT) [1], improve on traditional imitation learning methods and achieve high success rates on complex manipulation tasks. However, they offer no safety guarantees and no ability to react to unexpected obstacles. Naively modifying the robot's path to avoid obstacles can push the policy into states unseen during training, causing task failure or unpredictable behavior. Path-Consistent Safety (PACS) [2] proposes a method to add safety guarantees to a learned policy while keeping the robot within the known state distribution.
Outline
This project aims to implement and test, in simulation, a safe Imitation Learning framework that combines Action Chunking with Safety Filters to handle obstacles. We will validate this approach in NVIDIA Isaac Lab by training a simple behavior cloning policy that outputs "chunks" of actions. Once the baseline policy is trained, it can be extended with a higher-level safety filter that guarantees collision avoidance during task execution; a minimal sketch of the intended execution loop is given below. The implementation of the safety component is open-ended, and the student can explore different strategies.
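To make the intended control flow concrete, here is a minimal sketch of a safety-filtered, chunked execution loop, assuming a Gym-style environment interface. All names (`predict_chunk`, `filter_chunk`, `check_collision`) are illustrative placeholders, not part of the ACT, PACS, or Isaac Lab APIs, and the truncation-based filter is just one possible strategy.

```python
import numpy as np

CHUNK_SIZE = 8  # actions predicted per policy query (hypothetical value)

def predict_chunk(policy, obs):
    """Query the BC policy for a chunk of future actions.

    `policy` stands in for any model mapping an observation to a
    (CHUNK_SIZE, action_dim) array, e.g. an ACT-style transformer.
    """
    return policy(obs)

def filter_chunk(chunk, obstacles, check_collision):
    """Hypothetical safety filter: truncate the chunk at the first
    action whose predicted state would collide with an obstacle."""
    safe = []
    for action in chunk:
        if check_collision(action, obstacles):
            break  # stop before the first unsafe action
        safe.append(action)
    return np.array(safe)

def run_episode(env, policy, obstacles, check_collision, max_steps=200):
    """Execute safety-filtered action chunks until the episode ends."""
    obs = env.reset()
    steps = 0
    while steps < max_steps:
        chunk = filter_chunk(predict_chunk(policy, obs),
                             obstacles, check_collision)
        if len(chunk) == 0:
            break  # no safe action available: halt (or re-plan)
        for action in chunk:
            obs, reward, done, info = env.step(action)
            steps += 1
            if done or steps >= max_steps:
                return obs
    return obs
```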
Milestones
- M1 (Weeks 1–2): Literature review and setup of the NVIDIA Isaac Lab simulation environment.
- M2 (Weeks 3–6): Data generation and behavior cloning. Train a standard RL expert to solve a manipulation task (e.g., Isaac Lab's "open cabinet/drawer") and distill it into a Behavior Cloning (BC) policy with Action Chunking; see the training sketch after this list.
- M3 (Weeks 7–10): Safety Implementation. Implement a safety filter that analyzes the robot’s trajectory to detect potential collisions with obstacles, while preserving task success.
- M4 (Weeks 11–14): Evaluation and comparison with baselines. Evaluate the system in a custom simulation environment with obstacles and compare safety metrics with baseline “unsafe” policies.
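As a rough illustration of what the M2 distillation step could look like, below is a minimal PyTorch sketch of a chunked BC policy and its loss. The MLP architecture, dimensions, and names are placeholder assumptions; in practice an ACT-style transformer would replace the toy network (ACT training uses an L1 reconstruction loss on action chunks, plus a CVAE objective [1] omitted here).

```python
import torch
import torch.nn as nn

class ChunkedBCPolicy(nn.Module):
    """Toy MLP mapping an observation to a flat chunk of actions.
    Architecture and sizes are illustrative placeholders."""
    def __init__(self, obs_dim, action_dim, chunk_size):
        super().__init__()
        self.chunk_size = chunk_size
        self.action_dim = action_dim
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, chunk_size * action_dim),
        )

    def forward(self, obs):
        out = self.net(obs)
        return out.view(-1, self.chunk_size, self.action_dim)

def bc_loss(policy, obs, expert_actions):
    """L1 behavior-cloning loss on whole chunks: `expert_actions`
    holds the next `chunk_size` expert actions per observation."""
    return nn.functional.l1_loss(policy(obs), expert_actions)

# Usage with dummy data (obs_dim=16, 7-DoF arm, chunks of 8):
policy = ChunkedBCPolicy(obs_dim=16, action_dim=7, chunk_size=8)
obs = torch.randn(32, 16)            # batch of observations
expert = torch.randn(32, 8, 7)       # matching expert action chunks
loss = bc_loss(policy, obs, expert)  # scalar training loss
```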
Requirements
We are looking for motivated students with a solid background in Python programming and machine learning. Experience with robotics simulators (Isaac Sim, MuJoCo) or Reinforcement Learning is a plus but not strictly required.
We also welcome students with a strong interest in the theoretical aspects of safe RL and Imitation Learning; the direction of the project is therefore flexible: implementing the key components is required, but the scope can be adapted to focus more on theoretical guarantees depending on the student's preference.
If you are interested, please send an email to [email protected] and [email protected] containing your CV, a one-paragraph statement on your background and fit for the project, and your BS and MS transcripts.
References
[1] Zhao, Tony Z., et al. “Learning fine-grained bimanual manipulation with low-cost hardware.” arXiv preprint arXiv:2304.13705 (2023).
[2] Römer, Ralf, et al. “From Demonstrations to Safe Deployment: Path-Consistent Safety Filtering for Diffusion Policies.” arXiv preprint arXiv:2511.06385 (2025).
