Joint Human Pose Estimation and Stereo 3D Localization

Wenlong Deng, Lorenzo Bertoni, Sven Kreiss, Alexandre Alahi

We present an end-to-end trainable neural network architecture for stereo imaging that jointly locates humans and estimates their body poses in 3D. Our method defines a 2D pose for each human in a stereo pair of images and uses a correlation layer with a composite field to associate each left-right pair of joints. In the absence of a stereo pose dataset, we show that our method can be trained on synthetic data only and tested on real-world images (i.e., our training stage is domain invariant). Our method is particularly suitable for autonomous vehicles. We achieve state-of-the-art results for the 3D localization task on the challenging real-world KITTI dataset while running four times faster.

W. Deng; L. Bertoni; S. Kreiss; A. Alahi 

2020-06-01. International Conference on Robotics and Automation (ICRA), Paris, France (virtual conference), May 31 - June 4, 2020. p. 2324-2330. DOI: 10.1109/ICRA40945.2020.9197069.