Joint Human Pose Estimation and Stereo 3D Localization

Wenlong Deng, Lorenzo Bertoni, Sven Kreiss, Alexandre Alahi

We present an end-to-end trainable neural network architecture for stereo imaging that jointly locates humans and estimates their body poses in 3D. Our method defines a 2D pose for each human in a stereo pair of images and uses a correlation layer with a composite field to associate each left-right pair of joints. In the absence of a stereo pose dataset, we show that we can train our method with synthetic data only and test it on real-world images (i.e., our training stage is domain invariant). Our method is particularly suitable for autonomous vehicles. We achieve state-of-the-art results for the 3D localization task on the challenging real-world KITTI dataset while running four times faster.
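To illustrate why associating each left-right pair of joints is enough to localize a person in 3D, here is a minimal sketch of standard rectified-stereo triangulation. It is not the network described above: the function name `triangulate_joints` and all camera parameters are hypothetical, and it assumes the joints have already been matched across a rectified image pair.

```python
import numpy as np


def triangulate_joints(left_xy, right_xy, focal_px, baseline_m, cx, cy):
    """Back-project matched joints from a rectified stereo pair to 3D.

    left_xy, right_xy: (N, 2) pixel coordinates of the same joints in the
    left and right images. Returns an (N, 3) array in left-camera
    coordinates (metres); joints with non-positive disparity become NaN.
    All parameter names are illustrative assumptions.
    """
    left_xy = np.asarray(left_xy, dtype=float)
    right_xy = np.asarray(right_xy, dtype=float)

    # In a rectified pair, matched points differ only in x (the disparity).
    disparity = left_xy[:, 0] - right_xy[:, 0]

    xyz = np.full((len(left_xy), 3), np.nan)
    valid = disparity > 0

    # Depth from disparity: Z = f * B / d.
    z = focal_px * baseline_m / disparity[valid]

    # Pinhole back-projection of the left-image pixel at depth Z.
    xyz[valid, 0] = (left_xy[valid, 0] - cx) * z / focal_px
    xyz[valid, 1] = (left_xy[valid, 1] - cy) * z / focal_px
    xyz[valid, 2] = z
    return xyz


# Illustrative values only (not taken from the paper or from KITTI).
joints_3d = triangulate_joints(
    left_xy=[[650, 200], [400, 100]],
    right_xy=[[615, 200], [400, 100]],
    focal_px=700.0, baseline_m=0.5, cx=600.0, cy=180.0,
)
```

With these made-up numbers, the first joint has a disparity of 35 pixels and lands at a depth of 10 m, while the second has zero disparity and is returned as NaN.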

Joint Human Pose Estimation and Stereo 3D Localization

W. Deng; L. Bertoni; S. Kreiss; A. Alahi

2020-06-01. International Conference on Robotics and Automation (ICRA), Paris, France (virtual conference), May 31 to June 4, 2020. p. 2324-2330. DOI: 10.1109/ICRA40945.2020.9197069.
