3D Robust Part-based Object Tracking

Tracking occluded, cluttered objects with parts

Many Robotics and Augmented Reality applications require accurate estimates of the 3D poses of poorly textured, highly occluded objects.

In particular, we are interested in scenes with poorly textured objects that may be visible only under heavy occlusion, drastic lighting changes, and changing backgrounds. A depth sensor is not an option in our setup, as the target objects often have specular surfaces. Feature point-based methods also fail because of the lack of texture. These conditions are typical of many Augmented Reality applications.

We therefore build a 3D object tracking framework based on the detection and pose estimation of discriminative parts of the target object. Compared to previous work that relies on parts for object detection, our main contribution is a powerful representation of the 3D pose of each part: we propose to represent this pose by the 2D reprojections of a small set of 3D control points.

The control points are only “virtual”, in the sense that they do not have to correspond to specific image features. This representation is invariant to the part’s image location and depends only on its appearance. We show that a Convolutional Neural Network (CNN) can predict the locations of these reprojections very accurately, and can also be used to predict the uncertainty of these location estimates.
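As a rough illustration of this representation, the sketch below projects a small set of 3D control points into the image with a pinhole camera model. The control-point layout (the part center plus one point along each axis direction), the function names, and the choice to express the reprojections relative to the projected part center are all assumptions made for this example, not details confirmed on this page.

```python
import numpy as np

def control_points(center, half_size):
    """Hypothetical layout: the part center plus one point along +/- each axis (7 points)."""
    center = np.asarray(center, dtype=float)
    offsets = half_size * np.vstack([np.zeros(3), np.eye(3), -np.eye(3)])
    return center + offsets

def project(points_3d, K, R, t):
    """Pinhole projection of (N, 3) object-frame points to (N, 2) pixel coordinates."""
    cam = points_3d @ R.T + t          # object frame -> camera frame
    uvw = cam @ K.T                    # apply camera intrinsics
    return uvw[:, :2] / uvw[:, 2:3]    # perspective division

def pose_representation(points_3d, K, R, t):
    """Reprojections relative to the projected part center (assumed convention).

    The first control point is taken to be the part center, so its entry is (0, 0).
    """
    uv = project(points_3d, K, R, t)
    return uv - uv[0]
```

In this sketch the 2D offsets are what a regressor would be trained to predict from the image patch around a detected part; adding the detected patch location back recovers the pixel coordinates of the reprojections.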

After detecting the parts in the image and predicting their pose representations, we can easily combine the poses and obtain a robust 3D pose for the full target object.
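One simple way to picture the combination step is to pool the 2D-3D correspondences of all visible parts and solve a single pose problem. The sketch below does this with a plain Direct Linear Transform (DLT) followed by a projection onto the nearest rotation. The DLT is a stand-in chosen for brevity, not necessarily the solver used in this work; the function name and input layout are hypothetical, and the predicted uncertainties are ignored here.

```python
import numpy as np

def pose_from_parts(parts, K):
    """Estimate (R, t) from pooled control-point correspondences.

    parts: list of (points_3d, points_2d) pairs, one per visible part, with
           shapes (N, 3) and (N, 2). At least 6 non-coplanar points in total
           are assumed, as required by the DLT.
    """
    X = np.vstack([p3 for p3, _ in parts])
    uv = np.vstack([p2 for _, p2 in parts])

    # Remove the intrinsics: pixel coordinates -> normalized image coordinates.
    uv1 = np.hstack([uv, np.ones((len(uv), 1))])
    xn = uv1 @ np.linalg.inv(K).T
    xn = xn[:, :2] / xn[:, 2:3]

    # Two linear equations per correspondence in the 12 entries of P = s[R | t].
    Xh = np.hstack([X, np.ones((len(X), 1))])
    zeros = np.zeros_like(Xh)
    A = np.vstack([
        np.hstack([Xh, zeros, -xn[:, :1] * Xh]),
        np.hstack([zeros, Xh, -xn[:, 1:2] * Xh]),
    ])
    _, _, Vt = np.linalg.svd(A)
    P = Vt[-1].reshape(3, 4)           # null vector of A, defined up to scale and sign

    # Fix the sign so the left 3x3 block has positive determinant.
    if np.linalg.det(P[:, :3]) < 0:
        P = -P

    # Project the 3x3 block onto the nearest rotation and undo the scale.
    U, S, Vt_m = np.linalg.svd(P[:, :3])
    R = U @ Vt_m
    t = P[:, 3] / S.mean()
    return R, t
```

With noisy predictions, a practical system would more likely weight each correspondence by its predicted uncertainty and refine the pose with a nonlinear or robust solver; the linear solve above only shows how correspondences from several parts feed one pose estimate.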

Our approach has several advantages:

• we do not need to assume the parts are planar, as was done in some previous work;

• we can predict the 3D pose of the object even when only one part is visible;

• when several parts are visible, we can easily combine them to compute a better pose of the object;

• the 3D pose we obtain is usually very accurate, even when only a few parts (or a single one) are visible.

Extended Demo: Real-time tracking demo. We combined the 3D Part-based Tracker with ORB-SLAM for more stable results. The system is now able to keep track of the target object, even when it is completely occluded!


Our dataset is available for download at this webpage.


A Novel Representation of Parts for Accurate 3D Object Detection and Tracking in Monocular Images

A. Crivellaro; M. Rad; Y. Verdie; K. M. Yi; P. Fua et al. 

2015. International Conference on Computer Vision (ICCV), Santiago, Chile, December 13-16, 2015. p. 4391–4399. DOI: 10.1109/ICCV.2015.499.


This project is supported in part by the EDUSAFE and the MAGELLAN European projects.