3D Tracking Dataset ‒ CVLAB ‐ EPFL

Figure: a frame from a multiview human tracking dataset.

Tracking is a classical computer vision task. Recent advances in deep learning have shifted the field toward a tracking-by-detection paradigm and improved performance. Today, thanks to the availability of data, the field is moving toward challenging crowded monocular scenarios. Multiview models shine in such crowded scenes; however, progress is limited by the lack of large-scale multiview datasets.

In this project, we aim to remove this limitation by providing a new multiview dataset for tracking. In particular, we target 3D tracking and long-term tracking of people. The first step toward that goal is to build a multi-camera dataset suitable for those tasks. Initially, the student will design a multi-view annotation tool that leverages camera calibration to minimize annotation cost. Once this is completed, the student will participate in the data collection process. If time allows, the project will end with the design of a baseline deep learning architecture that tackles the 3D tracking task in an end-to-end manner.
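
To illustrate how camera calibration could reduce annotation cost, here is a minimal sketch, not the project's actual tool. It assumes a standard pinhole camera model with known intrinsics K and extrinsics (R, t) for each camera. A 3D point annotated once is then projected into every view, so the annotator does not need to click in each image separately. The function name, camera names, and calibration values are all hypothetical.

```python
import numpy as np


def project_point(K, R, t, X):
    """Project a 3D world point X (shape (3,)) into pixel coordinates.

    Assumes a pinhole camera without lens distortion (an assumption made
    for this sketch). Returns None if the point lies behind the camera.
    """
    X_cam = R @ X + t          # world frame -> camera frame
    if X_cam[2] <= 0:
        return None            # point is behind the image plane
    uvw = K @ X_cam            # camera frame -> homogeneous pixel coordinates
    return uvw[:2] / uvw[2]    # perspective division


# Hypothetical calibration: two identical cameras, the second shifted along x.
K = np.array([[1000.0, 0.0, 960.0],
              [0.0, 1000.0, 540.0],
              [0.0, 0.0, 1.0]])
cameras = {
    "cam_a": (K, np.eye(3), np.array([0.0, 0.0, 0.0])),
    "cam_b": (K, np.eye(3), np.array([-1.0, 0.0, 0.0])),
}

# A single 3D annotation (in meters) is propagated to all views.
person_position = np.array([0.5, 0.0, 5.0])
pixels = {name: project_point(*cam, person_position)
          for name, cam in cameras.items()}
```

With these made-up values, the same 3D point lands at a different horizontal pixel position in each view. This is the property a calibrated annotation tool would rely on to pre-place labels across cameras.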

Prerequisites

The candidate should have Python programming experience. Previous experience with deep learning and PyTorch is a plus. Knowledge of computer vision and camera models is recommended.

Contact

Martin Engilberge – [email protected]