

Tracking is a classical computer vision task. Recent advances in deep learning have shifted the field toward a tracking-by-detection paradigm and boosted performance, mostly thanks to the availability of large monocular datasets such as those from MOTChallenge. Multiview tracking, on the other hand, lags behind due to the limited availability of data and the high cost of capturing and annotating new datasets.
In this project we aim to remove this limitation by generating a synthetic dataset instead. The goal is to procedurally generate a multiview dataset using a game engine and existing assets. The output should consist of video sequences, camera calibration parameters, and ground-truth annotations for each point of view.
Ideally, the synthetic dataset will be built on top of data from www.swisstopo.admin.ch so that it is grounded in the real world and allows future study of transfer learning between synthetic and real data.
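To make the expected deliverables concrete, here is a minimal sketch of how per-view calibration and ground truth could be exported. The file layout, file names, and JSON schema are illustrative assumptions for this project, not a fixed specification.

```python
# Hypothetical per-view export: one folder per camera, holding its calibration
# parameters and ground-truth tracking annotations. The schema is an assumption.
import json
from pathlib import Path


def export_view(out_dir, view_id, K, R, t, tracks):
    """Write calibration and ground-truth annotations for one camera view.

    K      : 3x3 intrinsic matrix (nested lists)
    R, t   : rotation (3x3) and translation (3,) of the world-to-camera extrinsics
    tracks : list of per-frame annotations, e.g.
             {"frame": 0, "track_id": 7, "bbox": [x, y, w, h]}
    """
    view_dir = Path(out_dir) / f"view_{view_id:02d}"
    view_dir.mkdir(parents=True, exist_ok=True)

    # Calibration parameters for this point of view.
    with open(view_dir / "calibration.json", "w") as f:
        json.dump({"K": K, "R": R, "t": t}, f, indent=2)

    # Ground-truth tracking annotations for this point of view.
    with open(view_dir / "groundtruth.json", "w") as f:
        json.dump(tracks, f, indent=2)


# Example: one camera at the world origin with a single annotated track.
export_view(
    "synthetic_dataset", 0,
    K=[[1000, 0, 960], [0, 1000, 540], [0, 0, 1]],
    R=[[1, 0, 0], [0, 1, 0], [0, 0, 1]],
    t=[0, 0, 0],
    tracks=[{"frame": 0, "track_id": 1, "bbox": [100, 200, 50, 120]}],
)
```

The rendered video sequence for each view would sit alongside these files, so every point of view is self-contained.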
Prerequisites
The candidate should have Python programming experience.
Previous experience with Unity or another rendering engine is a plus. Knowledge of computer vision and camera models is recommended.
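The camera model in question is the standard pinhole model. The short NumPy sketch below shows how a ground-truth 3D position could be projected into a given view from its calibration; the calibration values are made-up placeholders, not taken from any actual dataset.

```python
# Minimal pinhole-projection sketch (NumPy). Calibration values are placeholders.
import numpy as np


def project(point_world, K, R, t):
    """Project a 3D world point into pixel coordinates: x ~ K [R | t] X."""
    p_cam = R @ point_world + t   # world -> camera coordinates
    p_img = K @ p_cam             # camera coordinates -> homogeneous image point
    return p_img[:2] / p_img[2]   # perspective division -> pixel coordinates


K = np.array([[1000.0, 0.0, 960.0],
              [0.0, 1000.0, 540.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)     # identity rotation: camera aligned with world axes
t = np.zeros(3)   # camera at the world origin

print(project(np.array([0.5, 0.2, 5.0]), K, R, t))  # -> approx [1060., 580.]
```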