EPFL-RLC Multi-Camera Dataset

The EPFL-RLC dataset was recorded in the EPFL Rolex Learning Center using three static HD cameras. Unlike most existing multi-camera datasets, the cameras’ fields of view overlap. Each camera has a resolution of 1920×1080 pixels, and the footage was acquired at 60 frames per second.

Figure: views from Camera 1, Camera 2, and Camera 3.

The video sequences available for download are synchronized across the views, and each sequence contains 8K post-processed frames at a resolution of 480×270 pixels.
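The acquisition resolution and the released frames differ by a factor of 4 along both axes (1920/480 = 1080/270 = 4). The page does not state which resolution the calibration and positions files refer to, so the following is only a sketch under the assumption that some pixel coordinates are given at full resolution and need to be mapped onto the 480×270 frames. The function name `scale_rect` and the `(x_min, y_min, x_max, y_max)` layout are hypothetical and not part of the dataset.

```python
FULL_RES = (1920, 1080)   # acquisition resolution (width, height)
SMALL_RES = (480, 270)    # resolution of the downloadable frames


def scale_rect(rect, src=FULL_RES, dst=SMALL_RES):
    """Rescale an (x_min, y_min, x_max, y_max) rectangle from src to dst resolution.

    Hypothetical helper: the rectangle layout is an assumption, not the
    dataset's documented format.
    """
    sx = dst[0] / src[0]  # horizontal scale factor (0.25 here)
    sy = dst[1] / src[1]  # vertical scale factor (0.25 here)
    x_min, y_min, x_max, y_max = rect
    return (x_min * sx, y_min * sy, x_max * sx, y_max * sy)


# A full-resolution rectangle mapped onto the downscaled frame.
print(scale_rect((400, 200, 800, 1000)))  # (100.0, 50.0, 200.0, 250.0)
```

If the files already use 480×270 coordinates, no rescaling is needed; checking that rectangles stay within the frame bounds is a quick way to tell which case applies.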

The cameras are calibrated using Tsai’s calibration technique [Tsai86]. The calibration files are included in the dataset.

Positions file

The ground of the intersecting area is discretized as a regular grid of points. The 3D space occupied by a person standing at a particular position is modelled by a cylinder centered on the grid point. Each cylinder projects into the 2D views as a rectangle.
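The cylinder-to-rectangle projection can be illustrated with a minimal sketch. It assumes a plain pinhole camera described by a 3×4 projection matrix, whereas Tsai's model additionally includes radial lens distortion, which is ignored here. The cylinder radius and height, the camera parameters, and all function names are hypothetical values chosen for illustration and are not taken from the dataset.

```python
import numpy as np


def project_points(P, pts3d):
    """Project Nx3 world points with a 3x4 projection matrix P; returns Nx2 pixels."""
    homog = np.hstack([pts3d, np.ones((len(pts3d), 1))])
    uvw = homog @ P.T
    return uvw[:, :2] / uvw[:, 2:3]


def cylinder_to_rect(P, center_xy, radius=0.3, height=1.8, n=16):
    """Bounding rectangle (x_min, y_min, x_max, y_max) of a projected cylinder.

    The cylinder stands on the ground plane z = 0, centered on center_xy.
    Radius and height (in metres) are assumed values.
    """
    angles = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    ring = np.stack([center_xy[0] + radius * np.cos(angles),
                     center_xy[1] + radius * np.sin(angles)], axis=1)
    bottom = np.hstack([ring, np.zeros((n, 1))])      # circle on the ground
    top = np.hstack([ring, np.full((n, 1), height)])  # circle at head height
    uv = project_points(P, np.vstack([bottom, top]))
    x_min, y_min = uv.min(axis=0)
    x_max, y_max = uv.max(axis=0)
    return float(x_min), float(y_min), float(x_max), float(y_max)


# Hypothetical camera: 1.5 m above the ground, 5 m behind the origin,
# looking along the world +Y axis with the world Z axis pointing up.
K = np.array([[1000.0, 0.0, 960.0],
              [0.0, 1000.0, 540.0],
              [0.0, 0.0, 1.0]])
R = np.array([[1.0, 0.0, 0.0],
              [0.0, 0.0, -1.0],
              [0.0, 1.0, 0.0]])
C = np.array([0.0, -5.0, 1.5])
t = -R @ C
P_demo = K @ np.hstack([R, t[:, None]])

rect = cylinder_to_rect(P_demo, (0.0, 0.0))
print(rect)
```

Taking the axis-aligned bounding box of sampled points on the top and bottom circles is one simple way to obtain a rectangle; the dataset's own projections may have been computed differently.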

Using a 55×45 grid and the provided calibration files of the cameras, we generate a file containing each position’s projections into the three views. Each position is assigned an ID.
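A 55×45 grid yields 55 × 45 = 2475 positions. The page does not describe how IDs are assigned to grid cells, so the sketch below assumes, purely for illustration, zero-based row-major IDs with 55 cells along the first axis. The names `position_id` and `grid_coords` are hypothetical, and the actual numbering should be read from the positions file itself.

```python
GRID_W, GRID_H = 55, 45          # assumption: 55 cells along x, 45 along y
NUM_POSITIONS = GRID_W * GRID_H  # 2475 grid positions in total


def position_id(ix, iy):
    """Map grid indices to an ID (assumed row-major, zero-based)."""
    if not (0 <= ix < GRID_W and 0 <= iy < GRID_H):
        raise ValueError("grid indices out of range")
    return iy * GRID_W + ix


def grid_coords(pid):
    """Inverse mapping from an ID back to (ix, iy) under the same assumption."""
    if not 0 <= pid < NUM_POSITIONS:
        raise ValueError("position ID out of range")
    return pid % GRID_W, pid // GRID_W


print(NUM_POSITIONS)        # 2475
print(position_id(10, 20))  # 1110
print(grid_coords(1110))    # (10, 20)
```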

Annotations

The annotations are a set of multi-view examples extracted from the first frames. The frames are not fully annotated; instead, only multi-view examples of detections visible in all of the views are annotated. The set consists of 4088 manually annotated multi-view examples, balanced across two classes: occupied position (a positive detection) and free position (a negative detection). Each multi-view annotation also records the ID of the position it originates from, as defined in the positions file described above.

Figure: a positive multi-view example and a negative multi-view example.

A negative multi-view example does not necessarily imply that none of its views contains a pedestrian. Each negative multi-view example is therefore additionally annotated with whether it contains a person. This information can be exploited if the dataset is used for monocular detection.
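One possible way to use this extra flag for monocular training is sketched below. The on-disk annotation format is not described here, so the `MultiViewExample` record, its field names, and the `monocular_labels` helper are all hypothetical. The sketch assumes a single person flag per negative example and takes the conservative route of discarding negatives that contain a person, since it is not known in which view the person appears.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class MultiViewExample:
    """Hypothetical in-memory record; field names are assumptions."""
    position_id: int       # ID from the positions file
    occupied: bool         # True: positive (occupied), False: negative (free)
    contains_person: bool  # for negatives: a person appears in the example


def monocular_labels(examples: List[MultiViewExample],
                     num_views: int = 3) -> List[Tuple[int, int, int]]:
    """Return (position_id, view_index, label) triples for monocular training.

    Positives are visible in all views, so every view gets label 1.
    Negatives without any person give label 0 in every view.
    Negatives that contain a person are skipped as ambiguous.
    """
    labels = []
    for ex in examples:
        if ex.occupied:
            label = 1
        elif not ex.contains_person:
            label = 0
        else:
            continue  # ambiguous for a single-view detector
        labels.extend((ex.position_id, view, label) for view in range(num_views))
    return labels


demo = [
    MultiViewExample(position_id=7, occupied=True, contains_person=True),
    MultiViewExample(position_id=8, occupied=False, contains_person=False),
    MultiViewExample(position_id=9, occupied=False, contains_person=True),
]
print(monocular_labels(demo))
```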


[Tsai86] R.Y. Tsai, An Efficient and Accurate Camera Calibration Technique for 3D Machine Vision, Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, Miami Beach, FL, pp. 364-374, 1986.


Deep Multi-Camera People Detection

T. Chavdarova; F. Fleuret 

2017. Proceedings of the IEEE International Conference on Machine Learning and Applications.