The EPFL-RLC dataset was recorded in the EPFL Rolex Learning Center using three static HD cameras. Unlike most existing multi-camera datasets, the cameras’ fields of view overlap. Each camera has a resolution of 1920×1080 pixels, and the sequences were acquired at a frame rate of 60 frames per second.
[Images: Camera 1, Camera 2, Camera 3]
The video sequences available for download are synchronized across the views, and each sequence contains 8K post-processed frames at a resolution of 480×270.
The cameras are calibrated using the Tsai calibration [Tsai86]. The calibration files are included in the dataset.
The ground of the intersecting area is discretized as a regular grid of points. The 3D space occupied when a person stands at a particular position is modelled by a cylinder centred on the grid point. Each cylinder projects into the 2D views as a rectangle.
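The cylinder-to-rectangle projection can be illustrated with a minimal sketch. It assumes a plain 3×4 pinhole projection matrix `P`, a ground plane at z = 0, and hypothetical helper names (`project_point`, `cylinder_to_rectangle`); the Tsai model used for the dataset also includes radial lens distortion, which is ignored here, so this is not the procedure used to generate the provided files.

```python
import math

def project_point(P, X):
    """Project a 3D point X = (x, y, z) with a 3x4 matrix P given as a list of rows."""
    x, y, z = X
    u = P[0][0] * x + P[0][1] * y + P[0][2] * z + P[0][3]
    v = P[1][0] * x + P[1][1] * y + P[1][2] * z + P[1][3]
    w = P[2][0] * x + P[2][1] * y + P[2][2] * z + P[2][3]
    return (u / w, v / w)

def cylinder_to_rectangle(P, cx, cy, radius, height, n_samples=16):
    """Return (u_min, v_min, u_max, v_max) bounding the projected cylinder.

    The cylinder stands on the ground plane z = 0, centred on (cx, cy).
    Points are sampled on its bottom and top circles, projected, and
    enclosed in an axis-aligned rectangle.
    """
    pts = []
    for k in range(n_samples):
        a = 2.0 * math.pi * k / n_samples
        x = cx + radius * math.cos(a)
        y = cy + radius * math.sin(a)
        pts.append(project_point(P, (x, y, 0.0)))     # bottom circle
        pts.append(project_point(P, (x, y, height)))  # top circle
    us = [p[0] for p in pts]
    vs = [p[1] for p in pts]
    return (min(us), min(vs), max(us), max(vs))

# Toy camera (an assumption, not a dataset calibration): located at the
# world origin, looking along +y, focal length 100, image v pointing down.
P_toy = [[100, 0, 0, 0],
         [0, 0, -100, 0],
         [0, 1, 0, 0]]
rect = cylinder_to_rectangle(P_toy, cx=0.0, cy=10.0, radius=0.5, height=2.0)
```

Sampling the two circles and taking the extremes is a simple approximation; a denser sampling tightens the rectangle at the cost of more projections.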
Using a 55×45 grid and the provided calibration files of the cameras, we generate a file containing each position’s projections into the three views. Each position is assigned an ID.
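As an illustration of how grid positions can be mapped to IDs, the sketch below uses a row-major numbering. The ordering, the choice of 55 as the width, and the function names are assumptions made for the example; the actual ID convention is whatever the provided positions file defines.

```python
GRID_W, GRID_H = 55, 45  # grid size from the text; which axis is which is an assumption

def position_id(ix, iy, width=GRID_W):
    """Row-major ID of the grid point in column ix, row iy (hypothetical convention)."""
    return iy * width + ix

def grid_coords(pid, width=GRID_W):
    """Inverse of position_id: return (ix, iy) for a given ID."""
    return (pid % width, pid // width)

n_positions = GRID_W * GRID_H  # 2475 grid points in total
```

With this convention the IDs run from 0 to 2474, and `grid_coords(position_id(ix, iy))` returns `(ix, iy)` again.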
The annotations are a set of multi-view examples extracted from the first frames. Note that the frames are not fully annotated; instead, only multi-view examples of detections visible in all of the views are annotated. The set consists of 4088 manually annotated multi-view examples, balanced between two classes: occupied position (a positive detection) and free position (a negative detection). Each multi-view annotation also contains the ID of the position it originates from, according to the positions file described above.
[Images: positive multi-view example, negative multi-view example]
Because a negative multi-view example does not necessarily imply that none of its views contains a pedestrian, each negative multi-view example carries an additional annotation indicating whether it contains a person. This information can be exploited if the dataset is used for monocular detection.
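One possible way to use this extra information for monocular training is sketched below. The record layout is hypothetical: it assumes the person flag is available per view, and the names `MultiViewExample`, `person_in_view`, and `monocular_samples` are invented for the example rather than taken from the dataset's annotation format.

```python
from typing import List, NamedTuple, Tuple

class MultiViewExample(NamedTuple):
    position_id: int                  # ID of the grid position
    occupied: bool                    # True = positive multi-view example
    person_in_view: Tuple[bool, ...]  # hypothetical per-view "person visible" flag

def monocular_samples(examples: List[MultiViewExample]) -> List[Tuple[int, int, int]]:
    """Return (position_id, view_index, label) triples for single-view training.

    Views of positive examples get label 1. Views of negative examples get
    label 0 only when no person is visible; views of free positions that
    still show a person are skipped as ambiguous.
    """
    samples = []
    for ex in examples:
        for view, has_person in enumerate(ex.person_in_view):
            if ex.occupied:
                label = 1
            elif has_person:
                continue  # free position, but a person appears in this view
            else:
                label = 0
            samples.append((ex.position_id, view, label))
    return samples

examples = [
    MultiViewExample(7, True, (True, True, True)),
    MultiViewExample(12, False, (False, True, False)),
]
samples = monocular_samples(examples)
```

Skipping the ambiguous views is only one design choice; they could equally be kept as hard negatives or relabelled, depending on the detector being trained.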
[Tsai86] R.Y. Tsai, An Efficient and Accurate Camera Calibration Technique for 3D Machine Vision, Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, Miami Beach, FL, pp. 364-374, 1986.