A Handwritten French Dataset for Word Spotting – CFRAMUZ

CFRAMUZ is the first historical dataset in French for segmentation-free word spotting, presented at the Historical Image Processing 2017 workshop in Kyoto.

The dataset consists of seven novels in French by the celebrated Lausannois writer Charles-Ferdinand Ramuz, with 18,000 annotated words. The novels span the writer’s whole working life (1910-1946) and show significant changes in handwriting style.

Download Links

The complete ground truth of the dataset is available for download here. We also provide an annotation tool here.


The research paper, presented at the Historical Image Processing workshop at ICDAR 2017, is available on the EPFL paper repository Infoscience here.