This dataset was built in the context of the Garzoni FNS-ANR project and consists of the manual annotations of ca. 54’000 apprenticeship contracts originating from the Accordi dei Garzoni, a document series from the Venetian State Archives.
More information soon!
From Documents to Structured Data: First Milestones of the Garzoni Project
Led by an interdisciplinary consortium, the Garzoni project undertakes the study of apprenticeship, work and society in early modern Venice by focusing on a specific archival source, namely the Accordi dei Garzoni from the Venetian State Archives. The project revolves around two main phases with, in the first instance, the design and the development of tools to extract and render information contained in the documents (according to Semantic Web standards) and, as a second step, the examination of such information. This paper outlines the main progress and achievements during the first year of the project.DHCommons. 2016. num. 2.
A Method for Record Linkage with Sparse Historical Data
Massive digitization of archival material, coupled with automatic document processing techniques and data visualisation tools offers great opportunities for reconstructing and exploring the past. Unprecedented wealth of historical data (e.g. names of persons, places, transaction records) can indeed be gathered through the transcription and annotation of digitized documents and thereby foster large-scale studies of past societies. Yet, the transformation of hand-written documents into well-represented, structured and connected data is not straightforward and requires several processing steps. In this regard, a key issue is entity record linkage, a process aiming at linking different mentions in texts which refer to the same entity. Also known as entity disambiguation, record linkage is essential in that it allows to identify genuine individuals, to aggregate multi-source information about single entities, and to reconstruct networks across documents and document series. In this paper we present an approach to automatically identify coreferential entity mentions of type Person in a data set derived from Venetian apprenticeship contracts from the early modern period (16th-18th c.). Taking advantage of a manually annotated sub-part of the document series, we compute distances between pairs of mentions, combining various similarity measures based on (sparse) context information and person attributes.2016. Digital Humanities Conference 2016, Krakow, Poland, July 11-16, 2016.