The World as a Canvas: Multimodal Realignment of Historical Map Archives
Type : MSc thesis, MSc Semester project
Sections concerned : Digital humanities, Data Science, Computer Science
Over the last few years, national libraries and universities worldwide have digitized over 2 million historical maps. These images contain information on historical geography, but also about how the world was historically perceived and conceived. They thus represent instrumental data for geography, economic history, climate science, and the history of knowledge.
To fully exploit their potential as repositories of spatial knowledge, maps need to be realigned on contemporary geographic coordinates, i.e. georeferenced. Realignment makes it possible to compare the features depicted on the map with other maps or current geodata, and detect landscape changes, e.g., in forest coverage, or urban sprawl.
Whereas realignment, or georeferencing, traditionally relies on the manual annotation of historically persistent, spatially reliable homologous points (or ground control points), the sheer size of digitized map collections makes this approach impractical. Preliminary work has shown the potential of using place names, street intersections, or visual keypoints to realign maps in constrained settings. Taken on its own, however, none of these approaches is sufficient to realign maps at scale, owing to the diversity of cartographic corpora and of the geographic spaces they depict.
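To make the notion of ground control points concrete, here is a minimal sketch of the classical manual approach: a least-squares affine transform fitted from pixel coordinates to geographic coordinates. It is only an illustration, not the method the project will develop. The function names and the control points are hypothetical, and real georeferencing tools often use more flexible transforms than a plain affine one.

```python
def solve3(m, v):
    """Solve the 3x3 linear system m * p = v (Gaussian elimination, partial pivoting)."""
    n = 3
    a = [row[:] + [rhs] for row, rhs in zip(m, v)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("control points are collinear or too few")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= factor * a[col][c]
    p = [0.0] * n
    for r in reversed(range(n)):
        s = sum(a[r][c] * p[c] for c in range(r + 1, n))
        p[r] = (a[r][n] - s) / a[r][r]
    return p


def fit_affine(pixel_pts, geo_pts):
    """Least-squares fit of X = a*x + b*y + c and Y = d*x + e*y + f.

    Returns [[a, b, c], [d, e, f]]. Needs at least 3 non-collinear points.
    """
    rows = [(x, y, 1.0) for x, y in pixel_pts]
    # Normal equations: (A^T A) p = A^T u, solved once per output coordinate.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    params = []
    for k in range(2):
        atu = [sum(r[i] * g[k] for r, g in zip(rows, geo_pts)) for i in range(3)]
        params.append(solve3(ata, atu))
    return params


def apply_affine(params, pt):
    """Map a pixel coordinate to a geographic coordinate."""
    x, y = pt
    return tuple(a * x + b * y + c for a, b, c in params)


# Hypothetical control points: (column, row) in the scan -> (easting, northing).
pixels = [(0, 0), (100, 0), (0, 100), (100, 100)]
coords = [(500, 1000), (700, 1000), (500, 800), (700, 800)]
transform = fit_affine(pixels, coords)
print(apply_affine(transform, (50, 50)))
```

With more than three control points the fit averages out annotation errors. The normal equations used here can become ill-conditioned for large coordinate values, so a production tool would typically centre the coordinates first or rely on a numerical library.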
This student project aims to overcome this obstacle by integrating textual cues with contextually rich latent image representations. It will build on a unique domain-specific vision foundation model (VFM) designed by our group in collaboration with the Institute of Cartography at ETH Zurich. As such, the project will exploit complex geographic-semantic information extracted from map images and match it to contemporary reference geodata, using descriptive keypoints and graph neural networks.
Adopting a multimodal, holistic and generic perspective to the spatial realignment problem, this research project will mobilize varied skills, like computer vision and graph matching approaches, to address a key issue, with significant applications in spatial and geographic research.
Keywords: GIS, Computer vision, History of cartography
Contact : [email protected]

The Origins of Venice
Type : MSc Semester project
Sections concerned : Digital humanities, Data science, Computer science, Environmental Sciences & Engineering
The origins of Venetian settlement are still partially unknown. A small segment of historiography has tried to reconstruct the cartography and urban structure of Venice in 1100 and 1360.
The aim of the project is to create, for the first time and based on these historiographical sources processed through OCR, the digital urban atlas of Venice for the years 1100 and 1350, and to use it as a source for analyzing owners and land uses.
The final goal is to produce an interactive cartography (e.g. Leaflet) covering the 12th and 14th centuries. Further statistical analyses of ownership, land use, and family names are also part of the project.
The resulting dataset will be shared online in the Parcels of Venice interface and analyses have the potential to be published in a scientific journal.
Keywords: Venice Time Machine, Digital Urban History, OCR
Prerequisites: Python programming, basics of JavaScript (optional)
Contact : [email protected]

The space of Victorian London
Type : MSc thesis, MSc Semester project
Sections concerned : Digital humanities, Urban Systems, Data Science, UNIL (Geography, HEC, SSP, Humanités numériques)
At the end of the 19th century, London is the capital of the largest empire in human history: a huge metropolis, a global administrative centre, and an industrial powerhouse. Working-class people sharing crowded tenements live a few streets away from the wealthy bourgeoisie and the aristocracy. At that time, the city is arguably one of the first to face the challenges of a modern metropolis, including transportation, urban sprawl, air pollution, social inequalities, crime, and insalubrity. While the ground was being dug up to make room for the underground railway, the surface witnessed the first examples of new urban forms, such as squares and semi-detached houses. During the 20th century, the city underwent significant changes, clearing its slums and reinventing itself after the destruction of the Blitz.
The aim of this project is to study the impact of these interventions on the socio-spatial structure of the modern city. To this end, the work will exploit a dataset of 1.2 million building footprints and 286k text segments, automatically extracted from the 1890s map of London by the Ordnance Survey. The first step of the project will consist in enriching this dataset with information on infrastructure (e.g. pubs, hospitals, schools, churches, water supply network) and economic activity (e.g. industry, commerce). These data will then be integrated with information on social classes, extracted from Booth’s map of London, to model the relationship between spatial and social structures. Alternatively, the project may adopt a more environmental focus, considering, for example, historical tree cover, water networks, or the presence of polluting industries. This research should provide a better understanding of the long-term impact of historical, social and environmental conditions on the city as it is today.
Keywords: Digital urban history, Social history, Environmental history, GIS
Contact : [email protected]

Beyond Cheese and Chocolate: The Economic Landscape of Historical Switzerland
Type : MSc Semester project, BSc Semester project
Sections concerned : Digital humanities, Data Science, Computer Science, Urban Systems, UNIL (HEC, Geography, Humanités numériques)
The REFLEX project is a collaborative initiative focusing on the dynamics of industrialization, financialization, and globalization in the Swiss economy from 1883 to the present day. Over the past three years, the project has digitized the entirety of Switzerland’s commercial registers, documenting every company ever established in the country over more than 140 years, along with their life cycles (e.g., relocations, opening of subsidiaries, bankruptcies, dissolutions, etc.). As such, this dataset represents the most comprehensive source of information for understanding and studying the historical development of the modern Swiss economy.
While this database makes it possible to analyze the macroscopic dynamics of economic development at the national level, it does not yet allow for a detailed examination of the relationships between economic activity, regional development, and urban expansion. The aim of the present student project will be to address this gap by re-spatializing—i.e., realigning—the extracted addresses within a geographic coordinate system, and thus reconstruct the historical economic landscape of Switzerland. The work will draw upon both modern geospatial reference systems and historical place-name databases. Through this project, you will develop approaches to realign raw spatial data and investigate the resulting geohistorical database through spatial analysis, modeling, and visualization.
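As a rough illustration of what re-spatializing an extracted address can involve, the sketch below matches a noisy place name (for instance one affected by OCR errors) against a small gazetteer using fuzzy string matching from the Python standard library. The gazetteer, its rounded coordinates, the `match_place` helper and the similarity cutoff are all assumptions made for this example. They do not describe the REFLEX data or the reference databases the project will actually use.

```python
import difflib

# Hypothetical mini-gazetteer: normalized place name -> (latitude, longitude).
# Coordinates are rounded, illustrative values.
GAZETTEER = {
    "lausanne": (46.52, 6.63),
    "genève": (46.20, 6.14),
    "zürich": (47.37, 8.54),
    "neuchâtel": (46.99, 6.93),
}


def normalize(name):
    """Lowercase and trim a raw place name before matching."""
    return name.strip().lower()


def match_place(raw_name, cutoff=0.8):
    """Return (gazetteer_name, (lat, lon)) for the closest entry, or None.

    `cutoff` is the minimum difflib similarity ratio (between 0 and 1)
    required to accept a match; 0.8 is an arbitrary starting point.
    """
    query = normalize(raw_name)
    candidates = difflib.get_close_matches(query, list(GAZETTEER), n=1, cutoff=cutoff)
    if not candidates:
        return None
    best = candidates[0]
    return best, GAZETTEER[best]


print(match_place("Lausane"))      # a typical OCR-style misspelling
print(match_place("Unknownville"))  # no sufficiently close entry
```

Historical sources raise further difficulties that a plain string comparison cannot solve, such as places that were renamed, merged, or spelled differently across languages, which is where historical place-name databases come in.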
Ultimately, this work will also provide a deeper understanding of historical spatio-economic dynamics and their impact on contemporary Switzerland.
Keywords: Economic history, Spatial humanities, GIS
Contact : [email protected]

Enlumina
What is the graphic layout of a manuscript and with what kind of images are medieval texts illustrated? Is it possible to extract images from manuscripts and index the representations? Is it possible to define a repertoire of depictions that are similar and adapted to the manuscript text? Is it possible, through the analysis of visual similarities between decorative and figurative elements, to attribute authorial signatures to copyists or schools of copyists?
These are some of the questions that the Enlumina initiative, conducted by EPFL in collaboration with the École nationale des chartes (Paris), will attempt to answer. The initiative plans to create a specific corpus of illustrated manuscripts and use image segmentation models to extract depictions. By cross-referencing these with a database of iconographic elements, it will be possible to semantically annotate the elements that make up each depiction. The project methodology plans to test supervised methods based on these annotations, building a dataset for the guided detection of image zones that contain specific elements: the Virgin Mary, the villager, the knight, etc.
The project also plans to test unsupervised methods using multimodal LLMs that rely on automatic description.
Objectives of the project :
Defining an effective methodology for indexing the elements of manuscript images.
Creating a training dataset specific to medieval manuscripts.
Creating a prototype search engine that supports both text-keyword search for elements of individual depictions and search by visual similarity.
Publishing the results obtained.
Keywords: Images History, Authorship, Visual Similarities
Prerequisites: Machine Learning, Computer Vision, LLMs
Contact : [email protected]

Door-to-door
Type : MSc thesis, MSc Semester project
Sections concerned : Digital humanities, Humanités numériques (UNIL), Histoire (UNIL)
City directories (comparable to today's yellow pages) first appeared in England and France in the second half of the 17th century, as lists of business addresses or addresses of important people, for the use of travelers and foreigners, or as publications validating the monumental importance of certain city places or the homes of famous people. Directories are an exceptionally rich source for historical research: they give information on a more frequent basis than census data, and they list not only the taxable population of the Ancien Régime (mostly male household heads) but also a broader social corpus, notably women. In some cases, the information on commercial and professional activities has been recognized as the most significant source. The goal of the project is to conduct analyses on a 10-year sample corpus for Lausanne. This analysis can be followed by a comparative analysis of three cities: Lausanne (agricultural city), Paris (European capital city), and New York (cosmopolitan city) in the same year, 1880. The datasets of Paris and New York are already extracted. The project will entail extracting the Lausanne sample through OCR. The aim is to understand the specific signature of each city by analyzing the trades first individually, and then comparatively. Datasets will be published online, and interesting results may be discussed in a paper.
Keywords: Document processing, OCR, Layout analysis, Digital Urban History
Prerequisites: Python
Contact : [email protected]

The Hidden Grammar of Art History
Type : MSc Semester project, BSc Semester project
Sections concerned : Digital humanities, Data Science, Computer Science
Analyzing the taxonomy of authors' names in art history is key to understanding artistic dynamics. Some artists have enjoyed great fortunes through the centuries as their works were widely copied; others operated thriving workshops, indicative of large-scale production for a potentially open and much larger market. Over the years, in the context of the Replica project, we have digitized 300,000 cardboards with images of art historical objects. Each cardboard is accompanied by relevant digitised metadata. In this dataset, one specific field is the «name of the artist», which is articulated with a specific grammar. Early research results have already shown that analyzing this taxonomy is a new and potentially very promising field for deriving a new theory of artistic practice between 1300 and 1900. The goal of the project is to work with the OCR output we have extracted, realign the entities it contains, and derive heuristics from them. An associated paper will be published as part of the project.
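To give an idea of what parsing such a name field might look like, the following sketch splits an attribution string into an optional qualifier and an artist name using a regular expression. The list of qualifiers and the sample strings are assumptions chosen for illustration. The actual grammar of the Replica metadata is precisely what the project sets out to study, and it may differ substantially.

```python
import re

# Hypothetical attribution qualifiers; longer phrases are listed before
# shorter ones so that "copy after" is not cut down to "after".
QUALIFIERS = [
    "attributed to",
    "workshop of",
    "follower of",
    "circle of",
    "copy after",
    "after",
]

PATTERN = re.compile(
    r"^\s*(?:(?P<qualifier>" + "|".join(re.escape(q) for q in QUALIFIERS) + r")\s+)?"
    r"(?P<name>.+?)\s*$",
    re.IGNORECASE,
)


def parse_attribution(field):
    """Split an artist field into {'qualifier': ..., 'name': ...}.

    Returns None when the field is empty. The qualifier is lowercased,
    or None when the field holds a bare name.
    """
    match = PATTERN.match(field)
    if match is None:
        return None
    qualifier = match.group("qualifier")
    return {
        "qualifier": qualifier.lower() if qualifier else None,
        "name": match.group("name"),
    }


for sample in ["Workshop of Giovanni Bellini", "Tiziano Vecellio"]:
    print(parse_attribution(sample))
```

Once fields are split this way, names sharing the same normalized form can be grouped, and the distribution of qualifiers per artist becomes a simple count. Real OCR output will contain misspellings and abbreviations that such a strict pattern will miss.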
Keywords: Digital Art History, Authorship, Entities disambiguation, Data integration
Prerequisites: Python
Contact : [email protected]

Lausanne 1722
Type : MSc Semester project, BSc Semester project
Sections concerned : Digital humanities, Data Science, UNIL (Histoire, Humanités numériques)
The creation of the Melotte cadastre in Lausanne began in 1721 and was completed in 1722. This makes it one of the oldest geometrical cadastral plans in Europe and in the world. While Lausanne had previously been mapped in a schematic cadastre in the 17th century, the Melotte cadastre marks the city’s first geometric survey. This document captures the state of Lausanne during the Enlightenment, including its surroundings, and holds significant value in the fields of urban history and cartography due to its role as a precursor.
The Melotte plan has been georeferenced, i.e. geographically realigned, and automatically vectorized. The aim of this student project is to enrich these data with socio-economic information on land use, function, and ownership. The resulting historical geodata will then be used as a source for studying the landscape of the city as it was over three hundred years ago.
Keywords: Lausanne Time Machine, Digital Urban History, Social history, GIS
Contact : [email protected]

Other Projects
Semester project (DH)
There are more than 5M cultural heritage objects in Europeana’s repository, and many of these are historical postcards of Europe’s places. The goal of the semester project is to geolocate and geographically reposition these representations through an AI semantic identifier. The methodology involves developing AI-based prompting suited to filtering content and repositioning images as accurately as possible. The goal is to redensify the iconography of European places, cities and territories, and to publish the images in the Time Machine Atlas interface. Further analyses are also possible, for example: what is the coverage rate of the available information, which monuments are most represented and how can they be classified, which of them are missing heritage, and what does their depiction imply for a more general concept of the cultural identity of the represented places?
In the course of the project, the student will become familiar with using the OpenAI API and those of other AI providers.
Semester project (DH)
Cartonomics is a project developed by the Time Machine Unit with the aim to identify and extract the visual elements (symbolic and representational) that constitute the figurative language of cartography. This innovative research is based on the largest map corpus ever studied in the Digital Humanities and the History of Cartography.
The aim of the project ‘Visual semiotics’ is to produce a granular exploration of cartographic semantics. Some of the questions the project seeks to answer are : what is the evolutionary history of the representation of specific features, such as relief and elevation, through the ages? How are these conventions affected by the presence of specific terrains, such as buildings or water, and how is that related to historical measurement techniques? How is the natural landscape treated by means of symbols and icons? The project will deploy deep learning technologies and statistical approaches to answer some of these questions.
Bachelor Project (CS)
Cultural evolution can be modelled as an ensemble of ideas and conventions that spread from one mind to another. These are referred to as memes, elementary replicators of culture inspired by the concept of a gene. A meme can be a tune, a catch-phrase, or an architectural detail, for instance. If we take the example of maps, a meme can be expressed in the choice of a certain texture, a colour, or a symbol to represent the environment. Thus, memes spread from one cartographer to another, through time and space, and can be used to study cultural evolution. With the help of computer vision, it is now possible to track those memes by finding replicated visual elements in large datasets of digitised historical maps.
By extracting visual elements from tens of thousands of maps and embedding these elements in a feature space, it becomes possible to estimate which elements correspond to the same replicated visual idea, or meme. This opens up new ways to understand how ideas and technologies spread across time and space.
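The grouping step described above can be sketched in a few lines: elements whose embeddings are close enough in the feature space are linked, and linked elements form one candidate meme. This is only a toy illustration under stated assumptions. The two-dimensional vectors, the cosine-similarity threshold and the single-linkage grouping are all hypothetical choices, not the method used to build the actual dataset.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def group_by_similarity(embeddings, threshold=0.9):
    """Group indices whose embeddings are linked by similarity >= threshold.

    Uses a union-find structure, so groups are the connected components of
    the similarity graph (single linkage). Compares all pairs, which is only
    practical for small collections.
    """
    n = len(embeddings)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if cosine(embeddings[i], embeddings[j]) >= threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical 2D embeddings of four visual elements.
elements = [[1.0, 0.0], [0.99, 0.1], [0.0, 1.0], [0.05, 1.0]]
print(group_by_similarity(elements))
```

At the scale of millions of elements, the all-pairs comparison would have to be replaced by an approximate nearest-neighbour search, and the threshold would need to be validated against examples judged by a human.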
Despite the immense potential that such data holds for better understanding cultural evolution, it remains difficult to interpret, since it involves tens of thousands of memes, corresponding to millions of visual elements. Somewhat like genomics research, it is now becoming essential to develop a microscope to observe memetic interactions more closely.
In this project, the student will tackle the challenge of designing and building a prototype interface for the exploration of memes on the basis of replicated visual elements. The scientific challenge is to create a design that reflects the layered and interconnected nature of the data, composed of visual elements, maps, and larger sets. The student will develop their project by working with digital embeddings of replicated visual elements, and with digitised images in the IIIF framework. The interface will make use of the metadata to visualise how time, space, and the development of new technologies influence cartographic figuration, by filtering the visual elements. Finally, to reflect the multi-layered nature of the data, the design must be transparent and provide the ability to switch between visual elements and their original maps.
The project will draw on an exceptional dataset of tens of thousands of American, French, German, Dutch, and Swiss maps published between 1500 and 1950. Depending on the student’s interests, an interesting case study to demonstrate the benefits of the interface could be to investigate the impact of the invention of lithography, a revolutionary technology from the end of the 18th century, on the development of modern cartographic representations.
Semester project (Master, CS)
From 1812 onwards, land mapping campaigns were launched throughout Europe on the model of the so-called “Napoleonic” and later “renovated” cadastre. These maps covered the largest part of Western Europe, from Spain to the Netherlands, including Italy, the Austro-Hungarian Empire and the Confederation of the Rhine (Germany). The Napoleonic cadastre is thus one of the main historical sources of the 19th century and the hundreds of thousands of sheets that compose it are largely digitized. Georeferencing, i.e. spatial recalibration of the cadastral sheets, would constitute a fantastic added value for urban planning and environmental sciences by allowing large-scale studies on two centuries of territorial evolution. Within the framework of this project, the student will develop automatic methods based on computer vision techniques and graph neural networks to automate the realignment of cadastral sheets.
Master thesis (Unil, DH)
The Polier notebooks are a set of handwritten documents written by Jean-Henri Polier de Vernand (1715-1791), lieutenant baillival of Lausanne, under the title Livre de Raison, soit Memorial Universel. These notebooks, which total nearly 26,300 pages written between 1754 and 1791, are a precise account of the daily life of the time and of the micro-history of Lausanne. They contain a great deal of information about economic and social history and about details of daily life, such as trade or meteorology. The project aims to automatically transcribe the Polier notebooks using state-of-the-art handwritten text recognition (HTR) methods in order to finally unravel the mysteries of these pages. In a second step, an investigation based on distant reading and natural language processing (NLP) technologies will make it possible to explore a research question on the themes of environmental, social, or economic history.
Master thesis (DH) + Semester project (Bachelor, Maths)
The Bibliothèque Historique de la Ville de Paris has digitised more than 700 maps covering in detail the evolution of the city from 1836 (plan Jacoubet) to 1900, including the famous Atlases. They are of particular interest for the urban study of Paris, which was at the time heavily transformed by the Haussmannian works. For administrative and political reasons, the City of Paris did not benefit from the large cadastral campaigns that took place throughout Napoleonic Europe at the beginning of the 19th century. The sheets of these Atlases are therefore the finest source of information available on the city over the 19th century. In order to make use of the great potential of this dataset, the information contained in the maps must be converted from its visual representation to an abstract, geometrical form, on which to base quantitative studies. This conversion from an input raster to an output vector geometry is typically performed well by combining Convolutional Neural Networks for pixelwise classification with standard computer vision techniques. In a second step, the student will develop quantitative techniques to investigate the transformation of the urban fabric.
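The second half of that raster-to-vector conversion can be illustrated with a toy example. Assuming a pixelwise classifier has already produced a binary mask (1 for building pixels, 0 elsewhere), the sketch below groups foreground pixels into connected regions and derives a bounding box for each one. The mask, the function names and the use of bounding boxes are assumptions made for brevity. A real pipeline would rely on an image-processing library and trace actual polygon outlines rather than boxes.

```python
from collections import deque


def connected_components(mask):
    """Return the 4-connected foreground regions of a binary mask.

    `mask` is a list of rows of 0/1 values. Each region is a list of
    (row, column) pixels, found by breadth-first flood fill.
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    regions = []
    for r in range(height):
        for c in range(width):
            if not mask[r][c] or seen[r][c]:
                continue
            seen[r][c] = True
            queue = deque([(r, c)])
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            regions.append(pixels)
    return regions


def bounding_box(pixels):
    """Return (min_x, min_y, max_x, max_y) for a list of (row, column) pixels."""
    ys = [p[0] for p in pixels]
    xs = [p[1] for p in pixels]
    return (min(xs), min(ys), max(xs), max(ys))


# Hypothetical classifier output for a tiny 3x5 image with two buildings.
mask = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 1, 1],
]
for region in connected_components(mask):
    print(len(region), bounding_box(region))
```

Each resulting region can then be treated as one geometric object, with attributes such as area or position feeding the quantitative analysis of the urban fabric.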
Semester project (Master, CS)
At the end of the 19th century, a global will for administrative reform brought about the need to renovate the old cadastres dating from the Napoleonic era. Cadastral campaigns of cities and territories were launched, applying the best technologies of the time. The canton of Vaud took part in the movement and published in 1888 the renovated cadastre of Lausanne, a collection of more than 300 plates of high graphic quality and remarkable accuracy. It comes with a complete legend that records the use of each parcel (house, café, bakery, field, vineyard, etc.) and its owner. The automatic vectorization project aims to use state-of-the-art technologies developed in the framework of the Time Machine initiative to transform the historical images into vector maps using CNN-based semantic segmentation. For the legend, automatic document processing and handwritten text recognition (HTR) methods could also be deployed. This work would pave the way to a better understanding of a key period of Lausanne’s history that saw the birth of the city we know today. The vectorization of the document will also open new avenues of investigation for the student to explore the palimpsest of Lausanne’s history.
Semester project (Master, CS)
The Historical Museum of Lausanne has a collection of more than 38’500 digitized images, including numerous shots of Lausanne and photographs of buildings. These archives are the keystone of the historical reconstruction of the city because they constitute a direct trace of the light of the past. In this project, the student will develop automatic methods based on photogrammetry and neural networks to match the images and infer the angle and position of the photographic shot. This work will open the possibility of directly reprojecting historical photographs onto 4D models.
Past Projects
Do not hesitate to contact us to discuss your own project ideas on a specific document or theme.


