Student projects ‒ LHST ‐ EPFL

This page lists the projects currently open at the Laboratory for the History of Science and Technology (LHST). If you are interested in working on one of the projects listed below, please write directly to the contact person, with Prof. Baudry cc’ed in your email.

The project descriptions are only brief outlines and we are in general flexible about the particulars.

This project, in collaboration with the Institute of Psychology (IP) of the University of Lausanne (UNIL), is the final part of the SNSF research project MICE conducted by Prof. Rémy Amouroux. The aim of MICE is to produce the first transnational historical study of the reception and indigenization of behavior therapy in the Francophone context between the early 1960s and the 1990s.

This student project addresses the circulation and reception of psychotherapy in French-speaking Europe through distant reading of two newsmagazines: Psychology Today and its French counterpart Psychologie. The goal is to trace the role played by the specialized press in disseminating behavior therapy to a large audience.

The corpus consists of two magazines, one in English and one in French, available as bulk downloads with varying OCR quality: Psychology Today (1967-1971) and Psychologie (1970-1980). The goal of the project is to observe whether new psychotherapeutic tools rise progressively over the period and whether this phenomenon coincides with, or is related to, growing criticism of psychoanalysis. The project will attempt to establish whether coherent clusters can be observed for two themes: critique of psychoanalysis; and psychotherapy and new psychotherapeutic tools (i.e., cognitive behavioral therapy).
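As a first, very rough way to look at such trends, one could track how often theme-related words appear per year. The sketch below is only an illustration under stated assumptions: the keyword lists in `THEMES`, the function names, and the `(year, text)` input format are hypothetical and not part of the project description, and a real study would more likely rely on topic modeling or embeddings after OCR cleanup.

```python
import re
from collections import Counter, defaultdict

# Hypothetical keyword lists (assumption); real lists would be built with the historians.
THEMES = {
    "psychoanalysis": {"psychoanalysis", "freud", "freudian"},
    "behavior_therapy": {"behavior", "behavioral", "conditioning"},
}


def tokenize(text):
    """Lowercase a text and split it into alphabetic tokens (accented letters included)."""
    return re.findall(r"[a-zà-ÿ]+", text.lower())


def theme_shares_by_year(articles):
    """Return {year: {theme: share of that year's tokens matching the theme keywords}}.

    `articles` is an iterable of (year, text) pairs (assumed input format).
    """
    hits = defaultdict(Counter)
    totals = Counter()
    for year, text in articles:
        tokens = tokenize(text)
        totals[year] += len(tokens)
        for theme, words in THEMES.items():
            hits[year][theme] += sum(1 for token in tokens if token in words)
    return {
        year: {theme: hits[year][theme] / totals[year] for theme in THEMES}
        for year in sorted(totals)
        if totals[year] > 0
    }


# Toy usage with invented sentences, not taken from the magazines.
sample = [
    (1968, "Freud and psychoanalysis dominate"),
    (1975, "Behavior therapy and conditioning"),
]
shares = theme_shares_by_year(sample)
```

Normalizing by the yearly token count keeps years with more digitized pages from dominating the comparison.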

This project will be done in collaboration with Prof. Jérôme Baudry (EPFL), Prof. Rémy Amouroux and Dr. Elsa Forner (UNIL).

Project type: semester project or master’s thesis.

Prerequisites: prior experience in text mining; solid data analysis skills; knowledge of NLP and computational linguistics; interest in history and social sciences a plus; language skills in both English and French.

Contact: [email protected].

The project investigates how digital tools can be used to study the dynamics of innovation in science and technology, from the eighteenth century to today. Innovation—the production of the new—is often said to be radical and path-breaking; yet, what can history teach us about the actual rhythms of innovation? Looking at past centuries of technological development, do we see continuity or discontinuity? Can we identify waves of innovation (and imitation)? How new is the new and how did people strategically describe and draw technology to present it as new? How did inventors in the past address the dangers and potential negative consequences of their activity? Analyzing and/or building a corpus of patents of invention, you will choose and apply state-of-the-art NLP/machine learning methods and/or use your statistical and data science knowledge.

Below are some more detailed examples of themes for semester projects and master’s theses, but you can also propose your own questions. You will work with an interdisciplinary team of historians and computer scientists.

– Tracing international patent flows: Since the 19th century, as economic relationships became more and more globalized, individuals and corporations have increasingly patented their inventions in many countries simultaneously. Available statistics do not allow us to answer questions such as: Which patents had counterparts in other countries? In which countries were patents covering the same technology to be found? Did the inventors usually patent in their country of residence first, or did they choose other countries? A computational analysis of digitized patent documents might help in answering such questions. While the textual descriptions of the inventions needed to be translated into the language of each country, and adapted to its legal system, the drawings contained in the documents tended to be reused. Using computer vision techniques to match patents from different countries that feature the same drawings would shed light on the historical dynamics of international patenting and technology flows.

– Classifying patents and technology: Categorizing innovation is difficult. Categories are static, innovation is dynamic. Innovation can and does happen in-between established categories. Yet, it is very important to categorize patents, because the dynamic of innovation and the logic of taking out patents differ according to technology and to industry. However, such labels are usually missing from the datasets. The availability of the textual description of patents presents a great opportunity to address the challenge of classification.

– Extending the geographical scope of possible investigations: Most studies relying on the full text of historical patents rely on those issued by the United States of America, because of their easy availability in digitized form. To investigate similar questions for other countries, the available scanned material would need to be prepared and processed to be turned into clean digital full text, and results from off-the-shelf optical character recognition (OCR) software are of varying and sometimes questionable quality. This would be an interesting challenge for people interested in computer vision and OCR, and in bringing about a less one-sided view of innovation.
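To make the drawing-matching idea from the first theme above more concrete, here is a minimal sketch of one possible technique, an average hash compared by Hamming distance. It is an assumption, not the lab's method: it presumes each drawing has already been converted to a small grayscale grid of identical size (given here as nested lists), and the function names, patent identifiers, and the `max_distance` threshold are hypothetical. Real scans would need deskewing, cropping, and probably more robust features.

```python
def average_hash(pixels):
    """Return a tuple of bits: 1 where a pixel is brighter than the image mean, else 0."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return tuple(1 if value > mean else 0 for value in flat)


def hamming_distance(hash_a, hash_b):
    """Count the positions at which two equal-length hashes differ."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    return sum(a != b for a, b in zip(hash_a, hash_b))


def match_drawings(drawings_a, drawings_b, max_distance=2):
    """Pair drawing IDs from two collections whose hashes differ by at most max_distance bits.

    Both arguments map a (hypothetical) patent ID to a grayscale pixel grid.
    """
    hashes_b = {doc_id: average_hash(grid) for doc_id, grid in drawings_b.items()}
    matches = []
    for id_a, grid_a in drawings_a.items():
        hash_a = average_hash(grid_a)
        for id_b, hash_b in hashes_b.items():
            distance = hamming_distance(hash_a, hash_b)
            if distance <= max_distance:
                matches.append((id_a, id_b, distance))
    return matches


# Toy 2x2 "drawings" with invented IDs; FR-1 is a slightly noisy copy of US-1.
us_patents = {"US-1": [[0, 255], [255, 0]]}
fr_patents = {
    "FR-1": [[10, 240], [250, 5]],
    "FR-2": [[255, 0], [0, 255]],
}
matches = match_drawings(us_patents, fr_patents, max_distance=1)
```

The pairwise comparison is quadratic in the number of drawings, so a corpus of any real size would call for an index or a blocking step (for example, by filing year).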

Project type: semester project or master’s thesis.

Prerequisites: prior experience in either text mining, NLP or computer vision; solid skills in data analysis and Python; prior experience with large datasets and/or working remotely on a server is a plus.

Contact: Jérôme Baudry ([email protected])

Open Science is an international movement aiming at making all scientific research productions—publications, data, software, methods—freely accessible to all people in society: researchers, amateurs, policy makers, industries, as well as artists, journalists, and activists. Open Science across the world relies heavily on the design and development of dedicated infrastructures, mostly platforms: digital libraries, data repositories (“as open as possible, as closed as necessary”), directories, online journals, web archives, computational services, MOOCs, content management systems (CMS), collaborative version control, etc.

These platforms are loosely bound as a network. For example, a publication in an online journal may refer, via a persistent identifier (such as a DOI), to a dataset hosted on a given repository. As another example, an open science search engine may harvest the metadata of libraries and directories to index available publications.

The shape of the open science ecosystem online and the nature of the links that tie platforms together are not well known yet. The aim of this project is to crawl the web to identify the links between platforms, to characterize their nature, and to generate an interactive map of the open science network.
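As an illustration of the link-identification step, the following sketch extracts hyperlinks from already-downloaded HTML and counts links between hosts. It uses only the Python standard library; the class and function names, the input format (a dict of page URL to HTML), and the example hosts are assumptions made for this sketch. Fetching pages, respecting robots.txt, and characterizing the nature of each link are deliberately left out.

```python
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href values of all <a> tags in an HTML document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def platform_edges(pages):
    """Return a Counter of (source_host, target_host) pairs for links that leave the source host.

    `pages` maps a page URL to its HTML content (assumed to be fetched beforehand).
    """
    edges = Counter()
    for page_url, html in pages.items():
        collector = LinkCollector()
        collector.feed(html)
        source = urlparse(page_url).netloc
        for href in collector.hrefs:
            # Resolve relative links against the page URL before reading the host.
            target = urlparse(urljoin(page_url, href)).netloc
            if target and target != source:
                edges[(source, target)] += 1
    return edges


# Toy page with placeholder hosts (not real platforms).
pages = {
    "https://journal.example.org/article/1": (
        '<a href="https://repository.example.net/dataset/42">Data</a>'
        '<a href="/about">About</a>'
    ),
}
edges = platform_edges(pages)
```

The resulting weighted edge list can be loaded into any graph library or visualization tool to draw the map.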

Level
Master (research project, optional research project, or master’s project).

Assistant responsible for the project
Simon Dumas Primbault ([email protected])

Possibility to work in group?
Yes.

For some years now, the term ‘ecosystem’ has been used by a number of research stakeholders to refer, in very different ways, to the environment in which their practice takes place. The numerous uses of the semantic field of ecology to understand the digital transition of research environments are, however, very diverse and very polarised.

The first part of this project will be to assemble a vast heterogeneous corpus comprising writings of very different genres (scientific articles, policy briefs, reports, opinion pieces) as well as oral documents (courses, conferences, speeches) and, in some cases, metadata alone. This corpus will be built up by harvesting targeted sources: databases of scientific articles, crawling of research networks and infrastructures, archives of administrative institutions, course and conference repositories. Note: for the first instantiation of the project, a smaller and less heterogeneous corpus can be assembled.

From there, the project can take two different but complementary directions:

– Use NLP to perform an automated systematic literature review in order to trace the emergence of the term, and other relevant semantic fields, their circulation and their crystallization.

– Use network analysis in order to create a multi-layered network of documents, authors, concepts, institutions and disciplines and produce a diachronic map of the emergence and circulation of ecological discourse on research.
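The two directions above could start from something as simple as the following sketch, which records when a term first appears and builds one concept co-occurrence layer per year. The record format (dicts with `year` and `concepts` keys), the function names, and the sample records are assumptions made for illustration; a full multi-layered network would add authors, institutions, and disciplines as further node types.

```python
from collections import Counter, defaultdict
from itertools import combinations


def first_appearance(documents, term):
    """Return the earliest year in which `term` is among a document's concepts, or None."""
    years = [doc["year"] for doc in documents if term in doc["concepts"]]
    return min(years) if years else None


def cooccurrence_by_year(documents):
    """Return {year: Counter of concept pairs appearing together in a document of that year}."""
    layers = defaultdict(Counter)
    for doc in documents:
        # Sort and deduplicate so each unordered pair has a single canonical form.
        concepts = sorted(set(doc["concepts"]))
        for pair in combinations(concepts, 2):
            layers[doc["year"]][pair] += 1
    return dict(layers)


# Invented records, standing in for harvested metadata.
documents = [
    {"year": 2015, "concepts": ["ecosystem", "infrastructure"]},
    {"year": 2018, "concepts": ["ecosystem", "open science", "infrastructure"]},
]
layers = cooccurrence_by_year(documents)
first_year = first_appearance(documents, "open science")
```

Comparing the layers year by year gives a crude diachronic view of which concepts become attached to the term over time.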

Project type: Master (research project, optional research project, or master’s project).

Possibility to work in group? Yes.

Contact: [email protected]

This project, in collaboration with the Institut des humanités en médecine (IHM), is part of the SNSF-funded research project MEDIF, led by Dr. Aude Fauvel and Prof. Rémy Amouroux. The aim of MEDIF is to examine the collective contribution of the first French and French-speaking Swiss women doctors to the development of the theoretical and practical frameworks of medicine between 1870 and 1940.

The proposed project examines the themes addressed and the ways in which doctors practiced medicine in the health manuals they wrote for the general public. The aim is to compare what is said in works written by women with what is said in works written by men.

The corpus consists of about one hundred health manuals written in French and published in both Switzerland and France between 1870 and 1940. These works are available in digitized form on library websites, with varying levels of OCR quality. The aim of the project is to determine the audiences for these manuals and the themes covered, depending on the gender of the author, as well as the evolution of these themes. Particular attention will be given to passages dealing with women’s bodies and sexuality.
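One simple way to compare vocabulary between the two groups of authors is a smoothed log-odds ratio per word. The sketch below is an assumption about how such a comparison could be set up, not a method prescribed by the project: the function name, the additive smoothing value, and the toy counts are hypothetical, and it presumes that word counts per author group have already been extracted from the OCR text.

```python
import math
from collections import Counter


def log_odds(counts_a, counts_b, smoothing=1.0):
    """Return {word: log-odds of the word in group A minus its log-odds in group B}.

    Positive scores mark words relatively more frequent in group A.
    Additive smoothing avoids zero probabilities; the vocabulary must hold at least two words.
    """
    vocab = set(counts_a) | set(counts_b)
    total_a = sum(counts_a.values()) + smoothing * len(vocab)
    total_b = sum(counts_b.values()) + smoothing * len(vocab)
    scores = {}
    for word in vocab:
        p_a = (counts_a.get(word, 0) + smoothing) / total_a
        p_b = (counts_b.get(word, 0) + smoothing) / total_b
        scores[word] = math.log(p_a / (1 - p_a)) - math.log(p_b / (1 - p_b))
    return scores


# Invented counts for illustration only, not drawn from the manuals.
women_counts = Counter({"hygiene": 8, "maternity": 2})
men_counts = Counter({"hygiene": 2, "maternity": 8})
scores = log_odds(women_counts, men_counts)
```

Scores near zero indicate words used at similar rates by both groups; the extremes are candidates for close reading.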

This project will be carried out in collaboration with Dr. Amélie Puche (IHM) and Prof. Jérôme Baudry (EPFL).

Project type: semester project or master’s thesis.

Contact: [email protected]