Researchers race to diagnose Covid-19 from the sound of a cough

At EPFL, Cambridge University and MIT, researchers are developing algorithms that diagnose Covid-19 from the sound of a person’s cough, as recorded on a smartphone. The results can be astonishing, but they are difficult to reproduce.

No need to get tested: coughing into your phone could soon tell you whether you have Covid-19. This, at least, is the ambition of researchers on both sides of the Atlantic, who are counting on artificial intelligence to predict whether a person is positive from the sound of their cough, as recorded on a smartphone.

In early April, EPFL scientists launched the Coughvid project to collect recordings of volunteers’ coughs online, a necessary step before developing the predictive model and the app itself. “We hope the application will attain a success rate of 70% once enough data have been collected and put to use in the system,” explained David Atienza, director of the Embedded Systems Laboratory at EPFL. Five months later, the team had gathered more than 20,000 recordings, 1,500 of them from positive cases, to use in training their models.

Contacted by ICTjournal, Tomás Teijeiro, research lead for Coughvid at EPFL, explains that many thousands of recordings were submitted to pulmonologists, who struggled to agree on the diagnoses, except in a few clearly identifiable cases.

The researchers hope that artificial intelligence can do better. They have therefore fed their machine learning systems the recordings (with background noise removed) along with related demographic data (age, sex, etc.). “At present, we are able to detect 40% of infected people by the sound of their cough, with a false positive diagnosis arising only 3% of the time,” explains Teijeiro. The scientist admits that, for the moment, this is insufficient for a diagnostic app, or even a diagnostic aid. The scientists at EPFL are therefore forging ahead with their experiments…
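To give a feel for what these two percentages mean together, here is a minimal sketch in Python. It rests on assumptions that are not in the article: that the 3% figure is the share of non-infected people wrongly flagged, and that both rates would hold across the full collection of 20,000 recordings with 1,500 positives. The function name `expected_counts` is purely illustrative and has nothing to do with the Coughvid code.

```python
def expected_counts(n_total, n_positive, sensitivity, false_positive_rate):
    """Expected confusion-matrix counts for a test with the given rates.

    Assumption: false_positive_rate is measured among non-infected people.
    Returns (true_pos, false_neg, false_pos, true_neg).
    """
    n_negative = n_total - n_positive
    true_pos = sensitivity * n_positive            # infected and detected
    false_neg = n_positive - true_pos              # infected but missed
    false_pos = false_positive_rate * n_negative   # healthy but flagged
    true_neg = n_negative - false_pos              # healthy and cleared
    return true_pos, false_neg, false_pos, true_neg


# Figures quoted in the article, applied hypothetically to the whole dataset.
tp, fn, fp, tn = expected_counts(20_000, 1_500, 0.40, 0.03)

# Share of positive flags that would be correct under these assumptions.
precision = tp / (tp + fp)

print(f"detected: {tp:.0f}, missed: {fn:.0f}, false alarms: {fp:.0f}")
print(f"share of positive flags that are correct: {precision:.1%}")
```

Under these assumptions, the model would miss more infected people than it finds, which is consistent with Teijeiro’s own assessment that the current performance is insufficient.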

And they are not the only ones. At Cambridge University, researchers have also been crowdsourcing thousands of cough recordings since spring 2020 in order to develop a diagnostic algorithm. In July, they announced a predictive model with a success rate of 80%.

But the most impressive results come from the other side of the Atlantic. In a recent article, MIT researchers announced the development of a model capable of diagnosing 98.5% of cases, with fewer than 6% false positives. Even among asymptomatic people, they claim, their system can detect all cases of Covid-19 with fewer than 20% false positives. Enough to get the MIT scientists dreaming: “These AI techniques can provide a free, non-invasive diagnostic system, on a large scale, in real time, always available and distributed instantly to complement technologies currently aiming to halt the progress of COVID-19.” Enough, too, for us to imagine practical applications, such as the daily screening of students, workers and others in the general public.
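How useful such figures would be for daily screening depends on how common the infection is among the people tested. The sketch below applies Bayes’ rule to the two rates quoted for MIT. Several things here are assumptions for illustration only: the prevalence values are hypothetical, the false positive rate is taken at its quoted upper bound of 6%, and the function `positive_predictive_value` is not drawn from the MIT work.

```python
def positive_predictive_value(sensitivity, false_positive_rate, prevalence):
    """Probability that a person flagged positive is actually infected.

    Bayes' rule: P(infected | flagged) =
        P(flagged | infected) * P(infected) / P(flagged)
    """
    true_pos = sensitivity * prevalence
    false_pos = false_positive_rate * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Hypothetical shares of infected people in the screened population.
for prevalence in (0.001, 0.01, 0.10):
    ppv = positive_predictive_value(0.985, 0.06, prevalence)
    print(f"prevalence {prevalence:.1%}: "
          f"{ppv:.1%} of positive flags would be true cases")
```

With these inputs, the rarer the infection, the larger the share of positive flags that are false alarms, even when nearly every true case is detected.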

At EPFL, Tomás Teijeiro is impressed by the performance the MIT researchers have obtained. Faced with results that are “almost too good to be true”, he nevertheless regrets that he can neither test their model on the 20,000 recordings he has collected nor test EPFL’s model on the MIT data.

This problem of reproducibility affects all sciences, but it is particularly prevalent in data science, where a predictive model’s results can depend on many tiny adjustments. A partial remedy is to share data with other researchers, as scientists at Cambridge and EPFL have already done. Another is to issue a challenge, for example on the Kaggle platform, inviting specialists to develop a model from a set of training data and then evaluate it on a test dataset. Tomás Teijeiro is giving this last idea serious thought…

Article: Rodolphe Koller

Translation: John Maxwell

Originally published in:

Fri 06.11.2020 – 11:12