Machine learning for acoustics

(page under construction)

Introduction

Today, over 1.5 billion people – nearly 20% of the global population – experience hearing loss, and at least 430 million face moderate to severe cases requiring intervention. By 2050, an estimated 700 million people – 1 in 10 – will have disabling hearing loss (a loss greater than 35 decibels in the better-hearing ear). Hearing aids (HAs) play a vital role in improving quality of life for these individuals by amplifying sound pressure levels and improving the signal-to-noise ratio (SNR) in most settings.

While modern HAs fulfill basic hearing needs, they often fall short when it comes to spatial hearing – the ability to locate where sounds are coming from in a three-dimensional space. This is a critical aspect of natural hearing that helps with communication and safety in everyday environments. To meet this challenge, we’re exploring how machine learning can push hearing technology forward – making hearing aids smarter, faster to develop and better suited to individual needs.

How it works

Hearing aids address hearing loss by amplifying sound and enhancing SNR. The human auditory system, however, also relies on binaural cues – chiefly interaural time and level differences between the signals received at the two ears – which allow the brain to localise sound sources and build a sense of acoustic space, i.e. to perceive sound spatially. Binaural technologies in modern HAs attempt to replicate these cues, reconstructing spatial hearing so that sources are perceived at their correct positions in space and the listening experience remains immersive. Better spatialisation of speakers and acoustic environments can also improve speech intelligibility for hearing-impaired individuals.
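As a minimal illustration (separate from any hearing-aid pipeline), the two classic binaural cues can be estimated from a pair of ear signals in a few lines of NumPy; the sampling rate and the 24-sample delay in the toy check are arbitrary choices:

```python
import numpy as np

def binaural_cues(left, right, fs):
    """Estimate the interaural time difference (ITD) via cross-correlation
    and the interaural level difference (ILD) as a broadband RMS ratio."""
    # ITD: lag (in samples) of the cross-correlation peak between the ears
    corr = np.correlate(left, right, mode="full")
    lag = np.argmax(corr) - (len(right) - 1)
    itd = -lag / fs  # seconds; positive when the sound reaches the left ear first
    # ILD: level difference in dB (positive when the left ear is louder)
    eps = 1e-12
    ild = 20 * np.log10((np.sqrt(np.mean(left**2)) + eps)
                        / (np.sqrt(np.mean(right**2)) + eps))
    return itd, ild

# Toy check: a click that reaches the right ear 24 samples (~0.5 ms) late
fs = 48_000
left = np.zeros(1024)
left[100] = 1.0
right = np.roll(left, 24)
itd, ild = binaural_cues(left, right, fs)  # itd = 0.0005 s, ild = 0 dB
```

Real systems estimate these cues per frequency band rather than broadband, but the broadband version already conveys what a hearing aid must preserve for spatial hearing to survive processing.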

Evaluating how well new algorithms preserve these partly subjective spatial cues has traditionally required expensive and time-consuming listening tests with human subjects – a lengthy process of obtaining ethical approval, recruiting participants, and conducting individual listening experiments. Machine learning offers a promising alternative: spatial perception models that simulate human auditory localisation provide objective metrics, which can then be used to train and validate new algorithms without relying entirely on human testing.
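For example, once a perception model predicts source angles, an objective score for a candidate algorithm can be as simple as the mean absolute azimuth error, with differences wrapped to ±180°. This is a hypothetical scoring sketch, not the project's actual metric:

```python
import numpy as np

def azimuth_mae(pred_deg, ref_deg):
    """Mean absolute azimuth error in degrees, wrapping differences to ±180°."""
    diff = (np.asarray(pred_deg, float) - np.asarray(ref_deg, float) + 180) % 360 - 180
    return float(np.mean(np.abs(diff)))

# A perception model's predicted azimuths vs. the true source azimuths (degrees)
score = azimuth_mae([10, -170, 90], [5, 175, 80])  # -170° vs 175° wraps to 15°
print(score)  # 10.0
```

The wrap-around step matters: without it, a source at 175° predicted at -170° would be scored as a 345° error rather than the perceptually correct 15°.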

For instance, the 3D localisation model we have built, shown in Figure 1, bypasses the manual extraction of auditory cues. Instead, it uses a two-step process: (1) computing firing rates from models of the auditory nerve fibers (ANF) and the medial superior olive (MSO), and (2) predicting sound angles with a deep neural network (DNN). The model estimates the azimuth angle from the MSO and ANF firing rates, and the elevation angle from the positive spectral gradient (PSG) of ANF activity. Together, the peripheral processing stage, the MSO model and the DNN form a localisation model that estimates sound source directions.

Figure 1. Overview of the proposed auditory localisation model.

Binaural signals are first processed by a peripheral auditory model to generate ANF spikes, which are then converted into MSO spikes using a spiking auditory neuron model. A DNN predicts lateral and polar angles directly from the firing rates of MSO and ANF, respectively.
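To make the second step concrete, the sketch below shows the kind of mapping the DNN performs: a firing-rate feature vector in, two angles out. The layer sizes and the random, untrained weights are purely illustrative assumptions; the actual network architecture is not specified here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes only: e.g. 64 MSO-rate features + 64 ANF-PSG features
n_in, n_hidden = 128, 32

# Random weights stand in for a trained network
W1 = rng.normal(0.0, 0.1, (n_hidden, n_in)); b1 = np.zeros(n_hidden)
W2 = rng.normal(0.0, 0.1, (2, n_hidden));    b2 = np.zeros(2)

def predict_angles(firing_rates):
    """Map a firing-rate feature vector to (azimuth, elevation) in degrees."""
    h = np.maximum(0.0, W1 @ firing_rates + b1)  # ReLU hidden layer
    azimuth, elevation = W2 @ h + b2
    return azimuth, elevation

az, el = predict_angles(rng.random(n_in))  # finite, but meaningless until trained
```

The key design point is that the network regresses angles directly from physiologically grounded features (firing rates) rather than from hand-extracted ITD/ILD cues.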

What are the applications?

Using machine learning in acoustics presents many important benefits, especially when applied to hearing aids. For example, enhancing spatial hearing can boost the clarity and intelligibility of speech in complex environments, thus improving HA performance.

Utilising objective spatial perception models to reduce reliance on lengthy human listening tests greatly expedites the innovation process and decreases developmental costs.

Another application is that of personalised hearing technologies, as models that account for hearing loss can support tailored solutions for both normal-hearing and hearing-impaired users. Furthermore, incorporating these models into consumer audio technology could lead to more immersive virtual or augmented reality experiences.

What we are working on

This project – SPHERE (Spatial Hearing Models for Hearing Instruments) – is a continuation of a longstanding research collaboration between EPFL and Sonova, supported by Innosuisse.

Our 3D localisation model is suitable for both normal-hearing and hearing-impaired listeners. Unlike many existing models, it is not limited to the median plane, accounts for hearing loss and supports dynamic auditory cues.

Regarding distance perception, most existing models aim for precise distance estimates from binaural signals rather than mimicking actual human behaviour. We aim to develop a model that reflects real human behaviour and incorporates the effects of hearing loss.

As for externalisation, current models ignore the influence of hearing loss and of dynamic situations. We are working on an advanced externalisation model that integrates static auditory cues while also accommodating hearing loss and dynamic cues.

Once these models are constructed, they will first be validated against existing open-source data; the corresponding listening tests will then be designed and conducted based on the preliminary results.

List of projects associated with this topic

Funding body – Project – Period

CTI – Binaural Hearing-Aids with Localization and Spatialization (BHA(L&S)) – 2012-2014

Innosuisse – Spatial Hearing Models for Hearing Instruments (SPHERE) – 2023-2025

To learn more

2025

Active control of electroacoustic resonators in the audible regime: control strategies and airborne applications

M. Malléjac; M. Volery; H. Lissek; R. Fleury 

npj Acoustics. 2025. Vol. 1, num. 1. DOI : 10.1038/s44384-025-00006-9.

Active Acoustic Metamaterials: A New Way to Understand Nonlinear and Topological Phenomena

M. F. Padlewski / H. Lissek; R. Fleury (Dir.)  

Lausanne, EPFL, 2025. 

2024

Initial Conditions Impact on Nonlinear Dynamics of a Loudspeaker

R. Vesal; X. Guo; H. Lissek 

2024. Forum Acusticum 2023, the 10th Convention of the European Acoustics Association, Torino, Italy, 2023-09-11 – 2023-09-15. p. 5267 – 5272. DOI : 10.61782/fa.2023.1155.

Enhancing image quality in next-generation image compression

M. Testolina / T. Ebrahimi (Dir.)  

Lausanne, EPFL, 2024. 

2023

Observation of non-reciprocal harmonic conversion in real sounds

X. Guo; H. Lissek; R. Fleury 

Communications Physics. 2023. Vol. 6, num. 93, p. 1 – 6. DOI : 10.1038/s42005-023-01217-w.

2022

Corona discharge actuator as an active sound absorber under normal and oblique incidence

S. Sergeev; T. Humbert; H. Lissek; Y. Aurégan 

Acta Acustica. 2022. Vol. 6, p. 5. DOI : 10.1051/aacus/2022001.

PID-like active impedance control for electroacoustic resonators to design tunable single-degree-of-freedom sound absorbers

X. Guo; M. Volery; H. Lissek 

Journal of Sound and Vibration. 2022. Vol. 525, p. 116784. DOI : 10.1016/j.jsv.2022.116784.

2021

Sound Field Reconstruction in a room through Sparse Recovery and its application in Room Modal Equalization

V. T. Pham / P. Vandergheynst; H. Lissek (Dir.)  

Lausanne, EPFL, 2021. 

2020

Low frequency sound field reconstruction in a non-rectangular room using a small number of microphones

T. Pham Vu; H. Lissek 

Acta Acustica. 2020. Vol. 4, num. 2, p. 5. DOI : 10.1051/aacus/2020006.

Improving sound absorption through nonlinear active electroacoustic resonators

X. Guo; R. Fleury; H. Lissek 

Physical Review Applied. 2020. Vol. 13, p. 014018. DOI : 10.1103/PhysRevApplied.13.014018.

Sparse and Parametric Modeling with Applications to Acoustics and Audio

H. Peic Tukuljac / P. Vandergheynst; H. Lissek (Dir.)  

Lausanne, EPFL, 2020. 

2019

Perception of Auditory Distance in Normal-Hearing and Moderate-to-Profound Hearing-Impaired Listeners

G. Courtois; V. Grimaldi; H. Lissek; P. Estoppey; E. Georganti 

Trends in Hearing. 2019. Vol. 23, p. 233121651988761. DOI : 10.1177/2331216519887615.

2018

Low Frequency Sound Field Reconstruction in Non-rectangular Room: A Numerical Validation

T. Pham Vu; E. Rivet; H. Lissek 

2018. Euronoise 2018 – 11th European Congress and Exposition on Noise Control Engineering, Crete, Greece, May 27-31, 2018.

Joint Estimation Of The Room Geometry And Modes With Compressed Sensing

H. P. Tukuljac; Thach Pham Vu; H. Lissek; P. Vandergheynst 

2018. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Calgary, CANADA, Apr 15-20, 2018. p. 6882 – 6886. DOI : 10.1109/ICASSP.2018.8462655.

Experimental evaluation of speech enhancement methods in remote microphone systems for hearing aids

G. A. Courtois; V. Grimaldi; H. Lissek; I. Kodrasi; E. Georganti 

2018. Euronoise, Heraklion, Crete, Greece, May 27-31, 2018. p. 1 – 8.

Toward Wideband Steerable Acoustic Metasurfaces with Arrays of Active Electroacoustic Resonators

H. Lissek; E. Rivet; T. Laurence; R. Fleury 

Journal of Applied Physics. 2018. Vol. 123, num. 9, p. 091714. DOI : 10.1063/1.5011380.

2017

Design and experimental validation of an active acoustic liner for aircraft engine noise reduction

G. Matten; M. Ouisse; M. Collet; S. Karkar; H. Lissek et al. 

2017. MEDYNA 2017: 2nd Euro-Mediterranean Conference on Structural Dynamics and Vibroacoustics, Sevilla, Spain, April 25-27, 2017.

Localization of Sound Sources in a Room with One Microphone

H. Peic Tukuljac; H. Lissek; P. Vandergheynst 

2017. Wavelets and Sparsity XVII, San Diego, California, USA, August 6-9, 2017. DOI : 10.1117/12.2271249.

Generation of acoustic helical wavefronts using metasurfaces

H. Esfahlani; H. Lissek; J. R. Mosig 

Physical Review B. 2017. Vol. 95, num. 2, p. 024312. DOI : 10.1103/PhysRevB.95.024312.

Design of active Multiple-degrees-of-freedom electroacoustic resonators for use as broadband sound absorbers

H. Lissek; E. Rivet; S. Karkar; R. Boulandet 

2017. MEDYNA 2017: 2nd Euro-Mediterranean Conference on Structural Dynamics and Vibroacoustics, Sevilla, Spain, April 25-27, 2017.