End-to-end audio-visual learning for cochlear implant sound coding simulations in noisy environments.
/ Authors
/ Abstract
The cochlear implant (CI) is a successful biomedical device that enables individuals with severe-to-profound hearing loss to perceive sound through electrical stimulation, yet listening in noise remains challenging. Recent deep learning advances offer promising potential for CI sound coding by integrating visual cues. In this study, an audio-visual speech enhancement (AVSE) module is integrated with the ElectrodeNet-CS (ECS) model to form an end-to-end CI system, AVSE-ECS. Simulations show that the jointly trained AVSE-ECS system achieves high objective speech intelligibility and improves the signal-to-error ratio by 7.4666 dB compared to the advanced combination encoder (ACE) strategy. These findings underscore the potential of AVSE-based CI sound coding.
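The joint-training idea described above, a single loss whose gradients flow through both the AVSE front-end and the sound-coding back-end, can be illustrated with a toy sketch. Everything below is a simplified stand-in, not the paper's implementation: both modules are reduced to linear maps, the feature dimensions and the target electrodogram are invented for illustration, and plain gradient descent in NumPy replaces the actual deep networks.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy dimensions (not from the paper):
# A audio features, V visual features, C electrode channels, N frames.
A, V, C, N = 16, 8, 22, 64

audio = rng.normal(size=(N, A))
visual = rng.normal(size=(N, V))
# Simulated target electrodogram (stands in for a reference coding strategy).
target = rng.normal(size=(N, C))

# AVSE front-end: maps concatenated audio-visual features to enhanced audio.
W_avse = rng.normal(scale=0.1, size=(A + V, A))
# ECS-like back-end: maps enhanced audio to electrode stimulation levels.
W_ecs = rng.normal(scale=0.1, size=(A, C))

x = np.concatenate([audio, visual], axis=1)

def mse(W1, W2):
    """End-to-end loss: error between predicted and target electrodograms."""
    return float(np.mean(((x @ W1) @ W2 - target) ** 2))

loss_before = mse(W_avse, W_ecs)

# Joint training: one loss, gradients propagated through both modules.
lr = 0.1
for _ in range(300):
    enhanced = x @ W_avse
    pred = enhanced @ W_ecs
    d_pred = 2.0 * (pred - target) / (N * C)   # dL/dpred
    g_ecs = enhanced.T @ d_pred                # dL/dW_ecs
    g_avse = x.T @ (d_pred @ W_ecs.T)          # dL/dW_avse (chain rule)
    W_ecs -= lr * g_ecs
    W_avse -= lr * g_avse

loss_after = mse(W_avse, W_ecs)
```

The key point the sketch captures is that the front-end is not optimized for a generic enhancement criterion but for the downstream coding error, which is the motivation for joint training over separately training the two stages.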
Journal: JASA Express Letters
DOI: 10.1121/10.0042198