Deep Learning-Based Speech Enhancement With a Loss Trading Off the Speech Distortion and the Noise Residue for Cochlear Implants

The cochlea plays a key role in the transmission from acoustic vibration to neural stimulation upon which the brain perceives the sound. A cochlear implant (CI) is an auditory prosthesis to replace the damaged cochlear hair cells to achieve acoustic-to-neural conversion. However, the CI is a very co...

Bibliographic Details
Main authors: Yuyong Kang, Nengheng Zheng, Qinglin Meng
Format: article
Language: EN
Published: Frontiers Media S.A. 2021
Subjects: cochlear implant; speech enhancement; perceptual property; deep learning; loss function; Medicine (General); R5-920
Online access: https://doaj.org/article/f17011ae9f554ea9affee59b27518da1
id oai:doaj.org-article:f17011ae9f554ea9affee59b27518da1
record_format dspace
spelling oai:doaj.org-article:f17011ae9f554ea9affee59b27518da1
2021-11-08T04:58:05Z
Deep Learning-Based Speech Enhancement With a Loss Trading Off the Speech Distortion and the Noise Residue for Cochlear Implants
2296-858X
10.3389/fmed.2021.740123
https://doaj.org/article/f17011ae9f554ea9affee59b27518da1
2021-11-01T00:00:00Z
https://www.frontiersin.org/articles/10.3389/fmed.2021.740123/full
https://doaj.org/toc/2296-858X
Yuyong Kang
Nengheng Zheng
Qinglin Meng
Frontiers Media S.A.
article
cochlear implant
speech enhancement
perceptual property
deep learning
loss function
Medicine (General)
R5-920
EN
Frontiers in Medicine, Vol 8 (2021)
institution DOAJ
collection DOAJ
language EN
topic cochlear implant
speech enhancement
perceptual property
deep learning
loss function
Medicine (General)
R5-920
spellingShingle cochlear implant
speech enhancement
perceptual property
deep learning
loss function
Medicine (General)
R5-920
Yuyong Kang
Nengheng Zheng
Qinglin Meng
Deep Learning-Based Speech Enhancement With a Loss Trading Off the Speech Distortion and the Noise Residue for Cochlear Implants
description The cochlea plays a key role in converting acoustic vibration into the neural stimulation from which the brain perceives sound. A cochlear implant (CI) is an auditory prosthesis that replaces damaged cochlear hair cells to achieve acoustic-to-neural conversion. However, the CI is a very coarse bionic imitation of the normal cochlea. The highly resolved time-frequency-intensity information transmitted by the normal cochlea, which is vital to high-quality auditory perception such as speech perception in challenging environments, cannot be guaranteed by CIs. Although CI recipients with state-of-the-art commercial CI devices achieve good speech perception in quiet backgrounds, they usually suffer from poor speech perception in noisy environments. Therefore, noise suppression or speech enhancement (SE) is one of the most important technologies for CIs. In this study, we introduce recent progress in deep learning (DL)-based, mostly neural network (NN)-based, SE front ends for CIs, and discuss how the hearing properties of CI recipients could be utilized to optimize DL-based SE. In particular, different loss functions are introduced to supervise the NN training, and a set of objective and subjective experiments is presented. The results verify that CI recipients are more sensitive to residual noise than to SE-induced speech distortion, which has been common knowledge in CI research. Furthermore, speech reception threshold (SRT) tests in noise demonstrate that the intelligibility of the denoised speech can be significantly improved when the NN is trained with a loss function biased toward more noise suppression rather than one that pays equal attention to noise residue and speech distortion.
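
The description mentions a loss that trades off speech distortion against noise residue, but this record does not give its formula. The following is only a minimal sketch of one common way such a trade-off can be written, assuming a mask-based enhancer applied to magnitude spectra. The function names, the weight `alpha`, and the squared-error form are all assumptions made for illustration and are not taken from the article.

```python
import numpy as np


def loss_components(mask, speech_mag, noise_mag):
    """Split the error of a masked signal into two terms (assumed form).

    distortion: how much the mask alters the clean speech component.
    residue:    how much noise energy the mask lets through.
    """
    distortion = np.mean((mask * speech_mag - speech_mag) ** 2)
    residue = np.mean((mask * noise_mag) ** 2)
    return distortion, residue


def tradeoff_loss(mask, speech_mag, noise_mag, alpha=0.5):
    """Weighted sum of the two terms (hypothetical weighting scheme).

    alpha = 0.5 gives equal attention to both terms;
    alpha < 0.5 biases training toward more noise suppression.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    distortion, residue = loss_components(mask, speech_mag, noise_mag)
    return alpha * distortion + (1.0 - alpha) * residue


# Toy magnitude spectra (frames x frequency bins); values are synthetic.
rng = np.random.default_rng(0)
speech = rng.random((4, 8))
noise = rng.random((4, 8))
mask = speech / (speech + noise + 1e-8)  # simple ratio mask as a stand-in

print("balanced loss:    ", tradeoff_loss(mask, speech, noise, alpha=0.5))
print("noise-biased loss:", tradeoff_loss(mask, speech, noise, alpha=0.3))
```

In a real training setup the mask would come from the network output and the same expression would be written in the framework's differentiable operations; the sketch only shows how a single weight can shift attention between the two error terms.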
format article
author Yuyong Kang
Nengheng Zheng
Qinglin Meng
author_facet Yuyong Kang
Nengheng Zheng
Qinglin Meng
author_sort Yuyong Kang
title Deep Learning-Based Speech Enhancement With a Loss Trading Off the Speech Distortion and the Noise Residue for Cochlear Implants
title_short Deep Learning-Based Speech Enhancement With a Loss Trading Off the Speech Distortion and the Noise Residue for Cochlear Implants
title_full Deep Learning-Based Speech Enhancement With a Loss Trading Off the Speech Distortion and the Noise Residue for Cochlear Implants
title_fullStr Deep Learning-Based Speech Enhancement With a Loss Trading Off the Speech Distortion and the Noise Residue for Cochlear Implants
title_full_unstemmed Deep Learning-Based Speech Enhancement With a Loss Trading Off the Speech Distortion and the Noise Residue for Cochlear Implants
title_sort deep learning-based speech enhancement with a loss trading off the speech distortion and the noise residue for cochlear implants
publisher Frontiers Media S.A.
publishDate 2021
url https://doaj.org/article/f17011ae9f554ea9affee59b27518da1
work_keys_str_mv AT yuyongkang deeplearningbasedspeechenhancementwithalosstradingoffthespeechdistortionandthenoiseresidueforcochlearimplants
AT nenghengzheng deeplearningbasedspeechenhancementwithalosstradingoffthespeechdistortionandthenoiseresidueforcochlearimplants
AT qinglinmeng deeplearningbasedspeechenhancementwithalosstradingoffthespeechdistortionandthenoiseresidueforcochlearimplants
_version_ 1718443056926031872