A Speech Preprocessing Method Based on Perceptually Optimized Envelope Processing to Increase Intelligibility in Reverberant Environments

Speech intelligibility in public places can be degraded by environmental noise and reverberation. This study proposes a new near-end listening enhancement (NELE) approach in which a time-varying filter jointly enhances speech onsets and reduces overlap masking. For optimization, so...

Bibliographic Details
Main authors: Ali Fallah, Steven van de Par
Format: article
Language: EN
Published: MDPI AG 2021
Subjects:
T
Online access: https://doaj.org/article/ec31e9627a274e7796549d532c088217
id oai:doaj.org-article:ec31e9627a274e7796549d532c088217
record_format dspace
spelling oai:doaj.org-article:ec31e9627a274e7796549d532c088217
2021-11-25T16:38:07Z
10.3390/app112210788
2076-3417
https://doaj.org/article/ec31e9627a274e7796549d532c088217
2021-11-01T00:00:00Z
https://www.mdpi.com/2076-3417/11/22/10788
https://doaj.org/toc/2076-3417
Applied Sciences, Vol 11, Iss 22, p 10788 (2021)
institution DOAJ
collection DOAJ
language EN
topic speech enhancement
NELE
reverberation
speech intelligibility
optimization
Technology
T
Engineering (General). Civil engineering (General)
TA1-2040
Biology (General)
QH301-705.5
Physics
QC1-999
Chemistry
QD1-999
description Speech intelligibility in public places can be degraded by environmental noise and reverberation. This study proposes a new near-end listening enhancement (NELE) approach in which a time-varying filter jointly enhances speech onsets and reduces overlap masking. The optimization requires some look-ahead in the clean speech and prior knowledge of the room impulse response (RIR). The method minimizes a defined cost function so that the spectro-temporal envelope of the reverberant speech is as close as possible to that of the clean speech; in this cost function, speech onsets receive increased weight. This differs from the overlap-masking reduction (OMR) and onset enhancement (OE) approaches (Grosse, van de Par, 2017, J. Audio Eng. Soc., Vol. 65 (1/2), pp. 31–41), which consider only previous frames in each time slot when determining the time-variant filtering. Speech reception threshold (SRT) measurements show that the new optimization framework improves speech intelligibility by up to 2 dB more than OE.
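The idea in the description (time-varying gains chosen so that the reverberant envelope matches the clean envelope, with extra weight on onsets) can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the article: a single frequency band, power-domain envelopes, a made-up exponentially decaying RIR envelope `h`, an onset weight of the form `1 + alpha * rise`, and plain projected gradient descent. The helper names `reverberate`, `onset_weights`, `cost` and `optimize_gains` are hypothetical. Optimizing over the whole envelope at once stands in for the look-ahead mentioned in the description.

```python
import numpy as np

def reverberate(x, h):
    """Causal convolution of an envelope with the RIR envelope, cut to len(x)."""
    return np.convolve(x, h)[: len(x)]

def onset_weights(c, alpha=2.0):
    """Weight 1 everywhere, plus up to `alpha` where the clean envelope rises."""
    rise = np.maximum(np.diff(c, prepend=c[0]), 0.0)
    w = np.ones_like(c, dtype=float)
    if rise.max() > 0:
        w += alpha * rise / rise.max()
    return w

def cost(p, c, h, w):
    """Onset-weighted squared distance between reverberant and clean envelope."""
    r = reverberate(p * c, h)
    return float(np.sum(w * (r - c) ** 2))

def optimize_gains(c, h, w, n_iter=500, p_max=4.0):
    """Projected gradient descent on per-frame power gains p in [0, p_max]."""
    p = np.ones_like(c, dtype=float)  # start from unprocessed speech
    # Step 1/L, with L an upper bound on the Lipschitz constant of the gradient.
    step = 1.0 / (2.0 * w.max() * np.sum(np.abs(h)) ** 2 * c.max() ** 2)
    for _ in range(n_iter):
        resid = reverberate(p * c, h) - c
        # Adjoint of the truncated convolution (correlation with h).
        back = np.convolve((2.0 * w * resid)[::-1], h)[: len(c)][::-1]
        p = np.clip(p - step * c * back, 0.0, p_max)
    return p

# Toy clean envelope: two bursts, the second partly masked by the first one's tail.
frames = 60
c = np.zeros(frames)
c[5:15] = 1.0
c[20:30] = 0.6
h = np.exp(-np.arange(20) / 6.0)  # assumed decay, not a measured RIR

w = onset_weights(c)
p = optimize_gains(c, h, w)
print("cost unprocessed:", cost(np.ones(frames), c, h, w))
print("cost optimized:  ", cost(p, c, h, w))
```

Because the step size is bounded by the inverse Lipschitz constant, each iteration cannot increase the cost. A perceptual model, multiple bands, and an output power constraint are deliberately left out of this sketch.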
format article
author Ali Fallah
Steven van de Par
title A Speech Preprocessing Method Based on Perceptually Optimized Envelope Processing to Increase Intelligibility in Reverberant Environments
publisher MDPI AG
publishDate 2021
url https://doaj.org/article/ec31e9627a274e7796549d532c088217