Multichannel Speech Enhancement in Vehicle Environment Based on Interchannel Attention Mechanism


Saved in:
Bibliographic Details
Main Authors: Xueli Shen, Zhenxing Liang, Shiyin Li, Yanji Jiang
Format: article
Language: EN
Published: Hindawi-Wiley 2021
Subjects:
Online Access: https://doaj.org/article/e7c29eef5ac44c39ac0fc2dfc3aeb7fd
Description
Summary: Speech enhancement in a vehicle environment remains a challenging task because of the complex noise. This paper presents a feature extraction method that applies an interchannel attention mechanism frame by frame to learn spatial features directly from the multichannel speech waveforms. The spatial features of the individual signals learned through the proposed method are provided as input to a two-stage BiLSTM network, which is trained to perform adaptive spatial filtering as time-domain filters spanning the signal channels. The two-stage BiLSTM network is capable of extracting both local and global features and achieves competitive results. Using scenarios and data based on car-cockpit simulations, and in contrast to other methods that extract features from multichannel data, the results show that the proposed method delivers significant performance gains in terms of SDR, SI-SNR, PESQ, and STOI.
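The core idea described above (attention applied across channels, one time frame at a time) can be sketched with plain scaled dot-product attention over the channel axis. This is a minimal illustrative sketch, not the paper's implementation: the function name, the absence of learned query/key/value projections, and the choice of frame length are all assumptions made here for clarity.

```python
import numpy as np

def interchannel_attention(frames):
    """Frame-wise scaled dot-product attention across microphone channels.

    frames: (C, T) array -- one length-T time frame from each of C channels.
    Returns a (C, T) array of spatially re-weighted channel features.

    Hypothetical sketch: the actual model would insert trained linear
    projections before computing queries, keys, and values.
    """
    C, T = frames.shape
    # Channel-to-channel similarity scores, scaled for stability.
    scores = frames @ frames.T / np.sqrt(T)            # (C, C)
    scores -= scores.max(axis=1, keepdims=True)        # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)      # softmax over channels
    # Each output channel is an attention-weighted mix of all input channels.
    return weights @ frames                            # (C, T)

# Example: a 4-microphone array with 256-sample frames.
rng = np.random.default_rng(0)
frame = rng.standard_normal((4, 256))
features = interchannel_attention(frame)
print(features.shape)  # (4, 256)
```

In the paper's pipeline, the resulting per-frame spatial features would then be fed to the two-stage BiLSTM, which performs the adaptive spatial filtering in the time domain.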