A high-precision feature extraction network of fatigue speech from air traffic controller radiotelephony based on improved deep learning

Air traffic controller (ATC) fatigue has received considerable attention in recent studies because it is a major cause of air traffic incidents. Research has shown that fatigue can be detected by analysing speech utterances. However, constructing a complete labelled fatig...

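Bibliographic Details
Main authors: Zhiyuan Shen, Yitao Wei
Format: article
Language: EN
Published: Elsevier 2021
Subjects: Air traffic control, Fatigue, SSAE, Active learning, Dense block, Spectrogram, Information technology, T58.5-58.64
Online access: https://doaj.org/article/21a35614d4a84ad8b009ec0143e99a47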
id oai:doaj.org-article:21a35614d4a84ad8b009ec0143e99a47
record_format dspace
spelling oai:doaj.org-article:21a35614d4a84ad8b009ec0143e99a47; 2021-11-30T04:16:33Z; A high-precision feature extraction network of fatigue speech from air traffic controller radiotelephony based on improved deep learning; 2405-9595; 10.1016/j.icte.2021.01.002; https://doaj.org/article/21a35614d4a84ad8b009ec0143e99a47; 2021-12-01T00:00:00Z; http://www.sciencedirect.com/science/article/pii/S2405959521000023; https://doaj.org/toc/2405-9595; Zhiyuan Shen; Yitao Wei; Elsevier; article; Air traffic control; Fatigue; SSAE; Active learning; Dense block; Spectrogram; Information technology; T58.5-58.64; EN; ICT Express, Vol 7, Iss 4, Pp 403-413 (2021)
institution DOAJ
collection DOAJ
language EN
topic Air traffic control
Fatigue
SSAE
Active learning
Dense block
Spectrogram
Information technology
T58.5-58.64
description Air traffic controller (ATC) fatigue has received considerable attention in recent studies because it is a major cause of air traffic incidents. Research has shown that fatigue can be detected by analysing speech utterances. However, constructing a complete labelled fatigue data set is very time-consuming. Moreover, a manually constructed speech collection often contains too little key information to be used effectively in fatigue recognition, while multilevel deep models based on such speech materials often overfit because of an explosive increase in model parameters. To address these problems, this study proposes a novel deep learning framework that integrates active learning (AL) into complex speech features selected from a large set of unlabelled speech data in order to overcome the loss of information. A shallow feature set is first extracted using stacked sparse autoencoder (SSAE) networks, in which fatigue state challenge features from a manually selected speaker set are used as the input vector. A densely connected convolutional autoencoder (DCAE) is then proposed to learn advanced features automatically from spectrograms of the selected data to supplement the fatigue features. With the help of AL sampling strategies, the network can be trained effectively using a relatively small number of labelled samples, and adding a dense block to the convolutional autoencoder reduces the number of parameters and makes the model easier to fit. Finally, the two feature sets are combined using multiple kernel learning with a support-vector-machine classifier. A series of comparative experiments on the Civil Aviation Administration of China radiotelephony corpus demonstrates that the proposed method significantly improves detection precision compared with current state-of-the-art approaches.
format article
author Zhiyuan Shen
Yitao Wei
author_facet Zhiyuan Shen
Yitao Wei
author_sort Zhiyuan Shen
title A high-precision feature extraction network of fatigue speech from air traffic controller radiotelephony based on improved deep learning
publisher Elsevier
publishDate 2021
url https://doaj.org/article/21a35614d4a84ad8b009ec0143e99a47
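
Note on the AL sampling step: the description says the network is trained on a relatively small number of labelled samples with the help of AL sampling strategies, but this record does not say which strategy the authors use. The sketch below shows one common choice, entropy-based uncertainty sampling, purely as an illustration and not as the authors' method. The function names, the two-class layout (alert, fatigued) and the probability values are all hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_for_labelling(predicted_probs, budget):
    """Return indices of the `budget` most uncertain unlabelled samples.

    Uncertainty is measured by prediction entropy; this is an assumed
    strategy, since the record does not name the one used in the paper.
    """
    ranked = sorted(
        range(len(predicted_probs)),
        key=lambda i: entropy(predicted_probs[i]),
        reverse=True,
    )
    return ranked[:budget]


# Hypothetical model outputs [P(alert), P(fatigued)] for four unlabelled utterances.
pool = [[0.95, 0.05], [0.55, 0.45], [0.10, 0.90], [0.50, 0.50]]

# Ask a human annotator to label only the two least certain utterances.
chosen = select_for_labelling(pool, budget=2)
print(chosen)  # [3, 1]
```

In a full active-learning loop, the chosen utterances would be labelled, added to the training set, and the model retrained before the next selection round.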