Localizing category-related information in speech with multi-scale analyses.

Measurements of the physical outputs of speech (vocal tract geometry and acoustic energy) are high-dimensional, but linguistic theories posit a low-dimensional set of categories such as phonemes and phrase types. How can it be determined when and where in high-dimensional articulatory and acoustic sig...

Bibliographic Details
Main authors: Sam Tilsen, Seung-Eun Kim, Claire Wang
Format: article
Language: EN
Published: Public Library of Science (PLoS) 2021
Subjects:
R
Q
Online access: https://doaj.org/article/f6886a041f6c4acc9adc938b3350f949
id oai:doaj.org-article:f6886a041f6c4acc9adc938b3350f949
record_format dspace
spelling oai:doaj.org-article:f6886a041f6c4acc9adc938b3350f949
2021-12-02T20:17:23Z
Localizing category-related information in speech with multi-scale analyses.
1932-6203
10.1371/journal.pone.0258178
https://doaj.org/article/f6886a041f6c4acc9adc938b3350f949
2021-01-01T00:00:00Z
https://doi.org/10.1371/journal.pone.0258178
https://doaj.org/toc/1932-6203
Measurements of the physical outputs of speech (vocal tract geometry and acoustic energy) are high-dimensional, but linguistic theories posit a low-dimensional set of categories such as phonemes and phrase types. How can it be determined when and where in high-dimensional articulatory and acoustic signals there is information related to theoretical categories? For a variety of reasons, it is problematic to directly quantify mutual information between hypothesized categories and signals. To address this issue, a multi-scale analysis method is proposed for localizing category-related information in an ensemble of speech signals using machine learning algorithms. By analyzing how classification accuracy on unseen data varies as the temporal extent of training input is systematically restricted, inferences can be drawn regarding the temporal distribution of category-related information. The method can also be used to investigate redundancy between subsets of signal dimensions. Two types of theoretical categories are examined in this paper: phonemic/gestural categories and syntactic relative clause categories. Moreover, two different machine learning algorithms were examined: linear discriminant analysis and neural networks with long short-term memory units. Both algorithms detected category-related information earlier and later in signals than would be expected given standard theoretical assumptions about when linguistic categories should influence speech. The neural network algorithm was able to identify category-related information to a greater extent than the discriminant analyses.
Sam Tilsen
Seung-Eun Kim
Claire Wang
Public Library of Science (PLoS)
article
Medicine
R
Science
Q
EN
PLoS ONE, Vol 16, Iss 10, p e0258178 (2021)
institution DOAJ
collection DOAJ
language EN
topic Medicine
R
Science
Q
description Measurements of the physical outputs of speech (vocal tract geometry and acoustic energy) are high-dimensional, but linguistic theories posit a low-dimensional set of categories such as phonemes and phrase types. How can it be determined when and where in high-dimensional articulatory and acoustic signals there is information related to theoretical categories? For a variety of reasons, it is problematic to directly quantify mutual information between hypothesized categories and signals. To address this issue, a multi-scale analysis method is proposed for localizing category-related information in an ensemble of speech signals using machine learning algorithms. By analyzing how classification accuracy on unseen data varies as the temporal extent of training input is systematically restricted, inferences can be drawn regarding the temporal distribution of category-related information. The method can also be used to investigate redundancy between subsets of signal dimensions. Two types of theoretical categories are examined in this paper: phonemic/gestural categories and syntactic relative clause categories. Moreover, two different machine learning algorithms were examined: linear discriminant analysis and neural networks with long short-term memory units. Both algorithms detected category-related information earlier and later in signals than would be expected given standard theoretical assumptions about when linguistic categories should influence speech. The neural network algorithm was able to identify category-related information to a greater extent than the discriminant analyses.
format article
author Sam Tilsen
Seung-Eun Kim
Claire Wang
title Localizing category-related information in speech with multi-scale analyses.
publisher Public Library of Science (PLoS)
publishDate 2021
url https://doaj.org/article/f6886a041f6c4acc9adc938b3350f949
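
The description field above says that category-related information can be localized by restricting the temporal extent of the classifier's input and tracking accuracy on unseen data. The following is a minimal sketch of that idea, not the authors' implementation. It assumes synthetic one-dimensional signals and swaps in a simple nearest-centroid classifier in place of the linear discriminant analysis and LSTM networks named in the record. All function names, window widths, and parameter values are hypothetical.

```python
import random

def make_ensemble(n_per_class=100, n_frames=40, informative=(15, 25),
                  shift=1.5, seed=0):
    """Synthetic signals: the two classes differ only in the `informative` frames."""
    rng = random.Random(seed)
    signals, labels = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            sig = [rng.gauss(0.0, 1.0) for _ in range(n_frames)]
            if label == 1:
                for t in range(*informative):
                    sig[t] += shift  # category effect, localized in time
            signals.append(sig)
            labels.append(label)
    return signals, labels

def centroid(rows):
    """Frame-wise mean of a list of equal-length signals."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def window_accuracy(train, test, start, stop):
    """Fit a nearest-centroid classifier on frames [start, stop); score on held-out data."""
    (xtr, ytr), (xte, yte) = train, test
    cut = lambda rows: [r[start:stop] for r in rows]
    c0 = centroid(cut([x for x, y in zip(xtr, ytr) if y == 0]))
    c1 = centroid(cut([x for x, y in zip(xtr, ytr) if y == 1]))
    correct = 0
    for x, y in zip(cut(xte), yte):
        d0 = sum((a - b) ** 2 for a, b in zip(x, c0))
        d1 = sum((a - b) ** 2 for a, b in zip(x, c1))
        correct += int((d1 < d0) == (y == 1))
    return correct / len(yte)

def multiscale_scan(train, test, n_frames, widths=(5, 10, 20)):
    """Held-out accuracy for non-overlapping windows at several temporal scales."""
    results = {}
    for w in widths:
        for start in range(0, n_frames - w + 1, w):
            results[(start, start + w)] = window_accuracy(train, test, start, start + w)
    return results

train = make_ensemble(seed=1)  # training ensemble
test = make_ensemble(seed=2)   # independent ensemble standing in for unseen data
scan = multiscale_scan(train, test, n_frames=40)
for (start, stop), acc in sorted(scan.items()):
    print(f"frames {start:2d}-{stop:2d}: accuracy {acc:.2f}")
```

In this toy setup, windows that overlap the informative frames should score well above chance, while windows outside them should stay near 50%. That contrast is the kind of evidence the description says is used to infer when category-related information is present.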