Identifying indicator species in ecological habitats using Deep Optimal Feature Learning.

Much of the current research on supervised modelling is focused on maximizing outcome prediction accuracy. However, in engineering disciplines, an arguably more important goal is that of feature extraction, the identification of relevant features associated with the various outcomes. For instance, i...

Bibliographic Details
Main Authors: Yiting Tsai, Susan A Baldwin, Bhushan Gopaluni
Format: article
Language: EN
Published: Public Library of Science (PLoS) 2021
Subjects:
R
Q
Online Access: https://doaj.org/article/f505eb73dbce472593e0549e10c3d90d
id oai:doaj.org-article:f505eb73dbce472593e0549e10c3d90d
record_format dspace
spelling oai:doaj.org-article:f505eb73dbce472593e0549e10c3d90d
2021-12-02T20:14:43Z
Identifying indicator species in ecological habitats using Deep Optimal Feature Learning.
1932-6203
10.1371/journal.pone.0256782
https://doaj.org/article/f505eb73dbce472593e0549e10c3d90d
2021-01-01T00:00:00Z
https://doi.org/10.1371/journal.pone.0256782
https://doaj.org/toc/1932-6203
Yiting Tsai
Susan A Baldwin
Bhushan Gopaluni
Public Library of Science (PLoS)
article
Medicine
R
Science
Q
EN
PLoS ONE, Vol 16, Iss 9, p e0256782 (2021)
institution DOAJ
collection DOAJ
language EN
topic Medicine
R
Science
Q
description Much of the current research on supervised modelling is focused on maximizing outcome prediction accuracy. However, in engineering disciplines, an arguably more important goal is that of feature extraction, the identification of relevant features associated with the various outcomes. For instance, in microbial communities, the identification of keystone species can often lead to improved prediction of future behavioral shifts. This paper proposes a novel feature extractor based on Deep Learning, which is largely agnostic to underlying assumptions regarding the training data. Starting from a collection of microbial species abundance counts, the Deep Learning model first trains itself to classify the selected distinct habitats. It then identifies indicator species associated with the habitats. The results are then compared and contrasted with those obtained by traditional statistical techniques. The indicator species are similar when compared at top taxonomic levels such as Domain and Phylum, despite visible differences in lower levels such as Class and Order. More importantly, when our estimated indicators are used to predict final habitat labels using simpler models (such as Support Vector Machines and traditional Artificial Neural Networks), the prediction accuracy is improved. Overall, this study serves as a preliminary step that bridges modern, black-box Machine Learning models with traditional, domain expertise-rich techniques.
format article
author Yiting Tsai
Susan A Baldwin
Bhushan Gopaluni
author_sort Yiting Tsai
title Identifying indicator species in ecological habitats using Deep Optimal Feature Learning.
title_sort identifying indicator species in ecological habitats using deep optimal feature learning.
publisher Public Library of Science (PLoS)
publishDate 2021
url https://doaj.org/article/f505eb73dbce472593e0549e10c3d90d