The Dark mAtteR iNvestigator (DARN) tool: getting to know the known unknowns in COI amplicon data

The mitochondrial cytochrome C oxidase subunit I gene (COI) is commonly used in environmental DNA (eDNA) metabarcoding studies, especially for assessing metazoan diversity. Yet, a great number of COI operational taxonomic units (OTUs) or/and amplicon sequence variants (ASVs) retrieved from such stud...

Bibliographic Details
Main authors: Haris Zafeiropoulos, Laura Gargan, Sanni Hintikka, Christina Pavloudi, Jens Carlsson
Format: article
Language: EN
Published: Pensoft Publishers 2021
Subjects: Ecology
Online access: https://doaj.org/article/9c35018de6ff4c008eb7429e98131129
id oai:doaj.org-article:9c35018de6ff4c008eb7429e98131129
record_format dspace
spelling oai:doaj.org-article:9c35018de6ff4c008eb7429e98131129
2021-11-05T04:30:28Z
The Dark mAtteR iNvestigator (DARN) tool: getting to know the known unknowns in COI amplicon data
10.3897/mbmg.5.69657
2534-9708
https://doaj.org/article/9c35018de6ff4c008eb7429e98131129
2021-11-01T00:00:00Z
https://mbmg.pensoft.net/article/69657/download/pdf/
https://mbmg.pensoft.net/article/69657/download/xml/
https://mbmg.pensoft.net/article/69657/
https://doaj.org/toc/2534-9708
Haris Zafeiropoulos
Laura Gargan
Sanni Hintikka
Christina Pavloudi
Jens Carlsson
Pensoft Publishers
article
Ecology
QH540-549.5
EN
Metabarcoding and Metagenomics, Vol 5, Pp 163-174 (2021)
institution DOAJ
collection DOAJ
language EN
topic Ecology
QH540-549.5
description The mitochondrial cytochrome C oxidase subunit I gene (COI) is commonly used in environmental DNA (eDNA) metabarcoding studies, especially for assessing metazoan diversity. Yet, a great number of COI operational taxonomic units (OTUs) or/and amplicon sequence variants (ASVs) retrieved from such studies do not get a taxonomic assignment with a reference sequence. To assess and investigate such sequences, we have developed the Dark mAtteR iNvestigator (DARN) software tool. For this purpose, a reference COI-oriented phylogenetic tree was built from 1,593 consensus sequences covering all three domains of life. With respect to eukaryotes, consensus sequences at the family level were constructed from 183,330 sequences retrieved from the MIDORI Reference 2 database, which represented 70% of the initial number of reference sequences. Similarly, sequences from 431 bacterial and 15 archaeal taxa at the family level (29% and 1% of the initial number of reference sequences, respectively) were retrieved from the BOLD and the Pfam databases. DARN makes use of this phylogenetic tree to investigate pre-processed COI sequences of amplicon samples and to provide both a tabular and a graphical overview of their phylogenetic assignments. To evaluate DARN, both environmental and bulk metabarcoding samples from different aquatic environments using various primer sets were analysed. We demonstrate that a large proportion of non-target prokaryotic organisms, such as bacteria and archaea, are also amplified in eDNA samples, and we suggest that prokaryotic COI sequences be included in the reference databases used for taxonomy assignment to allow for further analyses of dark matter. DARN source code is available on GitHub at https://github.com/hariszaf/darn and as a Docker image at https://hub.docker.com/r/hariszaf/darn.
format article
author Haris Zafeiropoulos
Laura Gargan
Sanni Hintikka
Christina Pavloudi
Jens Carlsson
title The Dark mAtteR iNvestigator (DARN) tool: getting to know the known unknowns in COI amplicon data
publisher Pensoft Publishers
publishDate 2021
url https://doaj.org/article/9c35018de6ff4c008eb7429e98131129