cual-id: Globally Unique, Correctable, and Human-Friendly Sample Identifiers for Comparative Omics Studies

ABSTRACT The number of samples in high-throughput comparative “omics” studies is increasing rapidly due to declining experimental costs. To keep sample data and metadata manageable and to ensure the integrity of scientific results as the scale of these projects continues to increase, it is essential...

Bibliographic Details
Main authors: John H. Chase, Evan Bolyen, Jai Ram Rideout, J. Gregory Caporaso
Format: article
Language: EN
Published: American Society for Microbiology 2016
Subjects:
Online access: https://doaj.org/article/1a48cff42ca742b887f3818ea08abf6c
id oai:doaj.org-article:1a48cff42ca742b887f3818ea08abf6c
record_format dspace
spelling oai:doaj.org-article:1a48cff42ca742b887f3818ea08abf6c
2021-12-02T19:45:29Z
cual-id: Globally Unique, Correctable, and Human-Friendly Sample Identifiers for Comparative Omics Studies
10.1128/mSystems.00010-15
2379-5077
https://doaj.org/article/1a48cff42ca742b887f3818ea08abf6c
2016-02-01T00:00:00Z
https://journals.asm.org/doi/10.1128/mSystems.00010-15
https://doaj.org/toc/2379-5077
John H. Chase
Evan Bolyen
Jai Ram Rideout
J. Gregory Caporaso
American Society for Microbiology
article
bioinformatics
microbiome
metagenome
metabolome
transcriptome
genomes
Microbiology
QR1-502
EN
mSystems, Vol 1, Iss 1 (2016)
institution DOAJ
collection DOAJ
language EN
topic bioinformatics
microbiome
metagenome
metabolome
transcriptome
genomes
Microbiology
QR1-502
description ABSTRACT The number of samples in high-throughput comparative “omics” studies is increasing rapidly due to declining experimental costs. To keep sample data and metadata manageable and to ensure the integrity of scientific results as the scale of these projects continues to increase, it is essential that we transition to better-designed sample identifiers. Ideally, sample identifiers should be globally unique across projects, project teams, and institutions; short (to facilitate manual transcription); correctable with respect to common types of transcription errors; opaque, meaning that they do not contain information about the samples; and compatible with existing standards. We present cual-id, a lightweight command line tool that creates, or mints, sample identifiers that meet these criteria without reliance on centralized infrastructure. cual-id allows users to assign universally unique identifiers, or UUIDs, that are globally unique to their samples. UUIDs are too long to be conveniently written on sampling materials, such as swabs or microcentrifuge tubes, however, so cual-id additionally generates human-friendly 4- to 12-character identifiers that map to their UUIDs and are unique within a project. By convention, we use “cual-id” to refer to the software, “CualID” to refer to the short, human-friendly identifiers, and “UUID” to refer to the globally unique identifiers. CualIDs are used by humans when they manually write or enter identifiers, while the longer UUIDs are used by computers to unambiguously reference a sample. Finally, cual-id optionally generates printable label sticker sheets containing Code 128 bar codes and CualIDs for labeling of sample collection and processing materials.
IMPORTANCE The adoption of identifiers that are globally unique, correctable, and easily handwritten or manually entered into a computer will be a major step forward for sample tracking in comparative omics studies. As the fields transition to more-centralized sample management, for example, across labs within an institution, across projects funded under a common program, or in systems designed to facilitate meta- and/or integrated analysis, sample identifiers generated with cual-id will not need to change; thus, costly and error-prone updating of data and metadata identifiers will be avoided. Further, using cual-id will ensure that transcription errors in sample identifiers do not require the discarding of otherwise-useful samples that may have been expensive to obtain. Finally, cual-id is simple to install and use and is free for all use. No centralized infrastructure is required to ensure global uniqueness, so it is feasible for any lab to get started using these identifiers within their existing infrastructure.
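The pairing of a long UUID with a short, project-unique identifier described in the abstract can be illustrated with a minimal Python sketch. This is not cual-id's own code or command line interface. The function names `mint` and `correct`, the choice to take the short identifier from the tail of the UUID's hexadecimal form, and the one-substitution correction rule are all assumptions made for illustration; the abstract does not specify how cual-id derives or corrects CualIDs.

```python
import uuid


def mint(n, length=8, existing=None):
    """Mint n (short_id, uuid) pairs; short IDs are unique within one project.

    Assumption: the short ID is the last `length` hex characters of the UUID.
    """
    if not 4 <= length <= 12:
        raise ValueError("length must be between 4 and 12")
    project = dict(existing or {})  # short IDs already used in this project
    minted = []
    while len(minted) < n:
        full = uuid.uuid4()            # random, globally unique identifier
        short = full.hex[-length:]
        if short in project:           # collision inside the project: retry
            continue
        project[short] = full
        minted.append((short, full))
    return minted


def correct(candidate, known_ids, max_mismatches=1):
    """Map a hand-entered ID to the one known ID it most plausibly means.

    Hypothetical rule: accept an exact match, otherwise accept a known ID of
    the same length that differs in at most `max_mismatches` positions, but
    only if exactly one such ID exists. Returns None when ambiguous or unknown.
    """
    candidate = candidate.strip().lower()
    if candidate in known_ids:
        return candidate
    hits = [
        known for known in known_ids
        if len(known) == len(candidate)
        and sum(a != b for a, b in zip(known, candidate)) <= max_mismatches
    ]
    return hits[0] if len(hits) == 1 else None


if __name__ == "__main__":
    for short, full in mint(3, length=8):
        print(short, full)
```

In this sketch a person writes the short value on a tube while software stores the full UUID, and `correct` refuses to guess when a mistyped identifier is equally close to two known identifiers.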
format article
author John H. Chase
Evan Bolyen
Jai Ram Rideout
J. Gregory Caporaso
author_sort John H. Chase
title cual-id: Globally Unique, Correctable, and Human-Friendly Sample Identifiers for Comparative Omics Studies
publisher American Society for Microbiology
publishDate 2016
url https://doaj.org/article/1a48cff42ca742b887f3818ea08abf6c