An adaptive method of defining negative mutation status for multi-sample comparison using next-generation sequencing

Abstract Background Multi-sample comparison is commonly used in cancer genomics studies. By using next-generation sequencing (NGS), a mutation's status in a specific sample can be measured by the number of reads supporting mutant or wildtype alleles. When no mutant reads are detected, it could...

Bibliographic Details
Main authors: Nicholas Hutson, Fenglin Zhan, James Graham, Mitsuko Murakami, Han Zhang, Sujana Ganaparti, Qiang Hu, Li Yan, Changxing Ma, Song Liu, Jun Xie, Lei Wei
Format: article
Language: EN
Published: BMC 2021
Subjects:
Online access: https://doaj.org/article/6ec5c0fa598645aab0521a3ef24101c0
id oai:doaj.org-article:6ec5c0fa598645aab0521a3ef24101c0
record_format dspace
spelling oai:doaj.org-article:6ec5c0fa598645aab0521a3ef24101c0
2021-12-05T12:05:24Z
10.1186/s12920-021-00880-8
1755-8794
https://doaj.org/article/6ec5c0fa598645aab0521a3ef24101c0
2021-12-01T00:00:00Z
https://doi.org/10.1186/s12920-021-00880-8
https://doaj.org/toc/1755-8794
BMC Medical Genomics, Vol 14, Iss S2, Pp 1-10 (2021)
institution DOAJ
collection DOAJ
language EN
topic Negative status
Tumor heterogeneity
Liquid biopsy
Next-generation sequencing
Genetic testing
Personalized medicine
Internal medicine
RC31-1245
Genetics
QH426-470
description Abstract
Background: Multi-sample comparison is commonly used in cancer genomics studies. With next-generation sequencing (NGS), a mutation's status in a specific sample can be measured by the number of reads supporting the mutant or wildtype allele. When no mutant reads are detected, the result could represent either a true negative mutation status or a false negative caused by an insufficient number of reads, the so-called "coverage". To minimize the chance of false negatives, the mutation status should be considered "unknown" instead of "negative" when the coverage is inadequately low. There is no established method for determining the coverage threshold between negative and unknown statuses. A common solution is to apply a universal minimum coverage (UMC). However, this method relies on an arbitrarily chosen threshold, and it does not take into account the mutations' relative abundances, which can vary dramatically by mutation type. The result can be misclassification between negative and unknown statuses.
Methods: We propose an adaptive mutation-specific negative (MSN) method to improve the discrimination between negative and unknown mutation statuses. For a specific mutation, a non-positive sample is compared with every known positive sample to test the null hypothesis that the two samples contain the same frequency of mutant reads. The non-positive sample can be claimed as "negative" only when this null hypothesis is rejected against all known positive samples; otherwise, its status is "unknown".
Results: We first compared the performance of the MSN and UMC methods in a simulated dataset containing varying tumor cell fractions. Only the MSN method appropriately assigned negative statuses for samples with both high and low tumor cell fractions. When evaluated on a real dual-platform single-cell sequencing dataset, the MSN method not only provided more accurate assessments of negative statuses but also yielded three times more available data after excluding the "unknown" statuses, compared with the UMC method.
Conclusions: We developed a new adaptive method for distinguishing unknown from negative statuses in multi-sample comparison NGS data. The method can provide more accurate negative statuses than the conventional UMC method and generate a remarkably higher amount of available data by reducing unnecessary "unknown" calls.
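The decision rule described in the Methods can be illustrated with a minimal sketch. The abstract does not say which statistical test is used, so the one-sided Fisher's exact test below is an assumption, as are the significance level `alpha=0.05`, the UMC cutoff `min_cov=20`, and all function names. It is not the authors' implementation.

```python
from math import comb


def fisher_less_pvalue(mut_a, cov_a, mut_b, cov_b):
    """One-sided Fisher's exact p-value (an assumed test, not from the paper).

    Probability that sample A shows mut_a or fewer mutant reads out of cov_a,
    given the pooled read counts, if A and B share the same mutant frequency.
    """
    total_mut = mut_a + mut_b          # mutant reads in both samples
    total = cov_a + cov_b              # all reads in both samples
    lowest = max(0, total_mut - cov_b)  # smallest feasible mutant count in A
    favourable = sum(
        comb(total_mut, k) * comb(total - total_mut, cov_a - k)
        for k in range(lowest, mut_a + 1)
    )
    return favourable / comb(total, cov_a)


def msn_status(cov_sample, positives, mut_sample=0, alpha=0.05):
    """Mutation-specific rule: 'negative' only if H0 is rejected for every positive.

    positives is a list of (mutant_reads, coverage) pairs for known positive samples.
    """
    if not positives:
        return "unknown"  # nothing to compare against
    for mut_pos, cov_pos in positives:
        p = fisher_less_pvalue(mut_sample, cov_sample, mut_pos, cov_pos)
        if p >= alpha:
            return "unknown"  # cannot rule out the same mutant frequency
    return "negative"


def umc_status(cov_sample, min_cov=20):
    """Universal minimum coverage rule with a hypothetical fixed cutoff."""
    return "negative" if cov_sample >= min_cov else "unknown"


if __name__ == "__main__":
    # High-abundance mutation (50 of 100 reads in the positive sample):
    # 10 clean reads are already informative, yet UMC discards the call.
    print(msn_status(10, [(50, 100)]), umc_status(10))
    # Low-abundance mutation (5 of 100 reads): 30 clean reads are not enough,
    # yet UMC accepts the call.
    print(msn_status(30, [(5, 100)]), umc_status(30))
```

In this sketch the coverage needed for a "negative" call adapts to how abundant the mutation is in the positive samples, whereas the fixed cutoff treats every mutation alike.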
format article
author Nicholas Hutson
Fenglin Zhan
James Graham
Mitsuko Murakami
Han Zhang
Sujana Ganaparti
Qiang Hu
Li Yan
Changxing Ma
Song Liu
Jun Xie
Lei Wei
title An adaptive method of defining negative mutation status for multi-sample comparison using next-generation sequencing
publisher BMC
publishDate 2021
url https://doaj.org/article/6ec5c0fa598645aab0521a3ef24101c0