Eliminating accidental deviations to minimize generalization error and maximize replicability: Applications in connectomics and genomics.
Replicability, the ability to replicate scientific findings, is a prerequisite for scientific discovery and clinical utility. Troublingly, we are in the midst of a replicability crisis. A key to replicability is that multiple measurements of the same item (e.g., experimental sample or clinical parti...
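Main authors: | Eric W Bridgeford, Shangsi Wang, Zeyi Wang, Ting Xu, Cameron Craddock, Jayanta Dey, Gregory Kiar, William Gray-Roncal, Carlo Colantuoni, Christopher Douville, Stephanie Noble, Carey E Priebe, Brian Caffo, Michael Milham, Xi-Nian Zuo, Consortium for Reliability and Reproducibility, Joshua T Vogelstein |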
---|---|
Format: | article |
Language: | EN |
Published: |
Public Library of Science (PLoS)
2021
|
Subjects: | Biology (General) |
Online access: | https://doaj.org/article/752c42dcd2e3499aba6419fe3e8c74d5 |
id |
oai:doaj.org-article:752c42dcd2e3499aba6419fe3e8c74d5 |
---|---|
record_format |
dspace |
spelling |
oai:doaj.org-article:752c42dcd2e3499aba6419fe3e8c74d5
2021-12-02T19:57:47Z
Eliminating accidental deviations to minimize generalization error and maximize replicability: Applications in connectomics and genomics.
1553-734X
1553-7358
10.1371/journal.pcbi.1009279
https://doaj.org/article/752c42dcd2e3499aba6419fe3e8c74d5
2021-09-01T00:00:00Z
https://doi.org/10.1371/journal.pcbi.1009279
https://doaj.org/toc/1553-734X
https://doaj.org/toc/1553-7358
Replicability, the ability to replicate scientific findings, is a prerequisite for scientific discovery and clinical utility. Troublingly, we are in the midst of a replicability crisis. A key to replicability is that multiple measurements of the same item (e.g., experimental sample or clinical participant) under fixed experimental constraints are relatively similar to one another. Thus, statistics that quantify the relative contributions of accidental deviations (such as measurement error) as compared to systematic deviations (such as individual differences) are critical. We demonstrate that existing replicability statistics, such as the intra-class correlation coefficient and fingerprinting, fail to adequately differentiate between accidental and systematic deviations in very simple settings. We therefore propose a novel statistic, discriminability, which quantifies the degree to which an individual's samples are relatively similar to one another, without restricting the data to be univariate, Gaussian, or even Euclidean. Using this statistic, we introduce the possibility of optimizing experimental design via increasing discriminability and prove that optimizing discriminability improves performance bounds in subsequent inference tasks. In extensive simulated and real datasets (focusing on brain imaging and demonstrating on genomics), only optimizing data discriminability improves performance on all subsequent inference tasks for each dataset. We therefore suggest that designing experiments and analyses to optimize discriminability may be a crucial step in solving the replicability crisis, and more generally, mitigating accidental measurement error.
Eric W Bridgeford; Shangsi Wang; Zeyi Wang; Ting Xu; Cameron Craddock; Jayanta Dey; Gregory Kiar; William Gray-Roncal; Carlo Colantuoni; Christopher Douville; Stephanie Noble; Carey E Priebe; Brian Caffo; Michael Milham; Xi-Nian Zuo; Consortium for Reliability and Reproducibility; Joshua T Vogelstein
Public Library of Science (PLoS)
article
Biology (General)
QH301-705.5
EN
PLoS Computational Biology, Vol 17, Iss 9, p e1009279 (2021) |
institution |
DOAJ |
collection |
DOAJ |
language |
EN |
topic |
Biology (General) QH301-705.5 |
spellingShingle |
Biology (General); QH301-705.5; Eric W Bridgeford; Shangsi Wang; Zeyi Wang; Ting Xu; Cameron Craddock; Jayanta Dey; Gregory Kiar; William Gray-Roncal; Carlo Colantuoni; Christopher Douville; Stephanie Noble; Carey E Priebe; Brian Caffo; Michael Milham; Xi-Nian Zuo; Consortium for Reliability and Reproducibility; Joshua T Vogelstein; Eliminating accidental deviations to minimize generalization error and maximize replicability: Applications in connectomics and genomics. |
description |
Replicability, the ability to replicate scientific findings, is a prerequisite for scientific discovery and clinical utility. Troublingly, we are in the midst of a replicability crisis. A key to replicability is that multiple measurements of the same item (e.g., experimental sample or clinical participant) under fixed experimental constraints are relatively similar to one another. Thus, statistics that quantify the relative contributions of accidental deviations (such as measurement error) as compared to systematic deviations (such as individual differences) are critical. We demonstrate that existing replicability statistics, such as the intra-class correlation coefficient and fingerprinting, fail to adequately differentiate between accidental and systematic deviations in very simple settings. We therefore propose a novel statistic, discriminability, which quantifies the degree to which an individual's samples are relatively similar to one another, without restricting the data to be univariate, Gaussian, or even Euclidean. Using this statistic, we introduce the possibility of optimizing experimental design via increasing discriminability and prove that optimizing discriminability improves performance bounds in subsequent inference tasks. In extensive simulated and real datasets (focusing on brain imaging and demonstrating on genomics), only optimizing data discriminability improves performance on all subsequent inference tasks for each dataset. We therefore suggest that designing experiments and analyses to optimize discriminability may be a crucial step in solving the replicability crisis, and more generally, mitigating accidental measurement error. |
format |
article |
author |
Eric W Bridgeford; Shangsi Wang; Zeyi Wang; Ting Xu; Cameron Craddock; Jayanta Dey; Gregory Kiar; William Gray-Roncal; Carlo Colantuoni; Christopher Douville; Stephanie Noble; Carey E Priebe; Brian Caffo; Michael Milham; Xi-Nian Zuo; Consortium for Reliability and Reproducibility; Joshua T Vogelstein |
author_facet |
Eric W Bridgeford; Shangsi Wang; Zeyi Wang; Ting Xu; Cameron Craddock; Jayanta Dey; Gregory Kiar; William Gray-Roncal; Carlo Colantuoni; Christopher Douville; Stephanie Noble; Carey E Priebe; Brian Caffo; Michael Milham; Xi-Nian Zuo; Consortium for Reliability and Reproducibility; Joshua T Vogelstein |
author_sort |
Eric W Bridgeford |
title |
Eliminating accidental deviations to minimize generalization error and maximize replicability: Applications in connectomics and genomics. |
title_short |
Eliminating accidental deviations to minimize generalization error and maximize replicability: Applications in connectomics and genomics. |
title_full |
Eliminating accidental deviations to minimize generalization error and maximize replicability: Applications in connectomics and genomics. |
title_fullStr |
Eliminating accidental deviations to minimize generalization error and maximize replicability: Applications in connectomics and genomics. |
title_full_unstemmed |
Eliminating accidental deviations to minimize generalization error and maximize replicability: Applications in connectomics and genomics. |
title_sort |
eliminating accidental deviations to minimize generalization error and maximize replicability: applications in connectomics and genomics. |
publisher |
Public Library of Science (PLoS) |
publishDate |
2021 |
url |
https://doaj.org/article/752c42dcd2e3499aba6419fe3e8c74d5 |
_version_ |
1718375784681308160 |