Assessing the geographic specificity of pH prediction by classification and regression trees.

Soil pH affects a wide range of critical biogeochemical processes that dictate plant growth and diversity. Previous literature has established the capacity of classification and regression trees (CARTs) to predict soil pH, but limitations of CARTs in this context have not been fully explored. The cu...

Bibliographic Details
Main authors: Jacob Egelberg, Nina Pena, Rachel Rivera, Christina Andruk
Format: article
Language: EN
Published: Public Library of Science (PLoS) 2021
Subjects:
R
Q
Online access: https://doaj.org/article/dc5a9484cb8f4f3b928cfad97cd70d4b
id oai:doaj.org-article:dc5a9484cb8f4f3b928cfad97cd70d4b
record_format dspace
spelling oai:doaj.org-article:dc5a9484cb8f4f3b928cfad97cd70d4b; 2021-12-02T20:15:05Z; ISSN 1932-6203; DOI 10.1371/journal.pone.0255119; https://doaj.org/article/dc5a9484cb8f4f3b928cfad97cd70d4b; 2021-01-01T00:00:00Z; https://doi.org/10.1371/journal.pone.0255119; https://doaj.org/toc/1932-6203; PLoS ONE, Vol 16, Iss 8, p e0255119 (2021)
institution DOAJ
collection DOAJ
language EN
topic Medicine
R
Science
Q
description Soil pH affects a wide range of critical biogeochemical processes that dictate plant growth and diversity. Previous literature has established the capacity of classification and regression trees (CARTs) to predict soil pH, but limitations of CARTs in this context have not been fully explored. The current study collected soil pH, climatic, and topographic data from 100 locations across New York's Temperate Deciduous Forests (in the United States of America) to investigate the extrapolative capacity of a previously developed CART model as compared to novel CART and random forest (RF) models. Results showed that the previously developed CART underperformed in terms of predictive accuracy (RRMSE = 14.52%) when compared to a novel tree (RRMSE = 9.33%), and that a novel random forest outperformed both models (RRMSE = 8.88%), though its predictions did not differ significantly from the novel tree (p = 0.26). The most important predictors for model construction were climatic factors. These findings confirm existing reports that CART models are constrained by the spatial autocorrelation of geographic data and encourage the restricted application of relevant machine learning models to regions from which training data were collected. They also contradict previous literature implying that random forests should meaningfully boost the predictive accuracy of CARTs in the context of soil pH.
format article
author Jacob Egelberg
Nina Pena
Rachel Rivera
Christina Andruk
author_sort Jacob Egelberg
title Assessing the geographic specificity of pH prediction by classification and regression trees.
publisher Public Library of Science (PLoS)
publishDate 2021
url https://doaj.org/article/dc5a9484cb8f4f3b928cfad97cd70d4b
_version_ 1718374613875949568
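
The description above compares models by relative root mean square error (RRMSE). The record does not state which normalization the authors used, so the following is only a minimal sketch that assumes one common definition: RMSE divided by the mean of the observed values, expressed as a percentage. The function name `rrmse` and the sample pH values are hypothetical and are not data from the study.

```python
import math


def rrmse(observed, predicted):
    """Relative RMSE in percent.

    Assumption: RMSE is normalized by the mean of the observed values.
    Other normalizations (for example, by the observed range) also exist.
    """
    if not observed or len(observed) != len(predicted):
        raise ValueError("observed and predicted must be non-empty and equal in length")
    # Mean squared error between observed and predicted values.
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    mean_observed = sum(observed) / len(observed)
    return 100.0 * math.sqrt(mse) / mean_observed


# Made-up pH values for illustration only (not data from the study).
observed_ph = [4.0, 5.0, 6.0, 5.0]
predicted_ph = [4.5, 5.0, 5.5, 5.0]
print(f"RRMSE = {rrmse(observed_ph, predicted_ph):.2f}%")
```

Under this definition a lower value indicates predictions closer to the observations relative to the typical pH level, which matches how the description ranks the three models.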