Conservation machine learning: a case study of random forests

Abstract Conservation machine learning conserves models across runs, users, and experiments—and puts them to good use. We have previously shown the merit of this idea through a small-scale preliminary experiment, involving a single dataset source, 10 datasets, and a single so-called cultivation meth...

Bibliographic Details
Main authors: Moshe Sipper, Jason H. Moore
Format: article
Language: EN
Published: Nature Portfolio 2021
Subjects:
R
Q
Online access: https://doaj.org/article/b36e655ed8424d698572cc136b219b51
id oai:doaj.org-article:b36e655ed8424d698572cc136b219b51
record_format dspace
spelling oai:doaj.org-article:b36e655ed8424d698572cc136b219b51
2021-12-02T12:09:32Z
Conservation machine learning: a case study of random forests
10.1038/s41598-021-83247-4
2045-2322
https://doaj.org/article/b36e655ed8424d698572cc136b219b51
2021-02-01T00:00:00Z
https://doi.org/10.1038/s41598-021-83247-4
https://doaj.org/toc/2045-2322
Abstract Conservation machine learning conserves models across runs, users, and experiments—and puts them to good use. We have previously shown the merit of this idea through a small-scale preliminary experiment, involving a single dataset source, 10 datasets, and a single so-called cultivation method—used to produce the final ensemble. In this paper, focusing on classification tasks, we perform extensive experimentation with conservation random forests, involving 5 cultivation methods (including a novel one introduced herein—lexigarden), 6 dataset sources, and 31 datasets. We show that significant improvement can be attained by making use of models we are already in possession of anyway, and envisage the possibility of repositories of models (not merely datasets, solutions, or code), which could be made available to everyone, thus having conservation live up to its name, furthering the cause of data and computational science.
Moshe Sipper
Jason H. Moore
Nature Portfolio
article
Medicine
R
Science
Q
EN
Scientific Reports, Vol 11, Iss 1, Pp 1-6 (2021)
institution DOAJ
collection DOAJ
language EN
topic Medicine
R
Science
Q
spellingShingle Medicine
R
Science
Q
Moshe Sipper
Jason H. Moore
Conservation machine learning: a case study of random forests
description Abstract Conservation machine learning conserves models across runs, users, and experiments—and puts them to good use. We have previously shown the merit of this idea through a small-scale preliminary experiment, involving a single dataset source, 10 datasets, and a single so-called cultivation method—used to produce the final ensemble. In this paper, focusing on classification tasks, we perform extensive experimentation with conservation random forests, involving 5 cultivation methods (including a novel one introduced herein—lexigarden), 6 dataset sources, and 31 datasets. We show that significant improvement can be attained by making use of models we are already in possession of anyway, and envisage the possibility of repositories of models (not merely datasets, solutions, or code), which could be made available to everyone, thus having conservation live up to its name, furthering the cause of data and computational science.
format article
author Moshe Sipper
Jason H. Moore
author_facet Moshe Sipper
Jason H. Moore
author_sort Moshe Sipper
title Conservation machine learning: a case study of random forests
title_sort conservation machine learning: a case study of random forests
publisher Nature Portfolio
publishDate 2021
url https://doaj.org/article/b36e655ed8424d698572cc136b219b51
work_keys_str_mv AT moshesipper conservationmachinelearningacasestudyofrandomforests
AT jasonhmoore conservationmachinelearningacasestudyofrandomforests
_version_ 1718394651294040064