Importance of Spatial Autocorrelation in Machine Learning Modeling of Polymetallic Nodules, Model Uncertainty and Transferability at Local Scale

Machine learning spatial modeling is used for mapping the distribution of deep-sea polymetallic nodules (PMN). However, the presence and influence of spatial autocorrelation (SAC) have not been extensively studied. SAC can provide information regarding the variable selection before modeling, and it...

Bibliographic Details
Main authors: Iason-Zois Gazis, Jens Greinert
Format: article
Language: EN
Published: MDPI AG 2021
Subjects: polymetallic nodules; spatial autocorrelation; cross-validation; model transferability; Mineralogy; QE351-399.2
Online access: https://doaj.org/article/d897a978d4944298a824bf333972f4a3
id oai:doaj.org-article:d897a978d4944298a824bf333972f4a3
record_format dspace
spelling oai:doaj.org-article:d897a978d4944298a824bf333972f4a3
2021-11-25T18:25:58Z
Importance of Spatial Autocorrelation in Machine Learning Modeling of Polymetallic Nodules, Model Uncertainty and Transferability at Local Scale
10.3390/min11111172
2075-163X
https://doaj.org/article/d897a978d4944298a824bf333972f4a3
2021-10-01T00:00:00Z
https://www.mdpi.com/2075-163X/11/11/1172
https://doaj.org/toc/2075-163X
Iason-Zois Gazis
Jens Greinert
MDPI AG
article
polymetallic nodules
spatial autocorrelation
cross-validation
model transferability
Mineralogy
QE351-399.2
EN
Minerals, Vol 11, Iss 1172, p 1172 (2021)
institution DOAJ
collection DOAJ
language EN
topic polymetallic nodules
spatial autocorrelation
cross-validation
model transferability
Mineralogy
QE351-399.2
spellingShingle polymetallic nodules
spatial autocorrelation
cross-validation
model transferability
Mineralogy
QE351-399.2
Iason-Zois Gazis
Jens Greinert
Importance of Spatial Autocorrelation in Machine Learning Modeling of Polymetallic Nodules, Model Uncertainty and Transferability at Local Scale
description Machine learning (ML) spatial modeling is used for mapping the distribution of deep-sea polymetallic nodules (PMN). However, the presence and influence of spatial autocorrelation (SAC) have not been extensively studied. SAC can inform variable selection before modeling, and ignoring it results in erroneous validation performance. ML models are also problematic when applied in areas far away from the initial training locations, especially if the new area to be predicted covers another feature space. Here, we study the spatial distribution of PMN in a geomorphologically heterogeneous area of the Peru Basin, where SAC of PMN exists. The local Moran’s I analysis showed that there are areas with a significantly higher or lower number of PMN, associated with different backscatter values, aspect orientation, and seafloor geomorphological characteristics. A quantile regression forests (QRF) model is evaluated using three cross-validation (CV) techniques (random-, spatial-, and cluster-blocking). We used the recently proposed “Area of Applicability” method to quantify the geographical areas where feature space extrapolation occurs. The results show that QRF predicts well in morphologically similar areas, with spatial block cross-validation being the least biased method. Conversely, random-CV overestimates the prediction performance. Under new conditions, the model transferability is reduced even on local scales, highlighting the need for spatial model-based dissimilarity analysis and transferability assessment in new areas.
format article
author Iason-Zois Gazis
Jens Greinert
author_facet Iason-Zois Gazis
Jens Greinert
author_sort Iason-Zois Gazis
title Importance of Spatial Autocorrelation in Machine Learning Modeling of Polymetallic Nodules, Model Uncertainty and Transferability at Local Scale
title_short Importance of Spatial Autocorrelation in Machine Learning Modeling of Polymetallic Nodules, Model Uncertainty and Transferability at Local Scale
title_full Importance of Spatial Autocorrelation in Machine Learning Modeling of Polymetallic Nodules, Model Uncertainty and Transferability at Local Scale
title_fullStr Importance of Spatial Autocorrelation in Machine Learning Modeling of Polymetallic Nodules, Model Uncertainty and Transferability at Local Scale
title_full_unstemmed Importance of Spatial Autocorrelation in Machine Learning Modeling of Polymetallic Nodules, Model Uncertainty and Transferability at Local Scale
title_sort importance of spatial autocorrelation in machine learning modeling of polymetallic nodules, model uncertainty and transferability at local scale
publisher MDPI AG
publishDate 2021
url https://doaj.org/article/d897a978d4944298a824bf333972f4a3
work_keys_str_mv AT iasonzoisgazis importanceofspatialautocorrelationinmachinelearningmodelingofpolymetallicnodulesmodeluncertaintyandtransferabilityatlocalscale
AT jensgreinert importanceofspatialautocorrelationinmachinelearningmodelingofpolymetallicnodulesmodeluncertaintyandtransferabilityatlocalscale
_version_ 1718411162121404416
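
The description above names local Moran's I as the measure used to detect spatial autocorrelation. As a rough illustration of what that statistic computes, the following is a minimal sketch of the global and local Moran's I on a small regular grid. It assumes binary rook-contiguity weights (cells sharing an edge are neighbors); the record does not state which weights the authors used, and the permutation test needed to call a value "significant" is omitted. The function names rook_neighbors and morans_i and the two toy grids are hypothetical and are not taken from the article.

```python
from typing import Dict, List, Tuple

Cell = Tuple[int, int]


def rook_neighbors(rows: int, cols: int) -> Dict[Cell, List[Cell]]:
    """Binary rook-contiguity weights: cells that share an edge are neighbors."""
    nbrs: Dict[Cell, List[Cell]] = {}
    for r in range(rows):
        for c in range(cols):
            candidates = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
            nbrs[(r, c)] = [
                (i, j) for i, j in candidates if 0 <= i < rows and 0 <= j < cols
            ]
    return nbrs


def morans_i(grid: List[List[float]]) -> Tuple[float, Dict[Cell, float]]:
    """Return (global Moran's I, local Moran's I per cell) for a rectangular grid."""
    rows, cols = len(grid), len(grid[0])
    nbrs = rook_neighbors(rows, cols)
    n = rows * cols
    mean = sum(sum(row) for row in grid) / n
    # Deviations from the mean for every cell.
    z = {(r, c): grid[r][c] - mean for r in range(rows) for c in range(cols)}
    ss = sum(v * v for v in z.values())  # sum of squared deviations
    if ss == 0:
        raise ValueError("Moran's I is undefined for a constant grid")
    w_total = sum(len(v) for v in nbrs.values())  # sum of all weights
    # Spatial lag: sum of neighbor deviations (all weights are 1 here).
    lag = {cell: sum(z[j] for j in nbrs[cell]) for cell in z}
    cross = sum(z[cell] * lag[cell] for cell in z)
    global_i = n * cross / (w_total * ss)
    m2 = ss / n  # second moment used to scale the local statistic
    local_i = {cell: z[cell] * lag[cell] / m2 for cell in z}
    return global_i, local_i


# Toy grids (illustrative values only, not nodule counts from the study).
clustered = [[1, 1, 0, 0] for _ in range(4)]          # high values grouped on the left
checkerboard = [[(r + c) % 2 for c in range(4)] for r in range(4)]

clustered_i, clustered_local = morans_i(clustered)
checker_i, checker_local = morans_i(checkerboard)
print("clustered grid, global I:", clustered_i)      # positive: similar values adjoin
print("checkerboard grid, global I:", checker_i)     # negative: unlike values adjoin
```

With these weights the local values sum to the global statistic multiplied by the total weight, so the local map can be read as a cell-by-cell breakdown of the global figure. Cells with a large positive local value sit inside a patch of similar values, which is the kind of pattern the description associates with areas of higher or lower nodule numbers.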