Research on filling methods of missing data in cultivated land quality evaluation

In the process of cultivated land quality data investigation and collection, there will be missing data due to human, environmental, and other factors. However, the current missing data-filling methods have insufficient applicability. In order to improve the cultivated land quality database and eval...

Bibliographic Details
Main Authors: CHEN Yu, ZHOU Wu, HU Yueming, XIE Jianwen
Format: article
Language: ZH
Published: Agro-Environmental Protection Institute, Ministry of Agriculture 2021
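Subjects: evaluation of cultivated land quality; missing; data; filling; conghua district; accuracy; Agriculture (General); Environmental sciences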
Online Access: https://doaj.org/article/349a7eaa1425435c846571df3a02b94a
id oai:doaj.org-article:349a7eaa1425435c846571df3a02b94a
record_format dspace
spelling oai:doaj.org-article:349a7eaa1425435c846571df3a02b94a; 2021-12-03T02:29:42Z; Research on filling methods of missing data in cultivated land quality evaluation; 2095-6819; 10.13254/j.jare.2021.0201; https://doaj.org/article/349a7eaa1425435c846571df3a02b94a; 2021-11-01T00:00:00Z; http://www.aed.org.cn/nyzyyhjxb/html/2021/6/20210620.htm; https://doaj.org/toc/2095-6819; CHEN Yu; ZHOU Wu; HU Yueming; XIE Jianwen; Agro-Environmental Protection Institute, Ministry of Agriculture; article; evaluation of cultivated land quality; missing; data; filling; conghua district; accuracy; Agriculture (General); S1-972; Environmental sciences; GE1-350; ZH; Journal of Agricultural Resources and Environment, Vol 38, Iss 6, Pp 1132-1141 (2021)
institution DOAJ
collection DOAJ
language ZH
topic evaluation of cultivated land quality
missing
data
filling
conghua district
accuracy
Agriculture (General)
S1-972
Environmental sciences
GE1-350
description When cultivated land quality data are investigated and collected, some values go missing because of human, environmental, and other factors. Current methods for filling missing data, however, are not sufficiently applicable. To improve the cultivated land quality database and the accuracy of evaluation, it is important to explore methods for filling missing data in cultivated land quality evaluation. In this study, the cultivated land quality database of Conghua District, Guangzhou City, was used as the sample set. According to spatial correlation and spatial distribution, the dataset was divided into a spatial correlation dataset and a non-spatial correlation dataset. Various filling methods were used to simulate the filling of missing data, and a cross method was used to verify the accuracy. The results indicated that the proportion of total outliers was less than 1.2%, and that 25 factors, such as elevation, temperature, and available zinc, showed spatial correlation. For the spatial correlation dataset, the four-image nearest neighbor algorithm gave the highest filling accuracy, reaching as high as 80% when the missing rate was less than 20%; the accuracy decreased as the missing rate increased. It was followed by the K-nearest neighbor algorithm (KNN), the expectation maximization algorithm, the multiple interpolation algorithm, and the regression model algorithm. The four-image nearest neighbor algorithm was more accurate than the K-nearest neighbor algorithm when the data were dense. For the non-spatial correlation dataset, the similar aggregation filling algorithm gave the highest filling accuracy, maintaining more than 80% accuracy when the missing rate was within 25%, followed by the expectation maximization algorithm, the multiple interpolation algorithm, and the regression model algorithm. In summary, the four-image nearest neighbor algorithm and the similar aggregation filling algorithm proposed in this study show higher accuracy, more stable performance, and wider practicability than the other algorithms for filling missing data in cultivated land quality evaluation.
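The simulate-and-verify procedure outlined in the description can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: it uses a plain K-nearest neighbor fill on coordinates (not the four-image nearest neighbor algorithm, whose details the record does not give), a synthetic grid in place of the Conghua District database, and a hypothetical accuracy definition (the share of filled values within a relative tolerance of the hidden true value). The names `knn_fill`, `simulate`, `tolerance`, and `grid` are all assumptions made for the example.

```python
import math
import random


def knn_fill(points, k=4):
    """Fill missing values with the mean of the k nearest observed values.

    points: list of (x, y, value) tuples; value is None where missing.
    """
    observed = [(x, y, v) for x, y, v in points if v is not None]
    if not observed:
        raise ValueError("at least one observed value is required")
    filled = []
    for x, y, v in points:
        if v is None:
            # Rank observed points by Euclidean distance to the gap.
            nearest = sorted(
                observed, key=lambda p: math.hypot(p[0] - x, p[1] - y)
            )[:k]
            v = sum(p[2] for p in nearest) / len(nearest)
        filled.append((x, y, v))
    return filled


def simulate(points, missing_rate, k=4, tolerance=0.1, seed=0):
    """Hide a share of known values, refill them, and score the result.

    Accuracy here is an assumed definition: the fraction of refilled
    values within `tolerance` (relative) of the hidden true value.
    """
    rng = random.Random(seed)
    n_hide = max(1, round(len(points) * missing_rate))
    hidden = set(rng.sample(range(len(points)), n_hide))
    masked = [
        (x, y, None if i in hidden else v)
        for i, (x, y, v) in enumerate(points)
    ]
    filled = knn_fill(masked, k)
    hits = sum(
        abs(filled[i][2] - points[i][2]) <= tolerance * abs(points[i][2])
        for i in hidden
    )
    return hits / len(hidden)


# Synthetic, spatially smooth surface standing in for a real factor.
grid = [(x, y, 100.0 + 2.0 * x + 3.0 * y) for x in range(10) for y in range(10)]

if __name__ == "__main__":
    for rate in (0.1, 0.2, 0.4):
        print(f"missing rate {rate:.0%}: accuracy {simulate(grid, rate):.2f}")
```

Running the loop at increasing missing rates gives a simple way to see how accuracy changes with the missing rate for a given fill method; swapping `knn_fill` for another fill function would allow the same comparison across methods.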
format article
author CHEN Yu
ZHOU Wu
HU Yueming
XIE Jianwen
title Research on filling methods of missing data in cultivated land quality evaluation
publisher Agro-Environmental Protection Institute, Ministry of Agriculture
publishDate 2021
url https://doaj.org/article/349a7eaa1425435c846571df3a02b94a
_version_ 1718373923457859584