Detection of Influential Observations in Spatial Regression Model Based on Outliers and Bad Leverage Classification
Influential observations (IOs), which are outliers in the x direction, y direction or both, remain a problem in the classical regression model fitting. Spatial regression models have a peculiar kind of outliers because they are local in nature. Spatial regression models are also not free from the ef...
Main authors: | Ali Mohammed Baba, Habshah Midi, Mohd Bakri Adam, Nur Haizum Abd Rahman |
---|---|
Format: | article |
Language: | EN |
Published: |
MDPI AG
2021
|
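Subjects: | spatial regression model; influential observation; outlier; leverage; prediction residual; masking and swamping; Mathematics; QA1-939 |
Online access: | https://doaj.org/article/3bcd8390d89d4cc888ff60c819805569 |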
id |
oai:doaj.org-article:3bcd8390d89d4cc888ff60c819805569 |
---|---|
record_format |
dspace |
institution |
DOAJ |
collection |
DOAJ |
language |
EN |
topic |
spatial regression model; influential observation; outlier; leverage; prediction residual; masking and swamping; Mathematics; QA1-939 |
description |
Influential observations (IOs), which are outliers in the x direction, y direction or both, remain a problem in classical regression model fitting. Spatial regression models have a peculiar kind of outliers because they are local in nature. Spatial regression models are also not free from the effect of influential observations. Researchers have adapted some classical regression techniques to spatial models and obtained satisfactory results. However, masking and/or swamping remains a stumbling block for such methods. In this article, we obtain a measure of spatial Studentized prediction residuals that incorporates spatial information on the dependent variable and the residuals. We propose a robust spatial diagnostic plot to classify observations into regular observations, vertical outliers, good leverage points and bad leverage points using a classification based on spatial Studentized prediction residuals and spatial diagnostic potentials, which we refer to as $ISRs-P_{osi}$ and $ESRs-P_{osi}$. Observations that fall into the vertical outlier and bad leverage point categories are referred to as IOs. Representations of some classical regression diagnostic measures in general spatial models are presented.
The commonly used diagnostic measure in spatial diagnostics, Cook's distance, is compared to some robust methods, $H_i^2$ (using robust and non-robust measures), and our proposed $ISRs-P_{osi}$ and $ESRs-P_{osi}$ plots. Results of our simulation study and applications to real data showed that Cook's distance, the non-robust $H_{si1}^2$ and the robust $H_{si2}^2$ were not very successful in detecting IOs.
The $H_{si1}^2$ suffered from the masking effect, and the robust $H_{si2}^2$ suffered from swamping in general spatial models. Interestingly, the results showed that the proposed $ESRs-P_{osi}$ plot, followed by the $ISRs-P_{osi}$ plot, was very successful in classifying observations into the correct groups, hence correctly detecting the real IOs. |
format |
article |
author |
Ali Mohammed Baba; Habshah Midi; Mohd Bakri Adam; Nur Haizum Abd Rahman |
title |
Detection of Influential Observations in Spatial Regression Model Based on Outliers and Bad Leverage Classification |
publisher |
MDPI AG |
publishDate |
2021 |
url |
https://doaj.org/article/3bcd8390d89d4cc888ff60c819805569 |
_version_ |
1718410271419006976 |
spelling |
oai:doaj.org-article:3bcd8390d89d4cc888ff60c819805569; 2021-11-25T19:06:10Z; Detection of Influential Observations in Spatial Regression Model Based on Outliers and Bad Leverage Classification; 10.3390/sym13112030; 2073-8994; https://doaj.org/article/3bcd8390d89d4cc888ff60c819805569; 2021-10-01T00:00:00Z; https://www.mdpi.com/2073-8994/13/11/2030; https://doaj.org/toc/2073-8994; MDPI AG; article; EN; Symmetry, Vol 13, Iss 2030, p 2030 (2021) |
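The four-way classification described in the abstract (regular observations, vertical outliers, good leverage points, bad leverage points) might be sketched roughly as follows. This is only an illustration of the general diagnostic-plot idea, not the authors' implementation: the function names, the label strings, and the rule of comparing each value against a single cutoff are assumptions. The record does not state how the spatial Studentized prediction residuals, the spatial potentials, or their cutoffs are computed, so the sketch takes all of them as inputs.

```python
from typing import List, Sequence

# Labels for the four groups named in the abstract.
REGULAR = "regular observation"
VERTICAL = "vertical outlier"
GOOD_LEVERAGE = "good leverage point"
BAD_LEVERAGE = "bad leverage point"


def classify_observations(
    residuals: Sequence[float],
    potentials: Sequence[float],
    resid_cut: float,
    pot_cut: float,
) -> List[str]:
    """Label each observation from its residual and its potential.

    residuals  : Studentized prediction residuals, one per observation
                 (assumed to be computed elsewhere).
    potentials : diagnostic potentials, one per observation
                 (assumed to be computed elsewhere).
    resid_cut, pot_cut : cutoffs chosen by the caller; the values used
                 in the paper are not given in this record.
    """
    if len(residuals) != len(potentials):
        raise ValueError("residuals and potentials must have equal length")

    labels = []
    for r, p in zip(residuals, potentials):
        large_residual = abs(r) > resid_cut  # outlying in the y direction
        high_potential = p > pot_cut         # outlying in the x direction
        if large_residual and high_potential:
            labels.append(BAD_LEVERAGE)
        elif large_residual:
            labels.append(VERTICAL)
        elif high_potential:
            labels.append(GOOD_LEVERAGE)
        else:
            labels.append(REGULAR)
    return labels


def influential_indices(labels: Sequence[str]) -> List[int]:
    """Indices of IOs: vertical outliers and bad leverage points."""
    return [i for i, lab in enumerate(labels) if lab in (VERTICAL, BAD_LEVERAGE)]


# Hypothetical values, chosen only to exercise all four groups.
example_labels = classify_observations(
    residuals=[0.3, -3.1, 0.5, 4.0],
    potentials=[0.1, 0.1, 0.9, 0.9],
    resid_cut=2.5,
    pot_cut=0.5,
)
example_ios = influential_indices(example_labels)  # [1, 3]
```

Keeping the cutoffs as required arguments avoids implying any particular threshold; a caller would supply whichever rule the full article specifies.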