Explaining predictive models using Shapley values and non-parametric vine copulas

The goal of this paper is to explain predictions from complex machine learning models. One method that has become very popular during the last few years is Shapley values. The original development of Shapley values for prediction explanation relied on the assumption that the features are independent. If the features are in fact dependent, this may lead to incorrect explanations. Hence, there have recently been attempts at appropriately modelling/estimating the dependence between the features. Although the previously proposed methods clearly outperform the traditional approach that assumes independence, they have their weaknesses. In this paper we propose two new approaches for modelling the dependence between the features. Both approaches are based on vine copulas, flexible tools for modelling multivariate non-Gaussian distributions that can characterise a wide range of complex dependencies. The performance of the proposed methods is evaluated on simulated data sets and a real data set. The experiments demonstrate that the vine copula approaches give more accurate approximations to the true Shapley values than their competitors.
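
The technical point behind the abstract is that Shapley values for a prediction are built from conditional expectations of the model output, v(S) = E[f(x) | x_S = x*_S], and that estimating v(S) by drawing the remaining features from their marginals (the independence assumption) can distort the explanation when features are correlated. The sketch below is not the paper's implementation: the toy linear model, the names x_star, v_independent and v_conditional are illustrative assumptions, and the exact Gaussian conditional is used only as a stand-in for the non-parametric vine copula estimate of the conditional distribution that the paper actually proposes.

# Minimal sketch (assumptions noted above): Monte Carlo Shapley values for one
# prediction, comparing an independence-based sampler with a dependence-aware one.
import itertools
import math
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: three correlated Gaussian features and a simple linear "black box".
d = 3
cov = np.array([[1.0, 0.7, 0.3],
                [0.7, 1.0, 0.5],
                [0.3, 0.5, 1.0]])
X = rng.multivariate_normal(np.zeros(d), cov, size=5000)   # background data
beta = np.array([1.0, -2.0, 0.5])
f = lambda x: x @ beta                                     # model to explain
x_star = np.array([1.0, 1.0, -1.0])                        # instance being explained

def v_independent(S, n_mc=2000):
    # v(S) with features outside S drawn from their marginals
    # (the independence assumption criticised in the paper).
    Z = X[rng.integers(0, len(X), n_mc)].copy()
    Z[:, list(S)] = x_star[list(S)]
    return f(Z).mean()

def v_conditional(S, n_mc=2000):
    # v(S) with features outside S drawn from the exact Gaussian conditional
    # given x_S = x*_S. The paper estimates this conditional non-parametrically
    # with vine copulas; the closed-form Gaussian version here is illustrative.
    S = list(S)
    Sbar = [j for j in range(d) if j not in S]
    if not S:
        return f(X[rng.integers(0, len(X), n_mc)]).mean()
    if not Sbar:
        return float(f(x_star[None, :]).mean())
    C_SS = cov[np.ix_(S, S)]
    C_bS = cov[np.ix_(Sbar, S)]
    C_bb = cov[np.ix_(Sbar, Sbar)]
    mu = C_bS @ np.linalg.solve(C_SS, x_star[S])
    Sigma = C_bb - C_bS @ np.linalg.solve(C_SS, C_bS.T)
    Z = np.tile(x_star, (n_mc, 1))
    Z[:, Sbar] = rng.multivariate_normal(mu, Sigma, size=n_mc)
    return f(Z).mean()

def shapley_values(v):
    # Exact Shapley formula over all coalitions (feasible for small d).
    phi = np.zeros(d)
    for j in range(d):
        others = [k for k in range(d) if k != j]
        for r in range(d):
            for S in itertools.combinations(others, r):
                w = (math.factorial(len(S)) * math.factorial(d - len(S) - 1)
                     / math.factorial(d))
                phi[j] += w * (v(S + (j,)) - v(S))
    return phi

print("independence sampler:", shapley_values(v_independent))
print("conditional sampler: ", shapley_values(v_conditional))

With correlated features such as those above, the two samplers generally yield different Shapley decompositions of the same prediction; closing that gap by estimating the conditional distributions non-parametrically is the role the vine copula approaches play in the paper.
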

Bibliographic Details
Main Authors: Aas Kjersti, Nagler Thomas, Jullum Martin, Løland Anders
Format: Article
Language: English
Published: De Gruyter, 2021
Published in: Dependence Modeling, Vol 9, Iss 1, Pp 62-81 (2021)
DOI: https://doi.org/10.1515/demo-2021-0103
Subjects: prediction explanation; Shapley values; conditional distribution; vine copulas; non-parametric
MSC: 62G05; 62H05; 68T01; 91A12
LCC: Science (General) Q1-390; Mathematics QA1-939
Online Access: https://doaj.org/article/225427ea4f4d4ef4bd46be33a51c8970