Undersampling bankruptcy prediction: Taiwan bankruptcy data.
Machine learning models have increasingly been used in bankruptcy prediction. However, the observed historical data of bankrupt companies are often affected by data imbalance, which causes incorrect prediction, resulting in substantial economic losses. Many studies have proposed the insolvency imbal...
Main authors: | Haoming Wang, Xiangdong Liu |
---|---|
Format: | article |
Language: | EN |
Published: | Public Library of Science (PLoS), 2021 |
Subjects: | Medicine (R), Science (Q) |
Online access: | https://doaj.org/article/e5768835bfc844bbb9e898bdc224e09b |
id | oai:doaj.org-article:e5768835bfc844bbb9e898bdc224e09b |
---|---|
record_format | dspace |
spelling | oai:doaj.org-article:e5768835bfc844bbb9e898bdc224e09b 2021-12-02T20:09:45Z Undersampling bankruptcy prediction: Taiwan bankruptcy data. 1932-6203 10.1371/journal.pone.0254030 https://doaj.org/article/e5768835bfc844bbb9e898bdc224e09b 2021-01-01T00:00:00Z https://doi.org/10.1371/journal.pone.0254030 https://doaj.org/toc/1932-6203 Machine learning models have increasingly been used in bankruptcy prediction. However, the observed historical data of bankrupt companies are often affected by data imbalance, which causes incorrect prediction, resulting in substantial economic losses. Many studies have proposed the insolvency imbalance problem, but little attention has been paid to the effect of the undersampling technology. Therefore, a framework is used to spot-check algorithms quickly and combine which undersampling method and classification model performs best. The results show that Naive Bayes (NB) after Edited Nearest Neighbors (ENN) has the best performance, with an F2-measure of 0.423. In addition, by changing the undersampling rate of the cluster centroid-based method, we find that the performance of the Linear Discriminant Analysis (LDA) and Naive Bayes (NB) are affected by the undersampling rate. Neither of them is uniformly declining, and LDA has higher performance when the undersampling rate is 30%. This study accordingly provides another perspective and a guide for future design. Haoming Wang Xiangdong Liu Public Library of Science (PLoS) article Medicine R Science Q EN PLoS ONE, Vol 16, Iss 7, p e0254030 (2021) |
institution | DOAJ |
collection | DOAJ |
language | EN |
topic | Medicine R Science Q |
spellingShingle | Medicine R Science Q Haoming Wang Xiangdong Liu Undersampling bankruptcy prediction: Taiwan bankruptcy data. |
description | Machine learning models have increasingly been used in bankruptcy prediction. However, the observed historical data of bankrupt companies are often affected by data imbalance, which causes incorrect prediction, resulting in substantial economic losses. Many studies have proposed the insolvency imbalance problem, but little attention has been paid to the effect of the undersampling technology. Therefore, a framework is used to spot-check algorithms quickly and combine which undersampling method and classification model performs best. The results show that Naive Bayes (NB) after Edited Nearest Neighbors (ENN) has the best performance, with an F2-measure of 0.423. In addition, by changing the undersampling rate of the cluster centroid-based method, we find that the performance of the Linear Discriminant Analysis (LDA) and Naive Bayes (NB) are affected by the undersampling rate. Neither of them is uniformly declining, and LDA has higher performance when the undersampling rate is 30%. This study accordingly provides another perspective and a guide for future design. |
format | article |
author | Haoming Wang Xiangdong Liu |
author_facet | Haoming Wang Xiangdong Liu |
author_sort | Haoming Wang |
title | Undersampling bankruptcy prediction: Taiwan bankruptcy data. |
title_short | Undersampling bankruptcy prediction: Taiwan bankruptcy data. |
title_full | Undersampling bankruptcy prediction: Taiwan bankruptcy data. |
title_fullStr | Undersampling bankruptcy prediction: Taiwan bankruptcy data. |
title_full_unstemmed | Undersampling bankruptcy prediction: Taiwan bankruptcy data. |
title_sort | undersampling bankruptcy prediction: taiwan bankruptcy data. |
publisher | Public Library of Science (PLoS) |
publishDate | 2021 |
url | https://doaj.org/article/e5768835bfc844bbb9e898bdc224e09b |
work_keys_str_mv | AT haomingwang undersamplingbankruptcypredictiontaiwanbankruptcydata AT xiangdongliu undersamplingbankruptcypredictiontaiwanbankruptcydata |
_version_ | 1718375095818256384 |
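
Note on the reported metric: the abstract in this record gives results as an F2-measure. As a reading aid, the following is a minimal sketch of how an F-beta score is commonly computed from confusion-matrix counts, with beta = 2 weighting recall more heavily than precision. The function name `fbeta` and the example counts are hypothetical illustrations; they are not taken from the article, and the authors' own evaluation code is not part of this record.

```python
def fbeta(tp: int, fp: int, fn: int, beta: float = 2.0) -> float:
    """Return the F-beta score from true-positive, false-positive and
    false-negative counts. beta > 1 favours recall over precision."""
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    # Guard against the degenerate case with no positives at all.
    return (1 + b2) * tp / denom if denom else 0.0


if __name__ == "__main__":
    # Made-up counts for illustration only (not results from the article).
    tp, fp, fn = 30, 60, 20
    print(f"F1 = {fbeta(tp, fp, fn, beta=1.0):.3f}")
    print(f"F2 = {fbeta(tp, fp, fn, beta=2.0):.3f}")
```

With these illustrative counts, recall (0.6) is higher than precision (about 0.33), so the F2 value comes out higher than the F1 value. This is why F2 is often preferred when missing a bankrupt company is costlier than a false alarm.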