Credit Card Fraud Detection with Autoencoder and Probabilistic Random Forest

This paper proposes a method, called autoencoder with probabilistic random forest (AE-PRF), for detecting credit card fraud. AE-PRF first uses an autoencoder to extract low-dimensional features from high-dimensional credit card transaction data. It then...

Bibliographic Details
Main Authors:	Tzu-Hsuan Lin, Jehn-Ruey Jiang
Format:	article
Language:	EN
Published:	MDPI AG 2021
Subjects:	autoencoder; credit card; deep learning; fraud detection; data imbalance; random forest; Mathematics; QA1-939
Online Access:	https://doaj.org/article/b9a662d924844c6490938fa55e24cbf8

id	oai:doaj.org-article:b9a662d924844c6490938fa55e24cbf8
record_format	dspace
spelling	oai:doaj.org-article:b9a662d924844c6490938fa55e24cbf8; 2021-11-11T18:15:02Z; Credit Card Fraud Detection with Autoencoder and Probabilistic Random Forest; 10.3390/math9212683; 2227-7390; https://doaj.org/article/b9a662d924844c6490938fa55e24cbf8; 2021-10-01T00:00:00Z; https://www.mdpi.com/2227-7390/9/21/2683; https://doaj.org/toc/2227-7390; Tzu-Hsuan Lin; Jehn-Ruey Jiang; MDPI AG; article; autoencoder; credit card; deep learning; fraud detection; data imbalance; random forest; Mathematics; QA1-939; EN; Mathematics, Vol 9, Iss 2683, p 2683 (2021)
institution	DOAJ
collection	DOAJ
language	EN
topic	autoencoder; credit card; deep learning; fraud detection; data imbalance; random forest; Mathematics; QA1-939
description	This paper proposes a method, called autoencoder with probabilistic random forest (AE-PRF), for detecting credit card fraud. AE-PRF first uses an autoencoder to extract low-dimensional features from high-dimensional credit card transaction data. It then relies on a random forest, an ensemble learning mechanism based on bootstrap aggregating (bagging), with probabilistic classification to classify each transaction as fraudulent or normal. AE-PRF is evaluated and compared with related methods on the credit card fraud detection (CCFD) dataset. The CCFD dataset contains a large number of credit card transactions by European cardholders; it is highly imbalanced because normal transactions far outnumber fraudulent ones. Data resampling schemes such as the synthetic minority oversampling technique (SMOTE), adaptive synthetic sampling (ADASYN), and Tomek links (T-Link) are applied to the CCFD dataset to balance the numbers of normal and fraudulent transactions, with the aim of improving AE-PRF performance. Experimental results show that the performance of AE-PRF does not vary much whether or not resampling is applied. This indicates that AE-PRF is naturally suited to imbalanced datasets. Compared with related methods, AE-PRF performs well in terms of accuracy, true positive rate, true negative rate, Matthews correlation coefficient, and area under the receiver operating characteristic curve.
format	article
author	Tzu-Hsuan Lin; Jehn-Ruey Jiang
title	Credit Card Fraud Detection with Autoencoder and Probabilistic Random Forest
publisher	MDPI AG
publishDate	2021
url	https://doaj.org/article/b9a662d924844c6490938fa55e24cbf8
work_keys_str_mv	AT tzuhsuanlin creditcardfrauddetectionwithautoencoderandprobabilisticrandomforest AT jehnrueyjiang creditcardfrauddetectionwithautoencoderandprobabilisticrandomforest
_version_	1718431906636234752
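
The description above names probabilistic classification and several evaluation metrics (accuracy, true positive rate, true negative rate, Matthews correlation coefficient). The following is a minimal sketch of how such metrics can be computed from per-transaction fraud probabilities. It is not the authors' implementation: the function names, the toy labels and probabilities, and the use of a decision threshold on the fraud probability are all assumptions made for illustration.

```python
import math


def confusion_counts(y_true, fraud_prob, threshold=0.5):
    """Count (TP, TN, FP, FN), flagging a transaction as fraud (1)
    when its predicted fraud probability is >= threshold.
    The threshold rule is an assumption for this sketch."""
    tp = tn = fp = fn = 0
    for label, p in zip(y_true, fraud_prob):
        predicted = 1 if p >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        elif predicted == 1 and label == 0:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn


def metrics(tp, tn, fp, fn):
    """Accuracy, TPR, TNR and MCC from confusion counts.
    Ratios with an empty denominator are reported as 0.0."""
    total = tp + tn + fp + fn
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "tpr": tp / (tp + fn) if (tp + fn) else 0.0,
        "tnr": tn / (tn + fp) if (tn + fp) else 0.0,
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


# Hypothetical toy data: 8 normal (0) and 2 fraudulent (1) transactions,
# with made-up fraud probabilities such as a forest might output.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
fraud_prob = [0.05, 0.10, 0.20, 0.30, 0.15, 0.02, 0.60, 0.08, 0.70, 0.40]

for threshold in (0.5, 0.35):
    counts = confusion_counts(y_true, fraud_prob, threshold)
    print(threshold, counts, metrics(*counts))
```

On imbalanced data, accuracy alone can look high even when most fraud is missed, which is one reason to read it alongside the true positive rate and the Matthews correlation coefficient.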
