An Efficient Rolling Bearing Fault Diagnosis Method Based on Spark and Improved Random Forest Algorithm
The random forest (RF) algorithm is a typical representative of ensemble learning, which is widely used in rolling bearing fault diagnosis. In order to solve the problems of slower diagnosis speed and repeated voting of traditional RF algorithm in rolling bearing fault diagnosis under the big data e...
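Main authors: | Lanjun Wan, Kun Gong, Gen Zhang, Xinpan Yuan, Changyun Li, Xiaojun Deng |
---|---|
Format: | article |
Language: | EN |
Published: | IEEE, 2021 |
Subjects: | Fault diagnosis; random forest; rolling bearing; spark platform; sub-forest optimization; Electrical engineering. Electronics. Nuclear engineering; TK1-9971 |
Online access: | https://doaj.org/article/e1eb03d32f8d4c5cadc4ece4b943b137 |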
id |
oai:doaj.org-article:e1eb03d32f8d4c5cadc4ece4b943b137 |
---|---|
record_format |
dspace |
spelling |
oai:doaj.org-article:e1eb03d32f8d4c5cadc4ece4b943b137 2021-11-20T00:00:35Z An Efficient Rolling Bearing Fault Diagnosis Method Based on Spark and Improved Random Forest Algorithm 2169-3536 10.1109/ACCESS.2021.3063929 https://doaj.org/article/e1eb03d32f8d4c5cadc4ece4b943b137 2021-01-01T00:00:00Z https://ieeexplore.ieee.org/document/9369361/ https://doaj.org/toc/2169-3536 The random forest (RF) algorithm is a typical representative of ensemble learning, which is widely used in rolling bearing fault diagnosis. In order to solve the problems of slower diagnosis speed and repeated voting of traditional RF algorithm in rolling bearing fault diagnosis under the big data environment, an efficient rolling bearing fault diagnosis method based on Spark and improved random forest (IRF) algorithm is proposed. By eliminating the decision trees with low classification accuracy and those prone to repeated voting in the original RF, an improved RF with faster diagnosis speed and higher classification accuracy is constructed. For the massive rolling bearing vibration data, in order to improve the training speed and diagnosis speed of the rolling bearing fault diagnosis model, the IRF algorithm is parallelized on the Spark platform. First, an original RF model is obtained by training multiple decision trees in parallel. Second, the decision trees with low classification accuracy in the original RF model are filtered. Third, all path information of the reserved decision trees is obtained in parallel. Fourth, a decision tree similarity matrix is constructed in parallel to eliminate the decision trees which are prone to repeated voting. Finally, an IRF model which can diagnose rolling bearing faults quickly and effectively is obtained. A series of experiments are carried out to evaluate the effectiveness of the proposed rolling bearing fault diagnosis method based on Spark and IRF algorithm. The results show that the proposed method can not only achieve good fault diagnosis accuracy, but also have fast model training speed and fault diagnosis speed for large-scale rolling bearing datasets. Lanjun Wan Kun Gong Gen Zhang Xinpan Yuan Changyun Li Xiaojun Deng IEEE article Fault diagnosis random forest rolling bearing spark platform sub-forest optimization Electrical engineering. Electronics. Nuclear engineering TK1-9971 EN IEEE Access, Vol 9, Pp 37866-37882 (2021) |
institution |
DOAJ |
collection |
DOAJ |
language |
EN |
topic |
Fault diagnosis random forest rolling bearing spark platform sub-forest optimization Electrical engineering. Electronics. Nuclear engineering TK1-9971 |
spellingShingle |
Fault diagnosis random forest rolling bearing spark platform sub-forest optimization Electrical engineering. Electronics. Nuclear engineering TK1-9971 Lanjun Wan Kun Gong Gen Zhang Xinpan Yuan Changyun Li Xiaojun Deng An Efficient Rolling Bearing Fault Diagnosis Method Based on Spark and Improved Random Forest Algorithm |
description |
The random forest (RF) algorithm is a typical representative of ensemble learning, which is widely used in rolling bearing fault diagnosis. In order to solve the problems of slower diagnosis speed and repeated voting of traditional RF algorithm in rolling bearing fault diagnosis under the big data environment, an efficient rolling bearing fault diagnosis method based on Spark and improved random forest (IRF) algorithm is proposed. By eliminating the decision trees with low classification accuracy and those prone to repeated voting in the original RF, an improved RF with faster diagnosis speed and higher classification accuracy is constructed. For the massive rolling bearing vibration data, in order to improve the training speed and diagnosis speed of the rolling bearing fault diagnosis model, the IRF algorithm is parallelized on the Spark platform. First, an original RF model is obtained by training multiple decision trees in parallel. Second, the decision trees with low classification accuracy in the original RF model are filtered. Third, all path information of the reserved decision trees is obtained in parallel. Fourth, a decision tree similarity matrix is constructed in parallel to eliminate the decision trees which are prone to repeated voting. Finally, an IRF model which can diagnose rolling bearing faults quickly and effectively is obtained. A series of experiments are carried out to evaluate the effectiveness of the proposed rolling bearing fault diagnosis method based on Spark and IRF algorithm. The results show that the proposed method can not only achieve good fault diagnosis accuracy, but also have fast model training speed and fault diagnosis speed for large-scale rolling bearing datasets. |
format |
article |
author |
Lanjun Wan Kun Gong Gen Zhang Xinpan Yuan Changyun Li Xiaojun Deng |
author_facet |
Lanjun Wan Kun Gong Gen Zhang Xinpan Yuan Changyun Li Xiaojun Deng |
author_sort |
Lanjun Wan |
title |
An Efficient Rolling Bearing Fault Diagnosis Method Based on Spark and Improved Random Forest Algorithm |
title_short |
An Efficient Rolling Bearing Fault Diagnosis Method Based on Spark and Improved Random Forest Algorithm |
title_full |
An Efficient Rolling Bearing Fault Diagnosis Method Based on Spark and Improved Random Forest Algorithm |
title_fullStr |
An Efficient Rolling Bearing Fault Diagnosis Method Based on Spark and Improved Random Forest Algorithm |
title_full_unstemmed |
An Efficient Rolling Bearing Fault Diagnosis Method Based on Spark and Improved Random Forest Algorithm |
title_sort |
efficient rolling bearing fault diagnosis method based on spark and improved random forest algorithm |
publisher |
IEEE |
publishDate |
2021 |
url |
https://doaj.org/article/e1eb03d32f8d4c5cadc4ece4b943b137 |
work_keys_str_mv |
AT lanjunwan anefficientrollingbearingfaultdiagnosismethodbasedonsparkandimprovedrandomforestalgorithm AT kungong anefficientrollingbearingfaultdiagnosismethodbasedonsparkandimprovedrandomforestalgorithm AT genzhang anefficientrollingbearingfaultdiagnosismethodbasedonsparkandimprovedrandomforestalgorithm AT xinpanyuan anefficientrollingbearingfaultdiagnosismethodbasedonsparkandimprovedrandomforestalgorithm AT changyunli anefficientrollingbearingfaultdiagnosismethodbasedonsparkandimprovedrandomforestalgorithm AT xiaojundeng anefficientrollingbearingfaultdiagnosismethodbasedonsparkandimprovedrandomforestalgorithm AT lanjunwan efficientrollingbearingfaultdiagnosismethodbasedonsparkandimprovedrandomforestalgorithm AT kungong efficientrollingbearingfaultdiagnosismethodbasedonsparkandimprovedrandomforestalgorithm AT genzhang efficientrollingbearingfaultdiagnosismethodbasedonsparkandimprovedrandomforestalgorithm AT xinpanyuan efficientrollingbearingfaultdiagnosismethodbasedonsparkandimprovedrandomforestalgorithm AT changyunli efficientrollingbearingfaultdiagnosismethodbasedonsparkandimprovedrandomforestalgorithm AT xiaojundeng efficientrollingbearingfaultdiagnosismethodbasedonsparkandimprovedrandomforestalgorithm |
_version_ |
1718419869284696064 |
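
The pruning steps summarized in the description field (filter out low-accuracy trees, build a tree similarity matrix, remove trees prone to repeated voting) can be illustrated with a minimal single-machine sketch in plain Python. This is not the authors' implementation. It makes several assumptions: each tree is represented only by its predictions on a validation set, similarity is measured as prediction agreement rather than the path-information comparison the record describes, and the names `prune_forest`, `min_acc`, and `max_sim` and their threshold values are hypothetical. No Spark API is used.

```python
from collections import Counter
from itertools import combinations


def accuracy(preds, labels):
    """Fraction of validation samples a single tree classifies correctly."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def agreement(preds_a, preds_b):
    """Stand-in similarity: share of samples on which two trees agree."""
    return sum(a == b for a, b in zip(preds_a, preds_b)) / len(preds_a)


def prune_forest(tree_preds, labels, min_acc=0.7, max_sim=0.9):
    """Return the ids of trees kept after accuracy and similarity pruning.

    tree_preds maps a tree id to its predictions on a validation set.
    min_acc and max_sim are illustrative thresholds, not values from the paper.
    """
    # Step 1: drop trees whose validation accuracy is below min_acc.
    acc = {t: accuracy(p, labels) for t, p in tree_preds.items()}
    kept = [t for t in tree_preds if acc[t] >= min_acc]

    # Step 2: build a pairwise similarity matrix over the remaining trees.
    sim = {(a, b): agreement(tree_preds[a], tree_preds[b])
           for a, b in combinations(kept, 2)}

    # Step 3: for each overly similar pair, drop the less accurate tree
    # (the second tree of the pair when accuracies are equal).
    removed = set()
    for (a, b), s in sorted(sim.items(), key=lambda kv: -kv[1]):
        if s > max_sim and a not in removed and b not in removed:
            removed.add(a if acc[a] < acc[b] else b)
    return [t for t in kept if t not in removed]


def vote(tree_preds, kept, sample_idx):
    """Majority vote of the kept trees for one validation sample."""
    counts = Counter(tree_preds[t][sample_idx] for t in kept)
    return counts.most_common(1)[0][0]


# Toy validation data: four hypothetical trees and six labelled samples.
labels = [0, 1, 1, 0, 1, 0]
tree_preds = {
    "t1": [0, 1, 1, 0, 1, 0],  # fully accurate
    "t2": [0, 1, 1, 0, 1, 0],  # identical to t1, so its vote is redundant
    "t3": [0, 1, 0, 0, 1, 0],  # accurate, differs from t1 on one sample
    "t4": [1, 0, 0, 1, 1, 0],  # low accuracy
}
kept = prune_forest(tree_preds, labels)
```

In this sketch the accuracy of each tree and the similarity of each tree pair are computed independently of one another, which is the property that makes such steps candidates for the parallel execution the record attributes to the Spark platform.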