Feature Selection Based on Random Forest for Partial Discharges Characteristic Set

Since the dimension of combined feature set for partial discharge (PD) pattern recognition is higher, the corresponding sample size increases, as does the required amount of storage space and calculation, and there are features with less category-related characteristics in the feature parameters, wh...

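Bibliographic Details
Main authors: Rui Yao, Jun Li, Meng Hui, Lin Bai, Qisheng Wu
Format: article
Language: EN
Published: IEEE 2020
Subjects: Partial discharge; feature selection; variance analysis; random forest; Electrical engineering. Electronics. Nuclear engineering; TK1-9971
Online access: https://doaj.org/article/83f09defe95346b182b4b8f5fc2d6564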
id oai:doaj.org-article:83f09defe95346b182b4b8f5fc2d6564
record_format dspace
spelling oai:doaj.org-article:83f09defe95346b182b4b8f5fc2d6564
2021-11-19T00:04:57Z
Feature Selection Based on Random Forest for Partial Discharges Characteristic Set
2169-3536
10.1109/ACCESS.2020.3019377
https://doaj.org/article/83f09defe95346b182b4b8f5fc2d6564
2020-01-01T00:00:00Z
https://ieeexplore.ieee.org/document/9177074/
https://doaj.org/toc/2169-3536
Rui Yao
Jun Li
Meng Hui
Lin Bai
Qisheng Wu
IEEE
article
Partial discharge
feature selection
variance analysis
random forest
Electrical engineering. Electronics. Nuclear engineering
TK1-9971
EN
IEEE Access, Vol 8, Pp 159151-159161 (2020)
institution DOAJ
collection DOAJ
language EN
topic Partial discharge
feature selection
variance analysis
random forest
Electrical engineering. Electronics. Nuclear engineering
TK1-9971
description The combined feature set used for partial discharge (PD) pattern recognition has a high dimension, which increases the required sample size, storage space, and computation. The set also contains features that carry little category-related information, and some features may be redundant with one another. To address the high feature dimension and the complicated classification model required to identify PD insulation defect types, this paper proposes a random forest sequential forward selection method based on variance analysis (RF-VA) for selecting the optimal feature subset. The method improves on the original algorithm in two respects. First, a variance-analysis-based measure of feature differences between categories yields a modified permutation scheme that guides how the values of the out-of-bag samples are reordered. Second, a sequential forward search performs the feature selection and produces evaluation results at each iteration, which addresses the randomness in determining the feature subset size and the instability of the results in the original algorithm. The results show that RF-VA obtains a better feature subset, making it feasible to reduce the dimension of the PD characteristic set and effectively improve the identification rate of PD defect types.
format article
author Rui Yao
Jun Li
Meng Hui
Lin Bai
Qisheng Wu
title Feature Selection Based on Random Forest for Partial Discharges Characteristic Set
publisher IEEE
publishDate 2020
url https://doaj.org/article/83f09defe95346b182b4b8f5fc2d6564
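
The description field names two building blocks: a variance-analysis measure of how much a feature differs between categories, and a sequential forward search that evaluates the subset at every iteration. The sketch below illustrates only those two ideas on a hand-made toy dataset and is not the authors' RF-VA implementation. All names (anova_f, loo_accuracy, forward_select, the data in X and y) are hypothetical. As a loudly labeled substitution, a leave-one-out nearest-centroid classifier stands in for the random forest and its out-of-bag evaluation, and the modified out-of-bag permutation scheme is not reproduced because the record does not specify it.

```python
from statistics import mean


def anova_f(values, labels):
    """One-way ANOVA F statistic of a single feature across classes."""
    groups = {}
    for v, c in zip(values, labels):
        groups.setdefault(c, []).append(v)
    grand, k, n = mean(values), len(groups), len(values)
    between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values()) / (k - 1)
    within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g) / (n - k)
    return between / within if within > 0 else float("inf")


def loo_accuracy(X, y, features):
    """Leave-one-out accuracy of a nearest-centroid classifier.

    Assumption: this is a stand-in for random forest out-of-bag accuracy.
    """
    correct = 0
    for i in range(len(X)):
        centroids = {}
        for c in set(y):
            rows = [X[j] for j in range(len(X)) if j != i and y[j] == c]
            centroids[c] = [mean(r[f] for r in rows) for f in features]
        pred = min(
            centroids,
            key=lambda c: sum((X[i][f] - m) ** 2 for f, m in zip(features, centroids[c])),
        )
        correct += pred == y[i]
    return correct / len(X)


def forward_select(X, y, ranked, score):
    """Sequential forward selection: add the best remaining feature each round,
    stop when no candidate improves the score. Ties go to the higher-ranked feature."""
    selected, best, remaining = [], 0.0, list(ranked)
    while remaining:
        scored = [(score(X, y, selected + [f]), f) for f in remaining]
        top = max(s for s, _ in scored)
        if top <= best:
            break
        pick = next(f for s, f in scored if s == top)
        selected.append(pick)
        remaining.remove(pick)
        best = top
    return selected, best


# Toy data (hypothetical): feature 0 separates the classes, feature 1 has equal
# class means, feature 2 is weakly informative.
X = [
    [1.0, 5.0, 0.0], [1.2, 2.0, 1.0], [0.8, 4.0, 0.0], [1.1, 3.0, 2.0],
    [3.0, 3.0, 1.0], [3.1, 5.0, 2.0], [2.9, 2.0, 3.0], [3.2, 4.0, 2.0],
]
y = ["A", "A", "A", "A", "B", "B", "B", "B"]

f_scores = [anova_f([row[f] for row in X], y) for f in range(3)]
ranked = sorted(range(3), key=lambda f: f_scores[f], reverse=True)
selected, best = forward_select(X, y, ranked, loo_accuracy)
print("ranking by F:", ranked, "selected:", selected, "accuracy:", best)
```

On this toy data the search stops after one feature, because feature 0 alone already classifies every held-out sample correctly. With a real PD feature set, the scoring function would be replaced by the classifier evaluation actually used.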