Automatic Search of Cataclysmic Variables Based on LightGBM in LAMOST-DR7
The search for special and rare celestial objects has always played an important role in astronomy. Cataclysmic Variables (CVs) are special and rare binary systems with accretion disks. Most CVs are in the quiescent period, and their spectra have the emission lines of Balmer series, HeI, and HeII. A...
Main Authors: | Zhiyuan Hu, Jianyu Chen, Bin Jiang, Wenyu Wang |
---|---|
Format: | article |
Language: | EN |
Published: | MDPI AG, 2021 |
Subjects: | sky survey; cataclysmic variables; LightGBM; data mining; Elementary particle physics; QC793-793.5 |
Online Access: | https://doaj.org/article/088ac8d3c2a343fe8b432d296f7ecc4b |
id |
oai:doaj.org-article:088ac8d3c2a343fe8b432d296f7ecc4b |
---|---|
record_format |
dspace |
spelling |
oai:doaj.org-article:088ac8d3c2a343fe8b432d296f7ecc4b; 2021-11-25T19:09:47Z; Automatic Search of Cataclysmic Variables Based on LightGBM in LAMOST-DR7; 10.3390/universe7110438; 2218-1997; https://doaj.org/article/088ac8d3c2a343fe8b432d296f7ecc4b; 2021-11-01T00:00:00Z; https://www.mdpi.com/2218-1997/7/11/438; https://doaj.org/toc/2218-1997; The search for special and rare celestial objects has always played an important role in astronomy. Cataclysmic Variables (CVs) are rare binary systems with accretion disks. Most CVs are in the quiescent period, and their spectra show emission lines of the Balmer series, HeI, and HeII. A few CVs in the outburst period show absorption lines of the Balmer series. Because known CVs are scarce, expanding the set of CV spectra is valuable for studying the formation of accretion disks and the evolution of binary star system models. Research on astronomical spectra has now entered the era of Big Data. The Large Sky Area Multi-Object Fiber Spectroscopic Telescope (LAMOST) has produced tens of millions of spectra. The latest release, LAMOST-DR7, includes 10.6 million low-resolution spectra in 4926 sky regions, providing ideal data support for searching for CV candidates. To process and analyze this massive amount of spectral data, this study employed the Light Gradient Boosting Machine (LightGBM) algorithm, which is based on an ensemble tree model, to conduct the search in LAMOST-DR7 automatically. In total, 225 CV candidates were found, and four new CV candidates were verified against SIMBAD and published catalogs. This study also built Gradient Boosting Decision Tree (GBDT), Adaptive Boosting (AdaBoost), and eXtreme Gradient Boosting (XGBoost) models and used Accuracy, Precision, Recall, the F1-score, and the ROC curve to compare the four models comprehensively. Experimental results showed that LightGBM is more efficient than the other three models. The search for CVs based on LightGBM not only enriches the existing CV spectral library, but also provides a reference for the data mining of other rare celestial objects in massive spectral data.; Zhiyuan Hu; Jianyu Chen; Bin Jiang; Wenyu Wang; MDPI AG; article; sky survey; cataclysmic variables; LightGBM; data mining; Elementary particle physics; QC793-793.5; EN; Universe, Vol 7, Iss 11, p 438 (2021) |
institution |
DOAJ |
collection |
DOAJ |
language |
EN |
topic |
sky survey; cataclysmic variables; LightGBM; data mining; Elementary particle physics; QC793-793.5 |
spellingShingle |
sky survey; cataclysmic variables; LightGBM; data mining; Elementary particle physics; QC793-793.5; Zhiyuan Hu; Jianyu Chen; Bin Jiang; Wenyu Wang; Automatic Search of Cataclysmic Variables Based on LightGBM in LAMOST-DR7 |
description |
The search for special and rare celestial objects has always played an important role in astronomy. Cataclysmic Variables (CVs) are rare binary systems with accretion disks. Most CVs are in the quiescent period, and their spectra show emission lines of the Balmer series, HeI, and HeII. A few CVs in the outburst period show absorption lines of the Balmer series. Because known CVs are scarce, expanding the set of CV spectra is valuable for studying the formation of accretion disks and the evolution of binary star system models. Research on astronomical spectra has now entered the era of Big Data. The Large Sky Area Multi-Object Fiber Spectroscopic Telescope (LAMOST) has produced tens of millions of spectra. The latest release, LAMOST-DR7, includes 10.6 million low-resolution spectra in 4926 sky regions, providing ideal data support for searching for CV candidates. To process and analyze this massive amount of spectral data, this study employed the Light Gradient Boosting Machine (LightGBM) algorithm, which is based on an ensemble tree model, to conduct the search in LAMOST-DR7 automatically. In total, 225 CV candidates were found, and four new CV candidates were verified against SIMBAD and published catalogs. This study also built Gradient Boosting Decision Tree (GBDT), Adaptive Boosting (AdaBoost), and eXtreme Gradient Boosting (XGBoost) models and used Accuracy, Precision, Recall, the F1-score, and the ROC curve to compare the four models comprehensively. Experimental results showed that LightGBM is more efficient than the other three models. The search for CVs based on LightGBM not only enriches the existing CV spectral library, but also provides a reference for the data mining of other rare celestial objects in massive spectral data. |
format |
article |
author |
Zhiyuan Hu; Jianyu Chen; Bin Jiang; Wenyu Wang |
author_facet |
Zhiyuan Hu; Jianyu Chen; Bin Jiang; Wenyu Wang |
author_sort |
Zhiyuan Hu |
title |
Automatic Search of Cataclysmic Variables Based on LightGBM in LAMOST-DR7 |
title_short |
Automatic Search of Cataclysmic Variables Based on LightGBM in LAMOST-DR7 |
title_full |
Automatic Search of Cataclysmic Variables Based on LightGBM in LAMOST-DR7 |
title_fullStr |
Automatic Search of Cataclysmic Variables Based on LightGBM in LAMOST-DR7 |
title_full_unstemmed |
Automatic Search of Cataclysmic Variables Based on LightGBM in LAMOST-DR7 |
title_sort |
automatic search of cataclysmic variables based on lightgbm in lamost-dr7 |
publisher |
MDPI AG |
publishDate |
2021 |
url |
https://doaj.org/article/088ac8d3c2a343fe8b432d296f7ecc4b |
work_keys_str_mv |
AT zhiyuanhu automaticsearchofcataclysmicvariablesbasedonlightgbminlamostdr7 AT jianyuchen automaticsearchofcataclysmicvariablesbasedonlightgbminlamostdr7 AT binjiang automaticsearchofcataclysmicvariablesbasedonlightgbminlamostdr7 AT wenyuwang automaticsearchofcataclysmicvariablesbasedonlightgbminlamostdr7 |
_version_ |
1718410195910000640 |
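
The description field names the metrics used to compare the four models: Accuracy, Precision, Recall, the F1-score, and the ROC curve. The record does not give the authors' evaluation code, so the following is only a minimal sketch of how these quantities are commonly computed for a binary "CV / not CV" classifier. The function names, the toy labels and scores, and the 0.5 decision threshold are all assumptions made for illustration; none of them come from the article.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for binary labels, where 1 marks a CV."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, Precision, Recall, and F1 from hard 0/1 predictions."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs in which the positive has the higher
    score; tied pairs count as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 3 CVs and 5 non-CVs with made-up classifier scores.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

For a rare class such as CVs, Accuracy alone can look high even when most true CVs are missed, which is one reason Precision, Recall, and the ROC curve are usually reported alongside it.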