Air Pollutant Concentration Prediction Based on a CEEMDAN-FE-BiLSTM Model

Bibliographic Details
Main Authors: Xuchu Jiang, Peiyao Wei, Yiwen Luo, Ying Li
Format: article
Language: EN
Published: MDPI AG 2021
Subjects:
FE
Online Access: https://doaj.org/article/236884f9cfc947ae826f6af5349f0bbc
Description
Summary: The concentration series of PM<sub>2.5</sub> (particulate matter ≤ 2.5 μm) is nonlinear, nonstationary, and noisy, which makes it difficult to predict accurately. This paper presents a new PM<sub>2.5</sub> concentration prediction method based on a hybrid model that combines complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) and bi-directional long short-term memory (BiLSTM). The method was also applied to PM<sub>10</sub>, a particulate pollutant of the same kind, and to O<sub>3</sub>, a gaseous pollutant of a different kind, demonstrating that it has strong generalization ability. First, CEEMDAN was used to decompose the PM<sub>2.5</sub> concentration series into components at different frequencies. Then, the fuzzy entropy (FE) value of each decomposed component was calculated, and components with similar FE values were combined by K-means clustering to generate the input sequences. Finally, the combined sequences were fed into a BiLSTM model with multiple hidden layers for training. We predicted the hourly PM<sub>2.5</sub> concentrations at Seoul Station 116, with values of the root mean square error (<i>RMSE</i>), the mean absolute error (<i>MAE</i>), and the symmetric mean absolute percentage error (<i>SMAPE</i>) as low as 2.74, 1.90, and 13.59%, respectively, and an <i>R</i><sup>2</sup> value as high as 96.34%. The “CEEMDAN-FE” decomposition-merging technique proposed in this paper can effectively reduce the instability and high volatility of the original data, mitigate data noise, and significantly improve the model’s performance in predicting real-time PM<sub>2.5</sub> concentrations.
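The decomposition-merging step described in the summary can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the components are synthetic stand-ins for CEEMDAN output, the fuzzy entropy uses one common formulation (baseline-removed templates, Chebyshev distance, membership exp(-d^n / r)), and the parameters `m`, `r_factor`, `n`, and `k`, along with the helper names `fuzzy_entropy`, `kmeans_1d`, and `merge_by_label`, are assumptions chosen for the example.

```python
import math
import random


def fuzzy_entropy(series, m=2, r_factor=0.2, n=2):
    """Fuzzy entropy with membership exp(-d**n / r).

    m, r_factor and n are common illustrative defaults (assumed here),
    not values taken from the paper.
    """
    size = len(series)
    mean = sum(series) / size
    std = math.sqrt(sum((x - mean) ** 2 for x in series) / size)
    if std == 0:
        return 0.0  # a constant series carries no irregularity
    r = r_factor * std
    count = size - m  # same number of templates for dimensions m and m + 1

    def phi(dim):
        templates = []
        for i in range(count):
            window = series[i:i + dim]
            base = sum(window) / dim  # remove the local baseline
            templates.append([x - base for x in window])
        total = 0.0
        for i in range(count):
            for j in range(count):
                if i != j:
                    # Chebyshev distance between the two templates
                    d = max(abs(a - b) for a, b in zip(templates[i], templates[j]))
                    total += math.exp(-(d ** n) / r)
        return total / (count * (count - 1))

    return math.log(phi(m)) - math.log(phi(m + 1))


def kmeans_1d(values, k, iters=100):
    """Plain 1-D K-means with a deterministic, spread-out initialisation."""
    ordered = sorted(values)
    centres = [ordered[(2 * c + 1) * len(ordered) // (2 * k)] for c in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assign each value to its nearest centre
        labels = [min(range(k), key=lambda c: abs(v - centres[c])) for v in values]
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            updated.append(sum(members) / len(members) if members else centres[c])
        if updated == centres:
            break
        centres = updated
    return labels


def merge_by_label(components, labels):
    """Sum, element by element, the components that share a cluster label."""
    merged = {}
    for comp, lab in zip(components, labels):
        if lab in merged:
            merged[lab] = [a + b for a, b in zip(merged[lab], comp)]
        else:
            merged[lab] = list(comp)
    return [merged[lab] for lab in sorted(merged)]


# Synthetic stand-ins for decomposed components (assumption: real input
# would come from a CEEMDAN decomposition of the concentration series).
random.seed(0)
length = 200
components = [
    [random.gauss(0.0, 1.0) for _ in range(length)],         # noise-like
    [math.sin(2 * math.pi * t / 8) for t in range(length)],   # fast oscillation
    [math.sin(2 * math.pi * t / 50) for t in range(length)],  # slow oscillation
    [0.01 * t for t in range(length)],                        # trend
]
fe_values = [fuzzy_entropy(c) for c in components]
labels = kmeans_1d(fe_values, k=2)
merged = merge_by_label(components, labels)
print(fe_values, labels, len(merged))
```

Because the merged sequences are plain sums of the original components, adding them back together reproduces the original series. In a pipeline like the one summarised above, each merged sequence would then serve as an input to the sequence model.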