Comparing quantile regression methods for probabilistic forecasting of NO2 pollution levels

Abstract: High-concentration episodes of NO2 are increasingly dealt with by authorities through traffic restrictions, which are activated when air quality deteriorates beyond certain thresholds. Forecasting the probability that pollutant concentrations reach those thresholds thus becomes a necessity....

Bibliographic Details
Main authors:	Sebastien Pérez Vasseur, José L. Aznarte
Format:	article
Language:	EN
Published:	Nature Portfolio 2021
Subjects:	Medicine (R); Science (Q)
Online access:	https://doaj.org/article/9334a8d6092643f6a2ce0bf4cee80d06

id	oai:doaj.org-article:9334a8d6092643f6a2ce0bf4cee80d06
record_format	dspace
spelling	oai:doaj.org-article:9334a8d6092643f6a2ce0bf4cee80d06; 2021-12-02T15:57:12Z; Comparing quantile regression methods for probabilistic forecasting of NO2 pollution levels; 10.1038/s41598-021-90063-3; 2045-2322; https://doaj.org/article/9334a8d6092643f6a2ce0bf4cee80d06; 2021-06-01T00:00:00Z; https://doi.org/10.1038/s41598-021-90063-3; https://doaj.org/toc/2045-2322; Sebastien Pérez Vasseur; José L. Aznarte; Nature Portfolio; article; Medicine; R; Science; Q; EN; Scientific Reports, Vol 11, Iss 1, Pp 1-8 (2021)
institution	DOAJ
collection	DOAJ
language	EN
topic	Medicine R Science Q
spellingShingle	Medicine R Science Q Sebastien Pérez Vasseur José L. Aznarte Comparing quantile regression methods for probabilistic forecasting of NO2 pollution levels
description	Abstract: High-concentration episodes of NO2 are increasingly dealt with by authorities through traffic restrictions, which are activated when air quality deteriorates beyond certain thresholds. Forecasting the probability that pollutant concentrations reach those thresholds thus becomes a necessity. Probabilistic forecasting, as opposed to point forecasting, is a family of techniques that predict the expected distribution function instead of a single future value. In the case of NO2, it allows the future probability of exceeding thresholds to be calculated and pollution peaks to be detected. However, there is a lack of comparative studies of probabilistic models in the field of air pollution. In this work, we thoroughly compared 10 state-of-the-art quantile regression models, using them to predict the distribution of NO2 concentrations at an urban location for a set of forecasting horizons (up to 60 hours into the future). Instead of using the quantiles directly, we derived from them the parameters of a predicted distribution, rendering this method semi-parametric. Amongst the models tested, quantile gradient boosted trees showed the best performance, yielding the best results for both the expected point value and the full distribution. However, we found that the simpler quantile k-nearest neighbors combined with a linear regression provided similar results with much lower training time and complexity.
format	article
author	Sebastien Pérez Vasseur José L. Aznarte
author_facet	Sebastien Pérez Vasseur José L. Aznarte
author_sort	Sebastien Pérez Vasseur
title	Comparing quantile regression methods for probabilistic forecasting of NO2 pollution levels
title_short	Comparing quantile regression methods for probabilistic forecasting of NO2 pollution levels
title_full	Comparing quantile regression methods for probabilistic forecasting of NO2 pollution levels
title_fullStr	Comparing quantile regression methods for probabilistic forecasting of NO2 pollution levels
title_full_unstemmed	Comparing quantile regression methods for probabilistic forecasting of NO2 pollution levels
title_sort	comparing quantile regression methods for probabilistic forecasting of no2 pollution levels
publisher	Nature Portfolio
publishDate	2021
url	https://doaj.org/article/9334a8d6092643f6a2ce0bf4cee80d06
work_keys_str_mv	AT sebastienperezvasseur comparingquantileregressionmethodsforprobabilisticforecastingofno2pollutionlevels AT joselaznarte comparingquantileregressionmethodsforprobabilisticforecastingofno2pollutionlevels
_version_	1718385341240442880
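
The semi-parametric step mentioned in the abstract (deriving the parameters of a predicted distribution from predicted quantiles, then reading off the chance of exceeding a threshold) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the record: the normal distribution family, the least-squares fit, the quantile levels, the example concentrations, the threshold of 180, and all function names are hypothetical.

```python
from statistics import NormalDist


def pinball_loss(y_true, y_pred, tau):
    """Quantile (pinball) loss of one prediction at quantile level tau."""
    diff = y_true - y_pred
    return tau * diff if diff >= 0 else (tau - 1) * diff


def fit_normal_to_quantiles(levels, values):
    """Fit a normal distribution to predicted quantiles (assumed family).

    Uses ordinary least squares on q_p = mu + sigma * z_p, where z_p is
    the standard normal quantile at level p.
    """
    z = [NormalDist().inv_cdf(p) for p in levels]
    n = len(z)
    z_mean = sum(z) / n
    q_mean = sum(values) / n
    sxx = sum((zi - z_mean) ** 2 for zi in z)
    sxy = sum((zi - z_mean) * (qi - q_mean) for zi, qi in zip(z, values))
    sigma = sxy / sxx            # slope of the fitted line
    mu = q_mean - sigma * z_mean  # intercept of the fitted line
    return NormalDist(mu, sigma)


def exceedance_probability(dist, threshold):
    """Probability that the forecast variable exceeds the threshold."""
    return 1.0 - dist.cdf(threshold)


# Hypothetical quantile forecasts for one horizon (made-up numbers,
# generated from a normal with mean 120 and standard deviation 30).
levels = [0.1, 0.25, 0.5, 0.75, 0.9]
predicted = [NormalDist(120.0, 30.0).inv_cdf(p) for p in levels]

dist = fit_normal_to_quantiles(levels, predicted)
p_exceed = exceedance_probability(dist, 180.0)  # hypothetical threshold
print(f"mu={dist.mean:.1f} sigma={dist.stdev:.1f} P(exceed)={p_exceed:.4f}")
```

In this sketch the quantile forecasts would come from any quantile regression model trained with the pinball loss; a different distribution family would only change the fitting function.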
