Asthma-prone areas modeling using a machine learning model

Bibliographic Details
Main Authors: Seyed Vahid Razavi-Termeh, Abolghasem Sadeghi-Niaraki, Soo-Mi Choi
Format: article
Language: EN
Published: Nature Portfolio 2021
Subjects:
R
Q
Online Access: https://doaj.org/article/8d6decd62e484710947f6961f8c0eafd
Description
Summary: Owing to population growth, increasing environmental pollution, and lifestyle changes, the number of asthmatics has increased significantly. The purpose of our study was therefore to determine the asthma-prone areas in Tehran, Iran, considering environmental and spatial factors. Initially, we built a spatial database using 872 locations of children with asthma and 13 environmental factors affecting the disease: distance to parks and streets, rainfall, temperature, humidity, pressure, wind speed, particulate matter (PM10 and PM2.5), ozone (O3), sulfur dioxide (SO2), carbon monoxide (CO), and nitrogen dioxide (NO2). Subsequently, using this spatial database, a random forest (RF) machine learning model, and a geographic information system, we prepared a map of asthma-prone areas. For modeling and validation, we used 70% and 30%, respectively, of the locations of children with asthma. The results of the spatial autocorrelation analysis and the RF model showed that distance to parks and streets, as well as PM2.5 and PM10, had the greatest impact on asthma occurrence in the study area. Spatial autocorrelation analyses also indicated that the distribution of asthma cases was not random. According to the receiver operating characteristic results, the RF model had good accuracy (the area under the curve was 0.987 for the training data and 0.921 for the testing data).
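The 70%/30% split and the area-under-the-curve measure mentioned in the summary can be illustrated with a minimal sketch. It is not the authors' code and does not train a random forest: the record IDs, the random seed, the toy labels and scores, and the function names `split_70_30` and `roc_auc` are all assumptions made for illustration. Only the count of 872 locations and the 70/30 proportions come from the summary.

```python
import random


def split_70_30(records, seed=0):
    """Shuffle records and return (training, testing) parts in a 70/30 ratio."""
    rng = random.Random(seed)          # fixed seed (assumed) for repeatability
    shuffled = list(records)
    rng.shuffle(shuffled)
    cut = round(0.7 * len(shuffled))   # size of the training part
    return shuffled[:cut], shuffled[cut:]


def roc_auc(labels, scores):
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive case gets a higher score than a randomly chosen negative
    case (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Placeholder IDs standing in for the 872 case locations in the summary.
case_ids = list(range(872))
train_ids, test_ids = split_70_30(case_ids)
print(len(train_ids), len(test_ids))

# Toy labels and scores (hypothetical, not study data): 1 = asthma location,
# 0 = comparison location; scores play the role of model susceptibility output.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.1]
print(roc_auc(toy_labels, toy_scores))
```

In a full workflow the scores would come from a trained classifier applied separately to the training and testing parts, which is how two area-under-the-curve values such as those reported in the summary arise.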