Prediction of dengue incidence using search query surveillance.
Background: The use of internet search data has been demonstrated to be effective at predicting influenza incidence. This approach may be more successful for dengue which has large variation in annual incidence and a more distinctive clinical presentation and mode of transmission....
Main Authors: | Benjamin M Althouse, Yih Yng Ng, Derek A T Cummings |
---|---|
Format: | article |
Language: | EN |
Published: | Public Library of Science (PLoS), 2011 |
Subjects: | Arctic medicine. Tropical medicine; Public aspects of medicine |
Online Access: | https://doaj.org/article/9a90bbf1c9cb4598894b346a7cc837f8 |
id |
oai:doaj.org-article:9a90bbf1c9cb4598894b346a7cc837f8 |
---|---|
record_format |
dspace |
spelling |
oai:doaj.org-article:9a90bbf1c9cb4598894b346a7cc837f8; 2021-11-18T09:13:09Z; Prediction of dengue incidence using search query surveillance.; ISSN 1935-2727, 1935-2735; DOI 10.1371/journal.pntd.0001258; https://doaj.org/article/9a90bbf1c9cb4598894b346a7cc837f8; 2011-08-01T00:00:00Z; https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/21829744/?tool=EBI; https://doaj.org/toc/1935-2727; https://doaj.org/toc/1935-2735; Benjamin M Althouse; Yih Yng Ng; Derek A T Cummings; Public Library of Science (PLoS); article; Arctic medicine. Tropical medicine; RC955-962; Public aspects of medicine; RA1-1270; EN; PLoS Neglected Tropical Diseases, Vol 5, Iss 8, p e1258 (2011) |
institution |
DOAJ |
collection |
DOAJ |
language |
EN |
topic |
Arctic medicine. Tropical medicine; RC955-962; Public aspects of medicine; RA1-1270 |
spellingShingle |
Arctic medicine. Tropical medicine RC955-962 Public aspects of medicine RA1-1270 Benjamin M Althouse Yih Yng Ng Derek A T Cummings Prediction of dengue incidence using search query surveillance. |
description |
Background: The use of internet search data has been demonstrated to be effective at predicting influenza incidence. This approach may be more successful for dengue, which has large variation in annual incidence and a more distinctive clinical presentation and mode of transmission. Methods: We gathered freely available dengue incidence data from Singapore (weekly incidence, 2004-2011) and Bangkok (monthly incidence, 2004-2011). Internet search data for the same period were downloaded from Google Insights for Search. Search terms were chosen to reflect three categories of dengue-related search: nomenclature, signs/symptoms, and treatment. We compared three models to predict incidence: a step-down linear regression, generalized boosted regression, and negative binomial regression. Logistic regression and Support Vector Machine (SVM) models were used to predict a binary outcome defined by whether dengue incidence exceeded a chosen threshold. Incidence prediction models were assessed using r² and Pearson correlation between predicted and observed dengue incidence. Logistic and SVM model performance was assessed by the area under the receiver operating characteristic curve (AUC). Models were validated using multiple cross-validation techniques. Results: The linear model selected by AIC step-down was found to be superior to the other models considered. In Bangkok, the model has an r² = 0.943 and a correlation of 0.869 between fitted and observed. In Singapore, the model has an r² = 0.948 and a correlation of 0.931. In both Singapore and Bangkok, SVM models outperformed logistic regression in predicting periods of high incidence. The AUC for the SVM models using the 75th percentile cutoff is 0.906 in Singapore and 0.960 in Bangkok. Conclusions: Internet search terms predict incidence and periods of high incidence of dengue with high accuracy and may prove useful in areas with underdeveloped surveillance systems. The methods presented here use freely available data and analysis tools and can be readily adapted to other settings. |
format |
article |
author |
Benjamin M Althouse; Yih Yng Ng; Derek A T Cummings |
author_facet |
Benjamin M Althouse; Yih Yng Ng; Derek A T Cummings |
author_sort |
Benjamin M Althouse |
title |
Prediction of dengue incidence using search query surveillance. |
title_short |
Prediction of dengue incidence using search query surveillance. |
title_full |
Prediction of dengue incidence using search query surveillance. |
title_fullStr |
Prediction of dengue incidence using search query surveillance. |
title_full_unstemmed |
Prediction of dengue incidence using search query surveillance. |
title_sort |
prediction of dengue incidence using search query surveillance. |
publisher |
Public Library of Science (PLoS) |
publishDate |
2011 |
url |
https://doaj.org/article/9a90bbf1c9cb4598894b346a7cc837f8 |
work_keys_str_mv |
AT benjaminmalthouse predictionofdengueincidenceusingsearchquerysurveillance AT yihyngng predictionofdengueincidenceusingsearchquerysurveillance AT derekatcummings predictionofdengueincidenceusingsearchquerysurveillance |
_version_ |
1718420980247822336 |
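
The abstract in this record names three evaluation measures: r², the Pearson correlation between predicted and observed incidence, and the AUC for a binary "high incidence" outcome defined by a 75th percentile cutoff. The following is a minimal sketch of how such measures could be computed, not the authors' code. The function names (`pearson`, `r_squared`, `auc`) and the toy numbers are hypothetical and chosen only for illustration; the record does not say how the percentile was calculated, so the quantile method used here is an assumption.

```python
from math import sqrt
from statistics import mean, quantiles


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mo = mean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mo) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


def auc(labels, scores):
    """Area under the ROC curve, computed as the probability that a random
    positive case scores higher than a random negative one (ties count 0.5)."""
    pos = [s for flag, s in zip(labels, scores) if flag]
    neg = [s for flag, s in zip(labels, scores) if not flag]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy series (NOT data from the study): observed case counts
# and the values some fitted model might have predicted for the same periods.
observed = [12, 15, 20, 35, 60, 90, 70, 40, 25, 18, 14, 11]
predicted = [14, 13, 22, 30, 55, 85, 75, 45, 20, 20, 12, 13]

# Assumption: the 75th percentile is the third quartile cut point returned by
# statistics.quantiles with its default method.
threshold = quantiles(observed, n=4)[2]
high_incidence = [o > threshold for o in observed]

print("threshold:", threshold)
print("pearson:", round(pearson(observed, predicted), 3))
print("r squared:", round(r_squared(observed, predicted), 3))
print("AUC:", auc(high_incidence, predicted))
```

In this sketch the predicted values double as the classifier score, which keeps the example short. The study instead fitted separate logistic regression and SVM models for the binary outcome, so the score passed to `auc` would come from those models.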