Can Fake News Detection Models Maintain the Performance through Time? A Longitudinal Evaluation of Twitter Publications
The negative impact of false information on social networks is rapidly growing. Current research on the topic focused on the detection of fake news in a particular context or event (such as elections) or using data from a short period of time. Therefore, an evaluation of the current proposals in a l...
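Main authors: | Nuno Guimarães, Álvaro Figueira, Luís Torgo |
---|---|
Format: | article |
Language: | EN |
Published: | MDPI AG, 2021 |
Subjects: | fake news detection; social networks; false information; machine learning; data mining; Mathematics; QA1-939 |
Online access: | https://doaj.org/article/01a7984f65744068a6f891df26d94ed9 |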
id |
oai:doaj.org-article:01a7984f65744068a6f891df26d94ed9 |
---|---|
record_format |
dspace |
spelling |
oai:doaj.org-article:01a7984f65744068a6f891df26d94ed9; 2021-11-25T18:17:50Z; Can Fake News Detection Models Maintain the Performance through Time? A Longitudinal Evaluation of Twitter Publications; 10.3390/math9222988; 2227-7390; https://doaj.org/article/01a7984f65744068a6f891df26d94ed9; 2021-11-01T00:00:00Z; https://www.mdpi.com/2227-7390/9/22/2988; https://doaj.org/toc/2227-7390; The negative impact of false information on social networks is rapidly growing. Current research on the topic focused on the detection of fake news in a particular context or event (such as elections) or using data from a short period of time. Therefore, an evaluation of the current proposals in a long-term scenario where the topics discussed may change is lacking. In this work, we deviate from current approaches to the problem and instead focus on a longitudinal evaluation using social network publications spanning an 18-month period. We evaluate different combinations of features and supervised models in a long-term scenario where the training and testing data are ordered chronologically, and thus the robustness and stability of the models can be evaluated through time. We experimented with 3 different scenarios where the models are trained with 15-, 30-, and 60-day data periods. The results show that detection models trained with word-embedding features are the ones that perform better and are less likely to be affected by the change of topics (for example, the rise of COVID-19 conspiracy theories). Furthermore, the additional days of training data also increase the performance of the best feature/model combinations, although not very significantly (around 2%). The results presented in this paper build the foundations towards a more pragmatic approach to the evaluation of fake news detection models in social networks.; Nuno Guimarães; Álvaro Figueira; Luís Torgo; MDPI AG; article; fake news detection; social networks; false information; machine learning; data mining; Mathematics; QA1-939; EN; Mathematics, Vol 9, Iss 2988, p 2988 (2021) |
institution |
DOAJ |
collection |
DOAJ |
language |
EN |
topic |
fake news detection social networks false information machine learning data mining Mathematics QA1-939 |
spellingShingle |
fake news detection social networks false information machine learning data mining Mathematics QA1-939 Nuno Guimarães Álvaro Figueira Luís Torgo Can Fake News Detection Models Maintain the Performance through Time? A Longitudinal Evaluation of Twitter Publications |
description |
The negative impact of false information on social networks is rapidly growing. Current research on the topic focused on the detection of fake news in a particular context or event (such as elections) or using data from a short period of time. Therefore, an evaluation of the current proposals in a long-term scenario where the topics discussed may change is lacking. In this work, we deviate from current approaches to the problem and instead focus on a longitudinal evaluation using social network publications spanning an 18-month period. We evaluate different combinations of features and supervised models in a long-term scenario where the training and testing data are ordered chronologically, and thus the robustness and stability of the models can be evaluated through time. We experimented with 3 different scenarios where the models are trained with 15-, 30-, and 60-day data periods. The results show that detection models trained with word-embedding features are the ones that perform better and are less likely to be affected by the change of topics (for example, the rise of COVID-19 conspiracy theories). Furthermore, the additional days of training data also increase the performance of the best feature/model combinations, although not very significantly (around 2%). The results presented in this paper build the foundations towards a more pragmatic approach to the evaluation of fake news detection models in social networks. |
format |
article |
author |
Nuno Guimarães Álvaro Figueira Luís Torgo |
author_facet |
Nuno Guimarães Álvaro Figueira Luís Torgo |
author_sort |
Nuno Guimarães |
title |
Can Fake News Detection Models Maintain the Performance through Time? A Longitudinal Evaluation of Twitter Publications |
title_sort |
can fake news detection models maintain the performance through time? a longitudinal evaluation of twitter publications |
publisher |
MDPI AG |
publishDate |
2021 |
url |
https://doaj.org/article/01a7984f65744068a6f891df26d94ed9 |
work_keys_str_mv |
AT nunoguimaraes canfakenewsdetectionmodelsmaintaintheperformancethroughtimealongitudinalevaluationoftwitterpublications AT alvarofigueira canfakenewsdetectionmodelsmaintaintheperformancethroughtimealongitudinalevaluationoftwitterpublications AT luistorgo canfakenewsdetectionmodelsmaintaintheperformancethroughtimealongitudinalevaluationoftwitterpublications |
_version_ |
1718411358427414528 |
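
The abstract describes an evaluation in which training and testing data are ordered chronologically and models are trained on 15-, 30-, and 60-day periods. This record does not give the exact protocol, so the following is only a minimal sketch of one way such a split could look, assuming the training window starts at the earliest publication and everything after it is used for testing. The `Post` type, the `chronological_split` helper, and the toy data are hypothetical and are not taken from the paper.

```python
from datetime import date, timedelta
from typing import List, NamedTuple, Tuple


class Post(NamedTuple):
    """Hypothetical record: publication date, text, and a fake/real label."""
    published: date
    text: str
    is_fake: bool


def chronological_split(
    posts: List[Post], train_days: int
) -> Tuple[List[Post], List[Post]]:
    """Train on the first `train_days` days of data, test on everything later.

    Assumption: the window is anchored at the earliest publication date.
    """
    if not posts:
        return [], []
    ordered = sorted(posts, key=lambda p: p.published)
    cutoff = ordered[0].published + timedelta(days=train_days)
    train = [p for p in ordered if p.published < cutoff]
    test = [p for p in ordered if p.published >= cutoff]
    return train, test


# Toy data, made up for illustration: one post every 10 days.
start = date(2020, 1, 1)
posts = [
    Post(start + timedelta(days=10 * i), f"post {i}", i % 2 == 0)
    for i in range(10)
]

# The three training-period lengths named in the abstract.
for days in (15, 30, 60):
    train, test = chronological_split(posts, days)
    print(days, len(train), len(test))
```

Because the split is by date and not by random sampling, no test publication predates a training publication, which is what allows the stability of a model to be observed as topics change over time.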