Improving sentiment analysis accuracy with emoji embedding

Due to the diversity and variability of Chinese syntax and semantics, accurately identifying and distinguishing individual emotions from online texts is challenging. To overcome this limitation, we incorporate a new source of individual sentiment, emojis, which contain thousands of graphic symbols a...

Bibliographic Details
Main authors:	Chuchu Liu, Fan Fang, Xu Lin, Tie Cai, Xu Tan, Jianguo Liu, Xin Lu
Format:	article
Language:	EN
Published:	KeAi Communications Co., Ltd. 2021
Subjects:	Sentiment analysis; Emoji; CEmo-LSTM; Sentiment evolution; COVID-19; Risk in industry. Risk management; HD61
Online access:	https://doaj.org/article/38a347918b224dd0a226ef038b10de03

id	oai:doaj.org-article:38a347918b224dd0a226ef038b10de03
record_format	dspace
spelling	oai:doaj.org-article:38a347918b224dd0a226ef038b10de03; 2021-11-10T04:41:44Z; Improving sentiment analysis accuracy with emoji embedding; ISSN 2666-4496; DOI 10.1016/j.jnlssr.2021.10.003; https://doaj.org/article/38a347918b224dd0a226ef038b10de03; 2021-12-01T00:00:00Z; http://www.sciencedirect.com/science/article/pii/S2666449621000529; https://doaj.org/toc/2666-4496; Chuchu Liu; Fan Fang; Xu Lin; Tie Cai; Xu Tan; Jianguo Liu; Xin Lu; KeAi Communications Co., Ltd.; article; Sentiment analysis; Emoji; CEmo-LSTM; Sentiment evolution; COVID-19; Risk in industry. Risk management; HD61; EN; Journal of Safety Science and Resilience, Vol 2, Iss 4, Pp 246-252 (2021)
institution	DOAJ
collection	DOAJ
language	EN
topic	Sentiment analysis; Emoji; CEmo-LSTM; Sentiment evolution; COVID-19; Risk in industry. Risk management; HD61
description	Due to the diversity and variability of Chinese syntax and semantics, accurately identifying and distinguishing individual emotions from online texts is challenging. To overcome this limitation, we incorporate a new source of individual sentiment, emojis, which contain thousands of graphic symbols and are increasingly being used for expressing emotion in online conversations. We examined popular sentiment analysis algorithms, including rule-based and classification algorithms, to evaluate the impact of supplementing emojis as additional features to improve the algorithm performance. Emojis were also translated into corresponding sentiment words when constructing features for comparison with those directly generated from emoji label words. In addition, considering different functions of emojis in texts, we classified all posts in the dataset by their emoji usage and examined the changes in algorithm performance. We found that emojis are effective as expanding features for improving the accuracy of sentiment analysis algorithms, and the algorithm performance can be further increased by taking different emoji usages into consideration. In this study, we developed an improved emoji-embedding model based on Bi-LSTM (namely, CEmo-LSTM), which achieves the highest accuracy (around 0.95) when analyzing online Chinese texts. We applied the CEmo-LSTM algorithm to a large dataset collected from Weibo from December 1, 2019 to March 20, 2020 to understand the sentiment evolution of online users during the COVID-19 pandemic. We found that the pandemic remarkably impacted individual sentiments and caused more passive emotions (e.g., horror and sadness). Our novel emoji-embedding algorithm creatively combined emojis as well as emoji usage with the sentiment analysis model and can handle emotion mining tasks more effectively and efficiently.
format	article
author	Chuchu Liu, Fan Fang, Xu Lin, Tie Cai, Xu Tan, Jianguo Liu, Xin Lu
author_facet	Chuchu Liu, Fan Fang, Xu Lin, Tie Cai, Xu Tan, Jianguo Liu, Xin Lu
author_sort	Chuchu Liu
title	Improving sentiment analysis accuracy with emoji embedding
publisher	KeAi Communications Co., Ltd.
publishDate	2021
url	https://doaj.org/article/38a347918b224dd0a226ef038b10de03
