Topic-Based Document-Level Sentiment Analysis Using Contextual Cues

Bibliographic Details
Main authors: Ciprian-Octavian Truică, Elena-Simona Apostol, Maria-Luiza Șerban, Adrian Paschke
Format: article
Language: EN
Published: MDPI AG 2021
Online access: https://doaj.org/article/fcc251ef6f8741c1bd58e33288986d24
Description
Summary: Document-level Sentiment Analysis is a complex task that involves analyzing large textual content that can contain multiple contradictory polarities at the phrase and word levels. Most current approaches either represent textual data using pre-trained word embeddings without considering the local context that can be extracted from the dataset, or detect the overall topic polarity without considering both the local and global context. In this paper, we propose a novel document-topic embedding model, DocTopic2Vec, for document-level polarity detection in large texts by employing general and specific contextual cues obtained through the use of document embeddings (Doc2Vec) and Topic Modeling. In our approach, (1) we use a large dataset of game reviews to create different word embeddings by applying Word2Vec, FastText, and GloVe, (2) we create Doc2Vec embeddings enriched with the local context given by the word embeddings for each review, (3) we construct topic embeddings (Topic2Vec) using three Topic Modeling algorithms, i.e., LDA, NMF, and LSI, to enhance the global context of the Sentiment Analysis task, and (4) for each document and its dominant topic, we build the new DocTopic2Vec by concatenating the Doc2Vec with the Topic2Vec created with the same word embedding. We also design six new Convolutional-based (Bidirectional) Recurrent Deep Neural Network architectures that show promising results for this task. The proposed DocTopic2Vec embeddings are used to benchmark multiple Machine and Deep Learning models, i.e., a Logistic Regression model, used as a baseline, and 18 Deep Neural Network architectures. The experimental results show that the new embedding and the new Deep Neural Network architectures achieve better results than the baseline, i.e., Logistic Regression and Doc2Vec.
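
The concatenation step described in the summary can be illustrated with a minimal Python sketch. It is not the authors' implementation: the summary does not say how word vectors are aggregated, so the sketch assumes that a document vector is the plain mean of its word embeddings and that a topic vector is the weight-averaged embedding of the topic's top words. All function names, the toy embeddings, and the toy topic weights are hypothetical. In practice the word embeddings would come from a trained Word2Vec, FastText, or GloVe model, and the topics from LDA, NMF, or LSI.

```python
from typing import Dict, List, Sequence, Tuple

Vector = List[float]


def weighted_mean(vectors: Sequence[Vector], weights: Sequence[float]) -> Vector:
    """Return the weighted mean of equal-length vectors."""
    if not vectors:
        raise ValueError("no in-vocabulary words to average")
    total = sum(weights)
    dim = len(vectors[0])
    return [sum(w * v[i] for v, w in zip(vectors, weights)) / total
            for i in range(dim)]


def doc2vec(tokens: Sequence[str], emb: Dict[str, Vector]) -> Vector:
    # ASSUMPTION: document vector = unweighted mean of known word vectors.
    known = [emb[t] for t in tokens if t in emb]
    return weighted_mean(known, [1.0] * len(known))


def topic2vec(top_words: Sequence[Tuple[str, float]],
              emb: Dict[str, Vector]) -> Vector:
    # ASSUMPTION: topic vector = mean of the topic's top-word vectors,
    # weighted by each word's weight within the topic.
    pairs = [(emb[w], wt) for w, wt in top_words if w in emb]
    return weighted_mean([v for v, _ in pairs], [wt for _, wt in pairs])


def dominant_topic(doc_topic_weights: Sequence[float]) -> int:
    """Index of the topic with the highest weight for a document."""
    return max(range(len(doc_topic_weights)),
               key=lambda k: doc_topic_weights[k])


def doctopic2vec(tokens: Sequence[str],
                 doc_topic_weights: Sequence[float],
                 topics: Sequence[Sequence[Tuple[str, float]]],
                 emb: Dict[str, Vector]) -> Vector:
    """Concatenate the document vector with its dominant topic's vector."""
    k = dominant_topic(doc_topic_weights)
    return doc2vec(tokens, emb) + topic2vec(topics[k], emb)


# Toy 2-dimensional embeddings and two toy topics (hypothetical values).
toy_emb = {"great": [1.0, 0.0], "boring": [0.0, 1.0], "story": [0.5, 0.5]}
toy_topics = [[("great", 3.0), ("story", 1.0)], [("boring", 1.0)]]

vec = doctopic2vec(["great", "story", "unseen"], [0.8, 0.2], toy_topics, toy_emb)
print(vec)
```

Because both halves are built from the same word embedding, the resulting vector has twice the embedding dimension. It could then be passed to any classifier, such as the Logistic Regression baseline mentioned in the summary.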