Benchmarking Deep Learning Methods for Aspect Level Sentiment Classification

With advances in processing units and the easy availability of cloud-based GPU servers, many deep learning-based methods have been proposed in the Aspect Level Sentiment Classification (ALSC) literature. With this increase in the number of deep learning methods proposed in ALSC literature, it has b...

Bibliographic Details
Main Authors: Tanu Sharma, Kamaldeep Kaur
Format: article
Language: EN
Published: MDPI AG 2021
Subjects:
T
Online Access: https://doaj.org/article/f1366d1f1c8e487da592fa4f26e241b8
id oai:doaj.org-article:f1366d1f1c8e487da592fa4f26e241b8
record_format dspace
spelling oai:doaj.org-article:f1366d1f1c8e487da592fa4f26e241b8
2021-11-25T16:31:07Z
Benchmarking Deep Learning Methods for Aspect Level Sentiment Classification
10.3390/app112210542
2076-3417
https://doaj.org/article/f1366d1f1c8e487da592fa4f26e241b8
2021-11-01T00:00:00Z
https://www.mdpi.com/2076-3417/11/22/10542
https://doaj.org/toc/2076-3417
Applied Sciences, Vol 11, Iss 10542, p 10542 (2021)
institution DOAJ
collection DOAJ
language EN
topic aspect based sentiment analysis (ABSA)
aspect level sentiment classification (ALSC)
deep learning
target dependent sentiment classification
neural networks
statistical tests
Technology
T
Engineering (General). Civil engineering (General)
TA1-2040
Biology (General)
QH301-705.5
Physics
QC1-999
Chemistry
QD1-999
description With advances in processing units and the easy availability of cloud-based GPU servers, many deep learning-based methods have been proposed in the Aspect Level Sentiment Classification (ALSC) literature. As the number of such methods has grown, it has become difficult to ascertain the performance difference between one method and another. To this end, our study provides a statistical comparison of the performance of 35 recent deep learning methods with respect to three performance metrics: Accuracy, Macro F1 score, and Time. The methods are evaluated on eight benchmark datasets. The statistical comparison is based on the Friedman, Nemenyi, and Wilcoxon tests. According to the results of these tests, the top-ranking methods could not significantly outperform several other methods in terms of Accuracy and Macro F1 score, and they performed poorly on the Time metric. Since the time taken by a method is crucial to its overall performance, this study aids the selection of a deep learning method that maximizes Accuracy and Macro F1 score while taking minimal time. Our study also establishes a framework for validating the performance of new and alternative methods in ALSC, which can be helpful for researchers and practitioners working in this area.
format article
author Tanu Sharma
Kamaldeep Kaur
title Benchmarking Deep Learning Methods for Aspect Level Sentiment Classification
publisher MDPI AG
publishDate 2021
url https://doaj.org/article/f1366d1f1c8e487da592fa4f26e241b8
_version_ 1718413174802219008
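
The description in this record names the Friedman and Nemenyi tests as the basis for comparing methods across datasets. The following is a minimal sketch of how such a rank-based comparison is commonly set up, not the authors' implementation. The function names, the three hypothetical methods, and the accuracy figures are all made up for illustration. The Friedman statistic is computed without a tie correction, and the value 2.343 passed as q_alpha is assumed to be the tabulated critical value for three methods at the 0.05 level; it should be checked against a published table before use. The Wilcoxon test also mentioned in the description is not covered here.

```python
import math
from typing import List, Sequence


def rank_row(scores: Sequence[float], higher_is_better: bool = True) -> List[float]:
    """Rank the methods on one dataset; rank 1 is best, ties share the average rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=higher_is_better)
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the group while the next method has the same score.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        shared_rank = (pos + end) / 2 + 1
        for j in range(pos, end + 1):
            ranks[order[j]] = shared_rank
        pos = end + 1
    return ranks


def average_ranks(table: Sequence[Sequence[float]], higher_is_better: bool = True) -> List[float]:
    """Average rank of each method (column) over all datasets (rows)."""
    rows = [rank_row(row, higher_is_better) for row in table]
    n_datasets = len(rows)
    n_methods = len(rows[0])
    return [sum(row[j] for row in rows) / n_datasets for j in range(n_methods)]


def friedman_statistic(avg_ranks: Sequence[float], n_datasets: int) -> float:
    """Friedman chi-square statistic from average ranks (no tie correction)."""
    k = len(avg_ranks)
    sum_sq = sum(r * r for r in avg_ranks)
    return 12 * n_datasets / (k * (k + 1)) * (sum_sq - k * (k + 1) ** 2 / 4)


def nemenyi_critical_difference(n_methods: int, n_datasets: int, q_alpha: float) -> float:
    """Two methods differ significantly if their average ranks differ by more than this."""
    return q_alpha * math.sqrt(n_methods * (n_methods + 1) / (6 * n_datasets))


# Hypothetical accuracies: rows are datasets, columns are methods A, B, C.
accuracy_table = [
    [0.80, 0.75, 0.70],
    [0.82, 0.78, 0.71],
    [0.79, 0.80, 0.65],
    [0.85, 0.81, 0.77],
]

ranks = average_ranks(accuracy_table)
chi_square = friedman_statistic(ranks, n_datasets=len(accuracy_table))
# q_alpha = 2.343 is an assumed table value for 3 methods at alpha = 0.05.
cd = nemenyi_critical_difference(n_methods=3, n_datasets=len(accuracy_table), q_alpha=2.343)
```

For a metric where lower is better, such as Time, the same functions can be called with higher_is_better=False so that the fastest method receives rank 1.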