Sentiment Classification for Film Reviews by Reducing Additional Introduced Sentiment Bias

The film business and individual film reviews are inseparable, and film review sites such as IMDb are a credible source of publicly posted reviews. Because IMDb reviews are unstructured and bias-heavy, a classification method that reduces additionally introduced sentiment bias is needed to create a balanc...

Bibliographic Details
Main authors: Fery Ardiansyah Effendi, Yuliant Sibaroni
Format: article
Language: ID
Published: Ikatan Ahli Informatika Indonesia 2021
Subjects:
Online access: https://doaj.org/article/66590540a5d9479e997e0bddb74956b7
id oai:doaj.org-article:66590540a5d9479e997e0bddb74956b7
record_format dspace
spelling oai:doaj.org-article:66590540a5d9479e997e0bddb74956b7; 2021-11-16T13:16:12Z; Sentiment Classification for Film Reviews by Reducing Additional Introduced Sentiment Bias; 2580-0760; 10.29207/resti.v5i5.3400; https://doaj.org/article/66590540a5d9479e997e0bddb74956b7; 2021-10-01T00:00:00Z; http://jurnal.iaii.or.id/index.php/RESTI/article/view/3400; https://doaj.org/toc/2580-0760; Fery Ardiansyah Effendi; Yuliant Sibaroni; Ikatan Ahli Informatika Indonesia; article; sentiment classification, machine learning, ann, lexicon-based method, bat, so-cal.; Systems engineering; TA168; Information technology; T58.5-58.64; ID; Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi), Vol 5, Iss 5, Pp 863-875 (2021)
institution DOAJ
collection DOAJ
language ID
topic sentiment classification, machine learning, ann, lexicon-based method, bat, so-cal.
Systems engineering
TA168
Information technology
T58.5-58.64
description The film business and individual film reviews are inseparable, and film review sites such as IMDb are a credible source of publicly posted reviews. Because IMDb reviews are unstructured and bias-heavy, a classification method that reduces additionally introduced sentiment bias is needed to produce a balanced classification with lower polarity bias. Eliminating this additional sentiment bias improves the model because polarity is then determined by an unbiased method, so the model correctly identifies which word sequences are positive or negative. This research limits the dataset to 50,000 reviews randomly extracted from the IMDb website and prepares it with Preprocessing, POS-Tagging, and Word Embeddings. The preprocessed data is then used with the classification methods ANN, SWN, and SO-Cal. The paper also applies the bias processing methods Hyperparameter Tuning and BPM, and evaluates the outputs using the Accuracy and PBR metrics. This research yields 77.39% for ANN, 66.32% for BPM, 75.6% for SO-Cal, and 76.26% for Hybrid classification. The best PBR values come from the two lexicon-based methods: 0.0009 for BPM and 0.00006 for SO-Cal. A more advanced ANN configuration could improve the model, and more complex lexicon models are a direction for future research on this topic.
format article
author Fery Ardiansyah Effendi
Yuliant Sibaroni
title Sentiment Classification for Film Reviews by Reducing Additional Introduced Sentiment Bias
publisher Ikatan Ahli Informatika Indonesia
publishDate 2021
url https://doaj.org/article/66590540a5d9479e997e0bddb74956b7