Sentiment Classification for Film Reviews by Reducing Additional Introduced Sentiment Bias
The film business cannot be separated from reviews of individual films, and film review sites such as IMDb are a credible source of reviews posted in public forums. Because IMDb reviews are unstructured and bias-heavy, classification methods that reduce additional sentiment bias are needed to create a balanc...
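Main authors: | Fery Ardiansyah Effendi, Yuliant Sibaroni |
---|---|
Format: | article |
Language: | ID |
Published: | Ikatan Ahli Informatika Indonesia, 2021 |
Subjects: | sentiment classification, machine learning, ann, lexicon-based method, bat, so-cal; Systems engineering; Information technology |
Online access: | https://doaj.org/article/66590540a5d9479e997e0bddb74956b7 |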
id |
oai:doaj.org-article:66590540a5d9479e997e0bddb74956b7 |
---|---|
record_format |
dspace |
spelling |
oai:doaj.org-article:66590540a5d9479e997e0bddb74956b7; 2021-11-16T13:16:12Z; Sentiment Classification for Film Reviews by Reducing Additional Introduced Sentiment Bias; 2580-0760; 10.29207/resti.v5i5.3400; https://doaj.org/article/66590540a5d9479e997e0bddb74956b7; 2021-10-01T00:00:00Z; http://jurnal.iaii.or.id/index.php/RESTI/article/view/3400; https://doaj.org/toc/2580-0760; Fery Ardiansyah Effendi; Yuliant Sibaroni; Ikatan Ahli Informatika Indonesia; article; Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi), Vol 5, Iss 5, Pp 863-875 (2021) |
institution |
DOAJ |
collection |
DOAJ |
language |
ID |
topic |
sentiment classification, machine learning, ann, lexicon-based method, bat, so-cal; Systems engineering (TA168); Information technology (T58.5-58.64) |
description |
The film business cannot be separated from reviews of individual films, and film review sites such as IMDb are a credible source of reviews posted in public forums. Because IMDb reviews are unstructured and bias-heavy, classification methods that reduce additional sentiment bias are needed to produce a balanced classification with lower polarity bias. Eliminating additional sentiment bias improves the model because polarity is determined by an unbiased method, so the model correctly identifies which word sequences are positive or negative. This research limits the dataset to 50,000 rows of randomly extracted reviews from the IMDb website and prepares it with preprocessing, POS tagging, and word embeddings. The preprocessed data is then used in classification methods such as ANN, SWN, and SO-Cal. This paper also uses bias processing methods such as hyperparameter tuning and BPM, with outputs evaluated using accuracy and PBR metrics. The research yields accuracies of 77.39% for ANN, 66.32% for BPM, 75.6% for SO-Cal, and 76.26% for Hybrid classification. The best PBR values come from the two lexicon-based methods: 0.0009 for BPM and 0.00006 for SO-Cal. A more advanced ANN configuration could improve the model, and more complex lexicon models are a direction for future research. |
format |
article |
author |
Fery Ardiansyah Effendi; Yuliant Sibaroni |
title |
Sentiment Classification for Film Reviews by Reducing Additional Introduced Sentiment Bias |
publisher |
Ikatan Ahli Informatika Indonesia |
publishDate |
2021 |
url |
https://doaj.org/article/66590540a5d9479e997e0bddb74956b7 |