Cyberbullying Detection: Hybrid Models Based on Machine Learning and Natural Language Processing Techniques

The rise in web and social media interactions has resulted in the effortless proliferation of offensive language and hate speech. Such online harassment, insults, and attacks are commonly termed cyberbullying. The sheer volume of user-generated content has made it challenging to identify such illicit...

Bibliographic Details
Main authors:	Chahat Raj, Ayush Agarwal, Gnana Bharathy, Bhuva Narayan, Mukesh Prasad
Format:	article
Language:	EN
Published:	MDPI AG 2021
Subjects:	cyberbullying; hate speech; offensive language; machine learning; neural networks; deep learning; Electronics; TK7800-8360
Online access:	https://doaj.org/article/b7ac124425df485898e1232de3eb6bf3

id	oai:doaj.org-article:b7ac124425df485898e1232de3eb6bf3
record_format	dspace
spelling	oai:doaj.org-article:b7ac124425df485898e1232de3eb6bf3; 2021-11-25T17:24:50Z; Cyberbullying Detection: Hybrid Models Based on Machine Learning and Natural Language Processing Techniques; 10.3390/electronics10222810; 2079-9292; https://doaj.org/article/b7ac124425df485898e1232de3eb6bf3; 2021-11-01T00:00:00Z; https://www.mdpi.com/2079-9292/10/22/2810; https://doaj.org/toc/2079-9292; Chahat Raj; Ayush Agarwal; Gnana Bharathy; Bhuva Narayan; Mukesh Prasad; MDPI AG; article; cyberbullying; hate speech; offensive language; machine learning; neural networks; deep learning; Electronics; TK7800-8360; EN; Electronics, Vol 10, Iss 2810, p 2810 (2021)
institution	DOAJ
collection	DOAJ
language	EN
topic	cyberbullying; hate speech; offensive language; machine learning; neural networks; deep learning; Electronics; TK7800-8360
description	The rise in web and social media interactions has resulted in the effortless proliferation of offensive language and hate speech. Such online harassment, insults, and attacks are commonly termed cyberbullying. The sheer volume of user-generated content has made it challenging to identify such illicit content. Machine learning has wide applications in text classification, and researchers are shifting towards using deep neural networks in detecting cyberbullying due to the several advantages they have over traditional machine learning algorithms. This paper proposes a novel neural network framework with parameter optimization and an algorithmic comparative study of eleven classification methods: four traditional machine learning and seven shallow neural networks on two real-world cyberbullying datasets. This paper also examines the effect of feature extraction and word-embedding techniques from natural language processing on algorithmic performance. Key observations from this study show that bidirectional neural networks and attention models provide high classification results. Logistic Regression was observed to be the best among the traditional machine learning classifiers used. Term Frequency-Inverse Document Frequency (TF-IDF) demonstrates consistently high accuracies with traditional machine learning techniques. Global Vectors (GloVe) perform better with neural network models. Bi-GRU and Bi-LSTM worked best amongst the neural networks used. The extensive experiments performed on the two datasets establish the importance of this work by comparing eleven classification methods and seven feature extraction techniques. Our proposed shallow neural networks outperform existing state-of-the-art approaches for cyberbullying detection, with accuracy and F1-scores as high as ~95% and ~98%, respectively.
format	article
author	Chahat Raj Ayush Agarwal Gnana Bharathy Bhuva Narayan Mukesh Prasad
author_sort	Chahat Raj
title	Cyberbullying Detection: Hybrid Models Based on Machine Learning and Natural Language Processing Techniques
publisher	MDPI AG
publishDate	2021
url	https://doaj.org/article/b7ac124425df485898e1232de3eb6bf3
_version_	1718412431166799872
