Google Play Content Scraping and Knowledge Engineering using Natural Language Processing Techniques with the Analysis of User Reviews

To maintain the competitive edge and evaluating the needs of the quality app is in the mobile application market. The user’s feedback on these applications plays an essential role in the mobile application development industry. The rapid growth of web technology gave people an opportunity to interac...

Descripción completa

Guardado en:
Detalles Bibliográficos
Autores principales: Aldabbas Hamza, Bajahzar Abdullah, Alruily Meshrif, Qureshi Ali Adil, Amir Latif Rana M., Farhan Muhammad
Formato: article
Lenguaje:EN
Publicado: De Gruyter 2020
Materias:
Q
Acceso en línea:https://doaj.org/article/9e3a8b38ae74488db5704ee4ad77c4c2
Etiquetas: Agregar Etiqueta
Sin Etiquetas, Sea el primero en etiquetar este registro!
Descripción
Sumario:To maintain the competitive edge and evaluating the needs of the quality app is in the mobile application market. The user’s feedback on these applications plays an essential role in the mobile application development industry. The rapid growth of web technology gave people an opportunity to interact and express their review, rate and share their feedback about applications. In this paper we have scrapped 506259 of user reviews and applications rate from Google Play Store from 14 different categories. The statistical information was measured in the results using different of common machine learning algorithms such as the Logistic Regression, Random Forest Classifier, and Multinomial Naïve Bayes. Different parameters including the accuracy, precision, recall, and F1 score were used to evaluate Bigram, Trigram, and N-gram, and the statistical result of these algorithms was compared. The analysis of each algorithm, one by one, is performed, and the result has been evaluated. It is concluded that logistic regression is the best algorithm for review analysis of the Google Play Store applications. The results have been checked scientifically, and it is found that the accuracy of the logistic regression algorithm for analyzing different reviews based on three classes, i.e., positive, negative, and neutral.