Multi-view learning for software defect prediction

Bibliographic details
Main authors: Elife Ozturk Kiyak, Derya Birant, Kokten Ulas Birant
Format: Article
Language: EN
Published: Wroclaw University of Science and Technology, 2021
Online access: https://doaj.org/article/24ab5bfe8ea24ec68f62a57a46c2184d
Description
Abstract: Background: Traditionally, machine learning algorithms have been applied to software defect prediction using single-view data, meaning that each input instance is represented by a single feature vector. However, different software engineering data sources may provide multiple, partially independent kinds of information, which makes the standard single-view approaches ineffective. Objective: To overcome the single-view limitation of current studies, this article proposes using a multi-view learning method for software defect classification problems. Method: The Multi-View k-Nearest Neighbors (MVKNN) method is applied in the software engineering field. In this method, base classifiers are first constructed to learn from each view, and these classifiers are then combined to create a robust multi-view model. Results: In the experimental studies, our algorithm (MVKNN) is compared with the standard k-nearest neighbors (KNN) algorithm on 50 datasets obtained from different software bug repositories. The experimental results demonstrate that MVKNN outperformed KNN on most of the datasets in terms of accuracy. The average accuracy of MVKNN is 86.59%, 88.09%, and 83.10% on the NASA MDP, Softlab, and OSSP datasets, respectively. Conclusion: The results show that using multiple views (MVKNN) can usually improve classification accuracy compared to a single-view strategy (KNN) for software defect prediction.
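
The two-step idea in the Method part of the abstract (one base classifier per view, then a combination step) can be illustrated with a minimal sketch. The abstract does not state the distance metric or how the per-view classifiers are combined, so the Euclidean distance and the majority vote across views used here are assumptions, not the authors' implementation. The function names (`knn_predict`, `mvknn_predict`), the three views, and the toy data are all hypothetical.

```python
from collections import Counter
import math


def knn_predict(train_X, train_y, x, k):
    """Label x by majority vote among its k nearest training points (one view)."""
    # Rank training indices by Euclidean distance to the query (assumed metric).
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def mvknn_predict(train_views, train_y, query_views, k=3):
    """Run one KNN per view, then combine the per-view labels by majority vote.

    The majority vote is an assumed combination rule for illustration only.
    """
    per_view = [
        knn_predict(view_X, train_y, x, k)
        for view_X, x in zip(train_views, query_views)
    ]
    return Counter(per_view).most_common(1)[0][0]


# Hypothetical toy data: six modules, label 1 = defective, 0 = clean.
train_y = [0, 0, 0, 1, 1, 1]
view_size = [[10], [12], [11], [50], [55], [52]]                 # e.g. size metrics
view_complexity = [[1, 2], [2, 1], [1, 1], [8, 9], [9, 8], [9, 9]]  # e.g. complexity metrics
view_noisy = [[5], [6], [7], [1], [2], [6.5]]                    # a weakly informative view

# The same unseen module described in each of the three views.
query = [[51], [8, 8], [6]]

prediction = mvknn_predict([view_size, view_complexity, view_noisy], train_y, query, k=3)
```

In this toy case the weakly informative view alone votes for the clean class, while the other two views outvote it and the combined prediction is the defective class. With an odd number of views and two classes, the vote across views cannot tie.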