T-DFNN: An Incremental Learning Algorithm for Intrusion Detection Systems

Bibliographic Details
Main authors: Mahendra Data, Masayoshi Aritsugi
Format: article
Language: EN
Published: IEEE 2021
Online access: https://doaj.org/article/8f4706cd1bd04b1798cb58e41fd88fe9
Description
Summary: Machine learning has recently become a popular approach for building reliable intrusion detection systems (IDSs). However, most models are static and are trained on datasets containing all targeted intrusions. If new intrusions emerge, these trained models must be retrained on both the old and new datasets to classify all intrusions accurately. In real-world situations, new threats appear continuously. Therefore, machine learning algorithms used for IDSs should be able to learn incrementally as new intrusions emerge. To address this issue, we propose T-DFNN, an algorithm capable of learning new intrusions incrementally as they emerge. A T-DFNN model is composed of multiple deep feedforward neural network (DFNN) models connected in a tree-like structure. We evaluated the proposed algorithm on CICIDS2017, an open and widely used network intrusion dataset covering benign traffic and the most common network intrusions. The experimental results showed that the T-DFNN algorithm can incrementally learn new intrusions and reduce the catastrophic forgetting effect. The macro-averaged F1-score of the T-DFNN model was over 0.85 for every retraining process. The T-DFNN model also has advantages over other models. Compared to the DFNN and Hoeffding tree models trained on a dataset containing only the latest targeted intrusions, the T-DFNN model has higher F1-scores. Moreover, it has significantly shorter training times than a DFNN model trained on a dataset containing all targeted intrusions. Even though several factors can affect the duration of the training process, the T-DFNN algorithm shows promising results in solving the problem of ever-evolving network intrusion variants.
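
The summary reports a macro average of the F1-score, which weights every class equally regardless of how many samples it has. As a minimal sketch of that metric only (not the authors' evaluation code), the following computes it from two label lists. The function name `macro_f1` and the sample labels are hypothetical and are not taken from the paper or from CICIDS2017.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Return the unweighted mean of the per-class F1-scores."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but the sample was not p
            fn[t] += 1  # true class t was missed
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN)
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical labels for illustration only.
y_true = ["benign", "benign", "dos", "dos", "portscan", "portscan"]
y_pred = ["benign", "dos", "dos", "dos", "portscan", "benign"]
score = macro_f1(y_true, y_pred)
```

Because each class contributes equally, a rare intrusion class that is classified poorly lowers the macro average as much as a common one would, which is why the metric suits imbalanced intrusion datasets.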