How to Effectively Collect and Process Network Data for Intrusion Detection?

The number of security breaches in the cyberspace is on the rise. This threat is met with intensive work in the intrusion detection research community. To keep the defensive mechanisms up to date and relevant, realistic network traffic datasets are needed. The use of flow-based data for machine-lear...

Bibliographic Details
Main Authors:	Mikołaj Komisarek, Marek Pawlicki, Rafał Kozik, Witold Hołubowicz, Michał Choraś
Format:	article
Language:	EN
Published:	MDPI AG 2021
Subjects:	NetFlow; network intrusion detection; network behavior analysis; data quality; feature selection; Science (Q); Astrophysics (QB460-466); Physics (QC1-999)
Online Access:	https://doaj.org/article/e9dc1dc7b51440d9be1d59c2133f58f4

id	oai:doaj.org-article:e9dc1dc7b51440d9be1d59c2133f58f4
record_format	dspace
spelling	oai:doaj.org-article:e9dc1dc7b51440d9be1d59c2133f58f4 | 2021-11-25T17:30:39Z | How to Effectively Collect and Process Network Data for Intrusion Detection? | 10.3390/e23111532 | 1099-4300 | https://doaj.org/article/e9dc1dc7b51440d9be1d59c2133f58f4 | 2021-11-01T00:00:00Z | https://www.mdpi.com/1099-4300/23/11/1532 | https://doaj.org/toc/1099-4300 | The number of security breaches in the cyberspace is on the rise. This threat is met with intensive work in the intrusion detection research community. To keep the defensive mechanisms up to date and relevant, realistic network traffic datasets are needed. The use of flow-based data for machine-learning-based network intrusion detection is a promising direction for intrusion detection systems. However, many contemporary benchmark datasets do not contain features that are usable in the wild. The main contribution of this work is to cover the research gap related to identifying and investigating valuable features in the NetFlow schema that allow for effective, machine-learning-based network intrusion detection in the real world. To achieve this goal, several feature selection techniques have been applied on five flow-based network intrusion detection datasets, establishing an informative flow-based feature set. The authors’ experience with the deployment of this kind of system shows that to close the research-to-market gap, and to perform actual real-world application of machine-learning-based intrusion detection, a set of labeled data from the end-user has to be collected. This research aims at establishing the appropriate, minimal amount of data that is sufficient to effectively train machine learning algorithms in intrusion detection. The results show that a set of 10 features and a small amount of data is enough for the final model to perform very well. | Mikołaj Komisarek | Marek Pawlicki | Rafał Kozik | Witold Hołubowicz | Michał Choraś | MDPI AG | article | NetFlow | network intrusion detection | network behavior analysis | data quality | feature selection | Science | Q | Astrophysics | QB460-466 | Physics | QC1-999 | EN | Entropy, Vol 23, Iss 1532, p 1532 (2021)
institution	DOAJ
collection	DOAJ
language	EN
topic	NetFlow network intrusion detection network behavior analysis data quality feature selection Science Q Astrophysics QB460-466 Physics QC1-999
spellingShingle	NetFlow network intrusion detection network behavior analysis data quality feature selection Science Q Astrophysics QB460-466 Physics QC1-999 Mikołaj Komisarek Marek Pawlicki Rafał Kozik Witold Hołubowicz Michał Choraś How to Effectively Collect and Process Network Data for Intrusion Detection?
description	The number of security breaches in the cyberspace is on the rise. This threat is met with intensive work in the intrusion detection research community. To keep the defensive mechanisms up to date and relevant, realistic network traffic datasets are needed. The use of flow-based data for machine-learning-based network intrusion detection is a promising direction for intrusion detection systems. However, many contemporary benchmark datasets do not contain features that are usable in the wild. The main contribution of this work is to cover the research gap related to identifying and investigating valuable features in the NetFlow schema that allow for effective, machine-learning-based network intrusion detection in the real world. To achieve this goal, several feature selection techniques have been applied on five flow-based network intrusion detection datasets, establishing an informative flow-based feature set. The authors’ experience with the deployment of this kind of system shows that to close the research-to-market gap, and to perform actual real-world application of machine-learning-based intrusion detection, a set of labeled data from the end-user has to be collected. This research aims at establishing the appropriate, minimal amount of data that is sufficient to effectively train machine learning algorithms in intrusion detection. The results show that a set of 10 features and a small amount of data is enough for the final model to perform very well.
format	article
author	Mikołaj Komisarek Marek Pawlicki Rafał Kozik Witold Hołubowicz Michał Choraś
author_facet	Mikołaj Komisarek Marek Pawlicki Rafał Kozik Witold Hołubowicz Michał Choraś
author_sort	Mikołaj Komisarek
title	How to Effectively Collect and Process Network Data for Intrusion Detection?
title_short	How to Effectively Collect and Process Network Data for Intrusion Detection?
title_full	How to Effectively Collect and Process Network Data for Intrusion Detection?
title_fullStr	How to Effectively Collect and Process Network Data for Intrusion Detection?
title_full_unstemmed	How to Effectively Collect and Process Network Data for Intrusion Detection?
title_sort	how to effectively collect and process network data for intrusion detection?
publisher	MDPI AG
publishDate	2021
url	https://doaj.org/article/e9dc1dc7b51440d9be1d59c2133f58f4
work_keys_str_mv	AT mikołajkomisarek howtoeffectivelycollectandprocessnetworkdataforintrusiondetection AT marekpawlicki howtoeffectivelycollectandprocessnetworkdataforintrusiondetection AT rafałkozik howtoeffectivelycollectandprocessnetworkdataforintrusiondetection AT witoldhołubowicz howtoeffectivelycollectandprocessnetworkdataforintrusiondetection AT michałchoras howtoeffectivelycollectandprocessnetworkdataforintrusiondetection
_version_	1718412319260672000

Similar Items