Dataset Generation for Development of Multi-Node Cyber Threat Detection Systems

This paper presents a new approach to generate datasets for cyber threat research in a multi-node system. For this purpose, the proof-of-concept of such a system is implemented. The system will be used to collect unique datasets with examples of information hiding techniques. These techniques are no...

Bibliographic Details
Main authors:	Jędrzej Bieniasz, Krzysztof Szczypiorski
Format:	article
Language:	EN
Published:	MDPI AG 2021
Subjects:	cybersecurity; data science; machine learning; datasets; cyber threats modeling; multi-agent systems; Electronics; TK7800-8360
Online access:	https://doaj.org/article/c2e10f9808da4a71a167022b5299b0bb

id	oai:doaj.org-article:c2e10f9808da4a71a167022b5299b0bb
record_format	dspace
spelling	oai:doaj.org-article:c2e10f9808da4a71a167022b5299b0bb; 2021-11-11T15:42:17Z; Dataset Generation for Development of Multi-Node Cyber Threat Detection Systems; 10.3390/electronics10212711; 2079-9292; https://doaj.org/article/c2e10f9808da4a71a167022b5299b0bb; 2021-11-01T00:00:00Z; https://www.mdpi.com/2079-9292/10/21/2711; https://doaj.org/toc/2079-9292; Jędrzej Bieniasz; Krzysztof Szczypiorski; MDPI AG; article; cybersecurity; data science; machine learning; datasets; cyber threats modeling; multi-agent systems; Electronics; TK7800-8360; EN; Electronics, Vol 10, Iss 2711, p 2711 (2021)
institution	DOAJ
collection	DOAJ
language	EN
topic	cybersecurity; data science; machine learning; datasets; cyber threats modeling; multi-agent systems; Electronics; TK7800-8360
description	This paper presents a new approach to generate datasets for cyber threat research in a multi-node system. For this purpose, the proof-of-concept of such a system is implemented. The system will be used to collect unique datasets with examples of information hiding techniques. These techniques are not present in publicly available cyber threat detection datasets, while the cyber threats that use them represent an emerging cyber defense challenge worldwide. The network data were collected thanks to the development of a dedicated application that automatically generates random network configurations and runs scenarios of information hiding techniques. The generated datasets were used in the data-driven research workflow for cyber threat detection, including the generation of data representations (network flows), feature selection based on correlations, data augmentation of training datasets, and preparation of machine learning classifiers based on Random Forest and Multilayer Perceptron architectures. The presented results show the usefulness and correctness of the design process to detect information hiding techniques. The challenges and research directions to detect cyber deception methods are discussed in general in the paper.
format	article
author	Jędrzej Bieniasz; Krzysztof Szczypiorski
author_facet	Jędrzej Bieniasz; Krzysztof Szczypiorski
author_sort	Jędrzej Bieniasz
title	Dataset Generation for Development of Multi-Node Cyber Threat Detection Systems
title_sort	dataset generation for development of multi-node cyber threat detection systems
publisher	MDPI AG
publishDate	2021
url	https://doaj.org/article/c2e10f9808da4a71a167022b5299b0bb
