Dataset Generation for Development of Multi-Node Cyber Threat Detection Systems
This paper presents a new approach to generate datasets for cyber threat research in a multi-node system. For this purpose, the proof-of-concept of such a system is implemented. The system will be used to collect unique datasets with examples of information hiding techniques. These techniques are no...
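Main Authors: | Jędrzej Bieniasz, Krzysztof Szczypiorski |
---|---|
Format: | article |
Language: | EN |
Published: | MDPI AG, 2021 |
Subjects: | cybersecurity, data science, machine learning, datasets, cyber threats modeling, multi-agent systems |
Online Access: | https://doaj.org/article/c2e10f9808da4a71a167022b5299b0bb |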
id |
oai:doaj.org-article:c2e10f9808da4a71a167022b5299b0bb |
---|---|
record_format |
dspace |
spelling |
oai:doaj.org-article:c2e10f9808da4a71a167022b5299b0bb; 2021-11-11T15:42:17Z; Dataset Generation for Development of Multi-Node Cyber Threat Detection Systems; 10.3390/electronics10212711; 2079-9292; https://doaj.org/article/c2e10f9808da4a71a167022b5299b0bb; 2021-11-01T00:00:00Z; https://www.mdpi.com/2079-9292/10/21/2711; https://doaj.org/toc/2079-9292; This paper presents a new approach to generate datasets for cyber threat research in a multi-node system. For this purpose, the proof-of-concept of such a system is implemented. The system will be used to collect unique datasets with examples of information hiding techniques. These techniques are not present in publicly available cyber threat detection datasets, while the cyber threats that use them represent an emerging cyber defense challenge worldwide. The network data were collected thanks to the development of a dedicated application that automatically generates random network configurations and runs scenarios of information hiding techniques. The generated datasets were used in the data-driven research workflow for cyber threat detection, including the generation of data representations (network flows), feature selection based on correlations, data augmentation of training datasets, and preparation of machine learning classifiers based on Random Forest and Multilayer Perceptron architectures. The presented results show the usefulness and correctness of the design process to detect information hiding techniques. The challenges and research directions to detect cyber deception methods are discussed in general in the paper.; Jędrzej Bieniasz; Krzysztof Szczypiorski; MDPI AG; article; cybersecurity; data science; machine learning; datasets; cyber threats modeling; multi-agent systems; Electronics; TK7800-8360; EN; Electronics, Vol 10, Iss 2711, p 2711 (2021) |
institution |
DOAJ |
collection |
DOAJ |
language |
EN |
topic |
cybersecurity data science machine learning datasets cyber threats modeling multi-agent systems Electronics TK7800-8360 |
spellingShingle |
cybersecurity data science machine learning datasets cyber threats modeling multi-agent systems Electronics TK7800-8360 Jędrzej Bieniasz Krzysztof Szczypiorski Dataset Generation for Development of Multi-Node Cyber Threat Detection Systems |
description |
This paper presents a new approach to generate datasets for cyber threat research in a multi-node system. For this purpose, the proof-of-concept of such a system is implemented. The system will be used to collect unique datasets with examples of information hiding techniques. These techniques are not present in publicly available cyber threat detection datasets, while the cyber threats that use them represent an emerging cyber defense challenge worldwide. The network data were collected thanks to the development of a dedicated application that automatically generates random network configurations and runs scenarios of information hiding techniques. The generated datasets were used in the data-driven research workflow for cyber threat detection, including the generation of data representations (network flows), feature selection based on correlations, data augmentation of training datasets, and preparation of machine learning classifiers based on Random Forest and Multilayer Perceptron architectures. The presented results show the usefulness and correctness of the design process to detect information hiding techniques. The challenges and research directions to detect cyber deception methods are discussed in general in the paper. |
format |
article |
author |
Jędrzej Bieniasz Krzysztof Szczypiorski |
author_facet |
Jędrzej Bieniasz Krzysztof Szczypiorski |
author_sort |
Jędrzej Bieniasz |
title |
Dataset Generation for Development of Multi-Node Cyber Threat Detection Systems |
title_short |
Dataset Generation for Development of Multi-Node Cyber Threat Detection Systems |
title_full |
Dataset Generation for Development of Multi-Node Cyber Threat Detection Systems |
title_fullStr |
Dataset Generation for Development of Multi-Node Cyber Threat Detection Systems |
title_full_unstemmed |
Dataset Generation for Development of Multi-Node Cyber Threat Detection Systems |
title_sort |
dataset generation for development of multi-node cyber threat detection systems |
publisher |
MDPI AG |
publishDate |
2021 |
url |
https://doaj.org/article/c2e10f9808da4a71a167022b5299b0bb |
work_keys_str_mv |
AT jedrzejbieniasz datasetgenerationfordevelopmentofmultinodecyberthreatdetectionsystems AT krzysztofszczypiorski datasetgenerationfordevelopmentofmultinodecyberthreatdetectionsystems |
_version_ |
1718434114726526976 |
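
The description field above mentions "feature selection based on correlations" applied to network-flow representations, but this record does not say how the authors performed it. As a minimal sketch of one common reading of that step, the Python below drops any feature whose absolute Pearson correlation with an already kept feature exceeds a threshold. The feature names (`duration`, `packets`, `bytes_total`), the toy values, and the 0.9 threshold are all hypothetical and are not taken from the paper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear relationship
    return cov / math.sqrt(var_x * var_y)


def select_features(columns, threshold=0.9):
    """Greedy filter: keep a feature only if its |r| with every
    previously kept feature is at or below the threshold."""
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical per-flow features (toy values, not data from the paper).
packets = [1, 2, 3, 4, 5, 6, 7, 8]
duration = [2, 1, 2, 1, 2, 1, 2, 1]
bytes_total = [60 * p for p in packets]  # exact multiple of packets, so redundant

flows = {"duration": duration, "packets": packets, "bytes_total": bytes_total}
selected = select_features(flows)
print(selected)
```

Here `bytes_total` is filtered out because it is a linear function of `packets`, while `duration` and `packets` are only weakly correlated and both survive. Ranking features by their correlation with the class label would be an equally plausible interpretation of the description; the record alone does not settle which one the authors used.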