IoT Dataset Validation Using Machine Learning Techniques for Traffic Anomaly Detection

With advancements in engineering and science, the application of smart systems is increasing, generating a faster growth of the IoT network traffic. The limitations due to IoT restricted power and computing devices also raise concerns about security vulnerabilities. Machine learning-based techniques...

Bibliographic Details
Main Authors: Laura Vigoya, Diego Fernandez, Victor Carneiro, Francisco J. Nóvoa
Format: article
Language: EN
Published: MDPI AG 2021
Subjects:
IoT
Online Access: https://doaj.org/article/2af890f3c1fe49feb1715305b98ff4c3
id oai:doaj.org-article:2af890f3c1fe49feb1715305b98ff4c3
record_format dspace
spelling oai:doaj.org-article:2af890f3c1fe49feb1715305b98ff4c3
2021-11-25T17:25:22Z
IoT Dataset Validation Using Machine Learning Techniques for Traffic Anomaly Detection
10.3390/electronics10222857
2079-9292
https://doaj.org/article/2af890f3c1fe49feb1715305b98ff4c3
2021-11-01T00:00:00Z
https://www.mdpi.com/2079-9292/10/22/2857
https://doaj.org/toc/2079-9292
With advancements in engineering and science, the application of smart systems is increasing, generating a faster growth of the IoT network traffic. The limitations due to IoT restricted power and computing devices also raise concerns about security vulnerabilities. Machine learning-based techniques have recently gained credibility in a successful application for the detection of network anomalies, including IoT networks. However, machine learning techniques cannot work without representative data. Given the scarcity of IoT datasets, the DAD emerged as an instrument for knowing the behavior of dedicated IoT-MQTT networks. This paper aims to validate the DAD dataset by applying Logistic Regression, Naive Bayes, Random Forest, AdaBoost, and Support Vector Machine to detect traffic anomalies in IoT. To obtain the best results, techniques for handling unbalanced data, feature selection, and grid search for hyperparameter optimization have been used. The experimental results show that the proposed dataset can achieve a high detection rate in all the experiments, providing the best mean accuracy of 0.99 for the tree-based models, with a low false-positive rate, ensuring effective anomaly detection.
Laura Vigoya
Diego Fernandez
Victor Carneiro
Francisco J. Nóvoa
MDPI AG
article
IoT
sensors
dataset validation
machine learning
intrusion detection systems
analysis
Electronics
TK7800-8360
EN
Electronics, Vol 10, Iss 2857, p 2857 (2021)
institution DOAJ
collection DOAJ
language EN
topic IoT
sensors
dataset validation
machine learning
intrusion detection systems
analysis
Electronics
TK7800-8360
description With advancements in engineering and science, the application of smart systems is increasing, generating a faster growth of the IoT network traffic. The limitations due to IoT restricted power and computing devices also raise concerns about security vulnerabilities. Machine learning-based techniques have recently gained credibility in a successful application for the detection of network anomalies, including IoT networks. However, machine learning techniques cannot work without representative data. Given the scarcity of IoT datasets, the DAD emerged as an instrument for knowing the behavior of dedicated IoT-MQTT networks. This paper aims to validate the DAD dataset by applying Logistic Regression, Naive Bayes, Random Forest, AdaBoost, and Support Vector Machine to detect traffic anomalies in IoT. To obtain the best results, techniques for handling unbalanced data, feature selection, and grid search for hyperparameter optimization have been used. The experimental results show that the proposed dataset can achieve a high detection rate in all the experiments, providing the best mean accuracy of 0.99 for the tree-based models, with a low false-positive rate, ensuring effective anomaly detection.
format article
author Laura Vigoya
Diego Fernandez
Victor Carneiro
Francisco J. Nóvoa
title IoT Dataset Validation Using Machine Learning Techniques for Traffic Anomaly Detection
publisher MDPI AG
publishDate 2021
url https://doaj.org/article/2af890f3c1fe49feb1715305b98ff4c3