IT Infrastructure Anomaly Detection and Failure Handling: A Systematic Literature Review Focusing on Datasets, Log Preprocessing, Machine & Deep Learning Approaches and Automated Tool

Nowadays, reliability assurance is crucial in components of IT infrastructures. Unavailability of any element or connection results in downtime and triggers monetary and performance casualties. Thus, reliability engineering has been a topic of investigation recently. The system logs become obligator...

Bibliographic Details
Main authors: Deepali Arun Bhanage, Ambika Vishal Pawar, Ketan Kotecha
Format: article
Language: EN
Published: IEEE 2021
Subjects: IT infrastructure monitoring; log analysis; failure detection; failure prediction; machine learning; deep learning
Online access: https://doaj.org/article/c71d523e624a4970a79610cc28617685
id oai:doaj.org-article:c71d523e624a4970a79610cc28617685
record_format dspace
spelling oai:doaj.org-article:c71d523e624a4970a79610cc28617685
2021-12-02T00:00:33Z
2169-3536
10.1109/ACCESS.2021.3128283
https://doaj.org/article/c71d523e624a4970a79610cc28617685
2021-01-01T00:00:00Z
https://ieeexplore.ieee.org/document/9615039/
https://doaj.org/toc/2169-3536
IEEE Access, Vol 9, Pp 156392-156421 (2021)
institution DOAJ
collection DOAJ
language EN
topic IT infrastructure monitoring
log analysis
failure detection
failure prediction
machine learning
deep learning
Electrical engineering. Electronics. Nuclear engineering
TK1-9971
description Reliability assurance is crucial for the components of IT infrastructures. The unavailability of any element or connection results in downtime and causes monetary and performance losses, so reliability engineering has recently become an active topic of investigation. System logs have become essential in IT infrastructure monitoring for failure detection, root cause analysis, and troubleshooting. This Systematic Literature Review (SLR) presents a detailed analysis of the qualitative and performance merits of the datasets used, the technical approaches applied, and the automated tools developed. The full-text review followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) methodology. 102 articles were extracted from Scopus, IEEE Xplore, WoS, and ACM for thorough examination, and a few supplementary articles were identified by snowballing. The study emphasizes the use of system logs for anomaly or failure detection and prediction, and it assesses the automated tools against several quality criteria. The SLR found that machine learning and deep learning-based classification approaches applied to selected features perform better than traditional rule-based and method-based approaches. The paper also discusses research gaps in the existing literature and suggests future research directions. The primary aim of this SLR is to identify and examine the tools and techniques proposed in the existing literature to mitigate IT infrastructure downtime. The survey is intended to help prospective researchers understand the pros and cons of current methods and choose a suitable approach for their own problems in the field of IT infrastructure.
format article
author Deepali Arun Bhanage
Ambika Vishal Pawar
Ketan Kotecha
title IT Infrastructure Anomaly Detection and Failure Handling: A Systematic Literature Review Focusing on Datasets, Log Preprocessing, Machine & Deep Learning Approaches and Automated Tool
publisher IEEE
publishDate 2021
url https://doaj.org/article/c71d523e624a4970a79610cc28617685