Document similarity for error prediction

Networking equipment is in ever-increasing use. These devices log their operations; however, errors can occur that cause a device to restart, and different errors may be preceded by different log patterns. Our main goal is to predict the...

Bibliographic Details
Main authors: Péter Marjai, Péter Lehotay-Kéry, Attila Kiss
Format: article
Language: EN
Published: Taylor & Francis Group 2021
Online access: https://doaj.org/article/fb0325fcab3947a4ac4a7c7934c0c359
id oai:doaj.org-article:fb0325fcab3947a4ac4a7c7934c0c359
record_format dspace
datestamp 2021-11-17T14:22:00Z
issn 2475-1839
2475-1847
doi 10.1080/24751839.2021.1893496
date 2021-10-01T00:00:00Z
relation http://dx.doi.org/10.1080/24751839.2021.1893496
https://doaj.org/toc/2475-1839
https://doaj.org/toc/2475-1847
source Journal of Information and Telecommunication, Vol 5, Iss 4, Pp 407-420 (2021)
institution DOAJ
collection DOAJ
language EN
topic document similarity
error prediction
spark
Telecommunication
TK5101-6720
Information technology
T58.5-58.64
description Networking equipment is in ever-increasing use. These devices log their operations; however, errors can occur that cause a device to restart, and different errors may be preceded by different log patterns. Our main goal is to predict the upcoming error from the log lines of the current file. To achieve this, we use document similarity, a key concept of information retrieval that indicates how alike (or different) documents are. In this paper, we study the effectiveness of prediction based on the cosine similarity, Jaccard similarity, and Euclidean distance of the rows preceding restarts. We combine these measures with features such as TF-IDF, Doc2Vec, LSH, and others. Since networking devices produce large volumes of log files, we use Spark for big data computing.
format article
author Péter Marjai
Péter Lehotay-Kéry
Attila Kiss
author_sort Péter Marjai
title Document similarity for error prediction
publisher Taylor & Francis Group
publishDate 2021
url https://doaj.org/article/fb0325fcab3947a4ac4a7c7934c0c359