Machine Learning Methods in the Problem of Attribution of Publicistic Texts of the XIX Century

We consider in this work linguostatistical methods that were used for attribution (establishing authorship) of publicistic articles of the XIX century. At that time, F. M. Dostoevsky edited and headed three journals: "Time", "Epoch" and "Citizen...

Bibliographic Details
Main authors: Aleksandr Rogov, Nikolai Moskin, Kirill Kulakov, Roman Abramov
Format: article
Language: EN
Published: FRUCT 2021
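Subjects: text attribution, f. m. dostoevsky, machine learning, decision tree, recurrent network, transformer model, smalt information system, Telecommunication, TK5101-6720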
Online access: https://doaj.org/article/df087c5bfdd8451da9e6aa5851c9594a
id oai:doaj.org-article:df087c5bfdd8451da9e6aa5851c9594a
record_format dspace
spelling oai:doaj.org-article:df087c5bfdd8451da9e6aa5851c9594a
2021-11-20T15:59:33Z
2305-7254
2343-0737
10.23919/FRUCT53335.2021.9599961
https://doaj.org/article/df087c5bfdd8451da9e6aa5851c9594a
2021-10-01T00:00:00Z
https://www.fruct.org/publications/fruct30/files/Rog.pdf
https://doaj.org/toc/2305-7254
https://doaj.org/toc/2343-0737
Proceedings of the XXth Conference of Open Innovations Association FRUCT, Vol 30, Iss 1, Pp 223-229 (2021)
institution DOAJ
collection DOAJ
language EN
topic text attribution
f. m. dostoevsky
machine learning
decision tree
recurrent network
transformer model
smalt information system
Telecommunication
TK5101-6720
description We consider in this work linguostatistical methods that were used for attribution (establishing authorship) of publicistic articles of the XIX century. At that time, F. M. Dostoevsky edited and headed three journals: "Time", "Epoch" and "Citizen", where there are about 500 unattributed texts. Samples from texts were compiled, their characteristics were studied, and a comparative analysis of the classification results based on various machine learning methods (decision trees, recurrent networks, parallel recurrent networks, transformer model) was carried out. The input of texts, their processing and the calculation of linguostatistical parameters were carried out using an updated version of the SMALT information system.
format article
author Aleksandr Rogov
Nikolai Moskin
Kirill Kulakov
Roman Abramov
title Machine Learning Methods in the Problem of Attribution of Publicistic Texts of the XIX Century
publisher FRUCT
publishDate 2021
url https://doaj.org/article/df087c5bfdd8451da9e6aa5851c9594a