Machine Learning Methods in the Problem of Attribution of Publicistic Texts of the XIX Century

Bibliographic Details
Main authors: Aleksandr Rogov, Nikolai Moskin, Kirill Kulakov, Roman Abramov
Format: article
Language: EN
Published: FRUCT 2021
Online access: https://doaj.org/article/df087c5bfdd8451da9e6aa5851c9594a
Description
Summary: This work considers linguostatistical methods used for the attribution (establishing authorship) of publicistic articles of the XIX century. At that time, F. M. Dostoevsky edited and headed three journals, "Time", "Epoch" and "Citizen", which contain about 500 unattributed texts. Samples were compiled from the texts, their characteristics were studied, and the classification results obtained with various machine learning methods (decision trees, recurrent networks, parallel recurrent networks, a transformer model) were compared. Text input, text processing and the calculation of linguostatistical parameters were carried out using an updated version of the SMALT information system.