Machine Learning Methods in the Problem of Attribution of Publicistic Texts of the XIX Century
Main Authors: | , , , |
---|---|
Format: | article |
Language: | EN |
Published: | FRUCT, 2021 |
Subjects: | |
Online Access: | https://doaj.org/article/df087c5bfdd8451da9e6aa5851c9594a |
Summary: | This work considers linguostatistical methods used for the attribution (establishing authorship) of publicistic articles of the XIX century. At that time, F. M. Dostoevsky edited and headed three journals, "Time", "Epoch" and "Citizen", which contain about 500 unattributed texts. Samples from the texts were compiled, their characteristics were studied, and the classification results obtained with various machine learning methods (decision trees, recurrent networks, parallel recurrent networks, a transformer model) were compared. Text input, processing, and the calculation of linguostatistical parameters were carried out using an updated version of the SMALT information system. |
---|
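
As a rough illustration of what "linguostatistical parameters" of a text sample might look like, the following minimal Python sketch computes a few common stylometric measures. It is not the SMALT system or the authors' feature set: the function name `linguostatistical_features`, the choice of measures, and the English `FUNCTION_WORDS` list are all assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical function-word list chosen for illustration only;
# the parameter set actually used in the paper is not given in this record.
FUNCTION_WORDS = ("and", "but", "not", "that", "in", "on")


def linguostatistical_features(text):
    """Return a dict of simple stylometric parameters for one text sample."""
    # Split on sentence-final punctuation and drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # \w+ is Unicode-aware in Python 3, so Cyrillic text is tokenized too.
    words = re.findall(r"\w+", text.lower())
    counts = Counter(words)
    n_words = len(words) or 1  # avoid division by zero on empty input
    features = {
        "mean_word_length": sum(len(w) for w in words) / n_words,
        "mean_sentence_length": len(words) / (len(sentences) or 1),
    }
    # Relative frequency of each function word in the sample.
    for fw in FUNCTION_WORDS:
        features["freq_" + fw] = counts[fw] / n_words
    return features


sample = "He came in. She did not see that he came, and he left."
features = linguostatistical_features(sample)
```

Feature vectors of this kind, computed per text sample, could then be passed to any of the classifiers named in the summary (for example a decision tree) to compare candidate authors.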