Automatic Taxonomy Classification by Pretrained Language Model

In recent years, automatic ontology generation has received significant attention in information science as a means of systemizing vast amounts of online data. As our initial attempt of ontology generation with a neural network, we proposed a recurrent neural network-based method. However, updating...

Bibliographic Details
Main Authors: Ayato Kuwana, Atsushi Oba, Ranto Sawai, Incheon Paik
Format: article
Language: EN
Published: MDPI AG 2021
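Subjects: ontology; automation; natural language processing (NLP); pretrained model; Electronics; TK7800-8360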
Online Access: https://doaj.org/article/21573847536445778b0375f914c28735
id oai:doaj.org-article:21573847536445778b0375f914c28735
record_format dspace
spelling oai:doaj.org-article:21573847536445778b0375f914c28735
2021-11-11T15:39:25Z
Automatic Taxonomy Classification by Pretrained Language Model
10.3390/electronics10212656
2079-9292
https://doaj.org/article/21573847536445778b0375f914c28735
2021-10-01T00:00:00Z
https://www.mdpi.com/2079-9292/10/21/2656
https://doaj.org/toc/2079-9292
Ayato Kuwana
Atsushi Oba
Ranto Sawai
Incheon Paik
MDPI AG
article
ontology
automation
natural language processing (NLP)
pretrained model
Electronics
TK7800-8360
EN
Electronics, Vol 10, Iss 2656, p 2656 (2021)
institution DOAJ
collection DOAJ
language EN
topic ontology
automation
natural language processing (NLP)
pretrained model
Electronics
TK7800-8360
description In recent years, automatic ontology generation has received significant attention in information science as a means of systemizing vast amounts of online data. As our initial attempt of ontology generation with a neural network, we proposed a recurrent neural network-based method. However, updating the architecture is possible because of the development in natural language processing (NLP). By contrast, the transfer learning of language models trained by a large, unlabeled corpus has yielded a breakthrough in NLP. Inspired by these achievements, we propose a novel workflow for ontology generation comprising two-stage learning. Our results showed that our best method improved accuracy by over 12.5%. As an application example, we applied our model to the Stanford Question Answering Dataset to show ontology generation in a real field. The results showed that our model can generate a good ontology, with some exceptions in the real field, indicating future research directions to improve the quality.
format article
author Ayato Kuwana
Atsushi Oba
Ranto Sawai
Incheon Paik
title Automatic Taxonomy Classification by Pretrained Language Model
publisher MDPI AG
publishDate 2021
url https://doaj.org/article/21573847536445778b0375f914c28735