GNER: A Generative Model for Geological Named Entity Recognition Without Labeled Data Using Deep Learning
Abstract A variety of detailed data about geological topics and geoscience knowledge are buried in the geoscience literature and rarely used. Named entity recognition (NER) provides both opportunities and challenges to leverage this wealth of data in the geoscience literature for data analysis and f...
Main Authors: | Qinjun Qiu, Zhong Xie, Liang Wu, Liufeng Tao |
---|---|
Format: | article |
Language: | EN |
Published: | American Geophysical Union (AGU), 2019 |
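Subjects: | natural language processing; named entity recognition; geoscience domain; unsupervised learning; Astronomy (QB1-991); Geology (QE1-996.5) |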
Online Access: | https://doaj.org/article/8b19d8b977604258a40f352ff20ac254 |
id |
oai:doaj.org-article:8b19d8b977604258a40f352ff20ac254 |
---|---|
record_format |
dspace |
spelling |
oai:doaj.org-article:8b19d8b977604258a40f352ff20ac254; 2021-11-30T22:55:32Z; GNER: A Generative Model for Geological Named Entity Recognition Without Labeled Data Using Deep Learning; ISSN 2333-5084; DOI 10.1029/2019EA000610; https://doaj.org/article/8b19d8b977604258a40f352ff20ac254; 2019-06-01T00:00:00Z; https://doi.org/10.1029/2019EA000610; https://doaj.org/toc/2333-5084; Abstract A variety of detailed data about geological topics and geoscience knowledge are buried in the geoscience literature and rarely used. Named entity recognition (NER) provides both opportunities and challenges to leverage this wealth of data in the geoscience literature for data analysis and further information extraction. Existing NER models and techniques are mainly based on rule‐based and supervised approaches, and developing such systems requires a costly manual effort. In this paper, we first design a generic stepwise framework for domain‐specific NER. Following this framework, domain‐specific entities and domain‐general words are collected and selected as seed terms. Normalization and grouping processes are then applied to these seed terms for further analysis. A random extraction algorithm based on a unigram language model is used to generate a large‐scale training data set consisting of probabilistically labeled pseudosentences. Each generated sentence is then used as input to the self‐training and learning algorithm. Experimental results on two constructed data sets demonstrate that the proposed model effectively recognizes and identifies geological named entities.; Qinjun Qiu; Zhong Xie; Liang Wu; Liufeng Tao; American Geophysical Union (AGU); article; natural language processing; named entity recognition; geoscience domain; unsupervised learning; Astronomy; QB1-991; Geology; QE1-996.5; EN; Earth and Space Science, Vol 6, Iss 6, Pp 931-946 (2019) |
institution |
DOAJ |
collection |
DOAJ |
language |
EN |
topic |
natural language processing; named entity recognition; geoscience domain; unsupervised learning; Astronomy (QB1-991); Geology (QE1-996.5) |
description |
Abstract A variety of detailed data about geological topics and geoscience knowledge are buried in the geoscience literature and rarely used. Named entity recognition (NER) provides both opportunities and challenges to leverage this wealth of data in the geoscience literature for data analysis and further information extraction. Existing NER models and techniques are mainly based on rule‐based and supervised approaches, and developing such systems requires a costly manual effort. In this paper, we first design a generic stepwise framework for domain‐specific NER. Following this framework, domain‐specific entities and domain‐general words are collected and selected as seed terms. Normalization and grouping processes are then applied to these seed terms for further analysis. A random extraction algorithm based on a unigram language model is used to generate a large‐scale training data set consisting of probabilistically labeled pseudosentences. Each generated sentence is then used as input to the self‐training and learning algorithm. Experimental results on two constructed data sets demonstrate that the proposed model effectively recognizes and identifies geological named entities. |
format |
article |
author |
Qinjun Qiu; Zhong Xie; Liang Wu; Liufeng Tao |
author_sort |
Qinjun Qiu |
title |
GNER: A Generative Model for Geological Named Entity Recognition Without Labeled Data Using Deep Learning |
publisher |
American Geophysical Union (AGU) |
publishDate |
2019 |
url |
https://doaj.org/article/8b19d8b977604258a40f352ff20ac254 |
_version_ |
1718406214418694144 |
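
The abstract in this record describes a random extraction step, based on a unigram language model, that turns seed terms into a large set of labeled pseudosentences. The sketch below illustrates that general idea only and is not the authors' implementation: the seed terms, the label names (`ROCK`, `AGE`, `O`), the sentence lengths, and both function names are hypothetical, and each token simply carries the label of the seed term it was drawn from.

```python
import random
from collections import Counter


def build_unigram_model(seed_terms):
    """Estimate a unigram distribution over (term, label) pairs.

    `seed_terms` is a list of (term, label) occurrences; repeated
    occurrences raise the probability of that pair.
    """
    counts = Counter(seed_terms)
    total = sum(counts.values())
    items = sorted(counts)  # fixed order so results are reproducible
    probs = [counts[item] / total for item in items]
    return items, probs


def generate_pseudosentence(items, probs, length, rng):
    """Draw `length` (term, label) pairs independently from the unigram model."""
    return rng.choices(items, weights=probs, k=length)


# Hypothetical seed terms: domain-specific entities plus domain-general words.
SEED_TERMS = [
    ("granite", "ROCK"), ("granite", "ROCK"), ("basalt", "ROCK"),
    ("Jurassic", "AGE"),
    ("the", "O"), ("the", "O"), ("the", "O"), ("in", "O"), ("of", "O"),
]

rng = random.Random(0)  # seeded for repeatability
items, probs = build_unigram_model(SEED_TERMS)

# Build a tiny pseudo-corpus; sentence lengths are chosen arbitrarily here.
corpus = [
    generate_pseudosentence(items, probs, rng.randint(3, 8), rng)
    for _ in range(5)
]

for sentence in corpus:
    print(" ".join(f"{term}/{label}" for term, label in sentence))
```

Because a unigram model samples each token independently, the generated sequences are not grammatical sentences; in this sketch they only pair tokens with labels so that a downstream tagger would have something to train on.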