When External Knowledge Does Not Aggregate in Named Entity Recognition

Date

2021-01-01

Authors

Privatto, Pedro Ivo Monteiro [UNESP]
Guilherme, Ivan Rizzo [UNESP]

Abstract

Across different areas of knowledge, textual data are important sources of information. Accordingly, Information Extraction methods have been developed to identify and structure the information present in textual documents. One such task is Named Entity Recognition (NER), which identifies Named Entities, such as Person and Place, in texts using techniques from Natural Language Processing and Machine Learning. Recent works have explored the use of external sources of knowledge to supply Machine Learning models with domain-specific information relevant to the NER task. This work aims to evaluate the aggregation of external knowledge, in the form of gazetteers and Knowledge Graphs, for the NER task. Our approach is composed of two steps: i) generation of embeddings, and ii) definition and training of the Machine Learning methods. The experiments were conducted on four English datasets, and the results show that the applied strategies for integrating external knowledge did not bring substantial gains to the models, as measured by the F1-score. In the experiments performed, the F1-score increased in 17 of the 32 cases where external knowledge was used, but in most cases the gain was less than 0.5% in F1-score. In some scenarios the aggregated external knowledge does not capture relevant content and is therefore not necessarily beneficial to the methodology.
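
As a rough illustration of what aggregating gazetteer knowledge into token representations can look like, the sketch below appends binary gazetteer flags to placeholder token embeddings. It is not the authors' implementation: the gazetteer contents, the `gazetteer_features` and `augment` helpers, and the 2-dimensional embeddings are all hypothetical and chosen only to keep the example small.

```python
from typing import Dict, List, Set

# Hypothetical toy gazetteer: entity type -> set of known lowercase names.
GAZETTEER: Dict[str, Set[str]] = {
    "PER": {"ada", "lovelace"},
    "LOC": {"london", "paris"},
}
TYPES = sorted(GAZETTEER)  # fixed feature order: ["LOC", "PER"]


def gazetteer_features(token: str) -> List[float]:
    """Return one binary flag per entity type: 1.0 if the token is listed."""
    lowered = token.lower()
    return [1.0 if lowered in GAZETTEER[t] else 0.0 for t in TYPES]


def augment(tokens: List[str], embeddings: List[List[float]]) -> List[List[float]]:
    """Concatenate each token embedding with its gazetteer flags."""
    if len(tokens) != len(embeddings):
        raise ValueError("tokens and embeddings must have the same length")
    return [emb + gazetteer_features(tok) for tok, emb in zip(tokens, embeddings)]


tokens = ["Ada", "visited", "London"]
embeddings = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]  # placeholder 2-d vectors
augmented = augment(tokens, embeddings)
```

Under these assumptions, each augmented vector carries the original embedding followed by one flag per entity type, and a downstream sequence tagger would consume the longer vectors. When a token is absent from the gazetteer, its flags are all zero and the extra dimensions add no information, which is consistent with the observation that external knowledge helps only when it captures relevant content.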

Keywords

Information extraction, Knowledge embeddings, Named entity recognition

How to cite

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), v. 13074 LNAI, p. 616-627.