Korean Historical Documents Analysis with Improved Dynamic Word Embedding

Authors: JeongA Wi, Kyohoon Jin, Kyeongpil Kang, Young-bin Kim
Language: English
Year of publication: 2020
Subject:
Word embedding
010504 meteorology & atmospheric sciences
Machine translation
Computer science
named entity recognition
computer.software_genre
01 natural sciences
deep-learning
lcsh:Technology
Task (project management)
lcsh:Chemistry
Named-entity recognition
0103 physical sciences
General Materials Science
010303 astronomy & astrophysics
Instrumentation
lcsh:QH301-705.5
0105 earth and related environmental sciences
Fluid Flow and Transfer Processes
dynamic word embedding
business.industry
lcsh:T
Process Chemistry and Technology
Deep learning
General Engineering
historical documents
Thesaurus
neural machine translation
lcsh:QC1-999
Computer Science Applications
Annals
lcsh:Biology (General)
lcsh:QD1-999
lcsh:TA1-2040
Key (cryptography)
transformer
Artificial intelligence
business
lcsh:Engineering (General). Civil engineering (General)
computer
Natural language processing
lcsh:Physics
Source: Applied Sciences, Vol 10, Iss 21, p 7939 (2020)
Applied Sciences
Volume 10
Issue 21
ISSN: 2076-3417
Description: Historical documents refer to records or books that provide textual information about the thoughts and consciousness of past civilisations, and therefore, they have historical significance. These documents are used as key sources for historical studies as they provide information over several historical periods. Many studies have analysed various historical documents using deep learning; however, studies that employ changes in information over time are lacking. In this study, we propose a deep-learning approach using improved dynamic word embedding to determine the characteristics of 27 kings mentioned in the Annals of the Joseon Dynasty, which contains a record of 500 years. The characteristics of words for each king were quantitated based on dynamic word embedding; further, this information was applied to named entity recognition and neural machine translation. In experiments, we confirmed that the method we proposed showed better performance than other methods. In the named entity recognition task, the F1-score was 0.68; in the neural machine translation task, the BLEU4 score was 0.34. We demonstrated that this approach can be used to extract information about diplomatic relationships with neighbouring countries and the economic conditions of the Joseon Dynasty.
Database: OpenAIRE