A Data Driven Approach for Raw Material Terminology

Authors: Ivan Babić, Ranka Stanković, Ljiljana Kolonja, Mihailo Škorić, Aleksandra Tomašević, Olivera Kitanović
Language: English
Year of publication: 2021
Subjects:
Computer science
leksički podaci
02 engineering and technology
mining
Collocation extraction
Lexicon
computer.software_genre
otvoreni povezani podaci
01 natural sciences
lcsh:Technology
law.invention
Terminology
lcsh:Chemistry
Resource (project management)
law
terminološka aplikacija
terminology
0202 electrical engineering, electronic engineering, information engineering
General Materials Science
Instrumentation
lcsh:QH301-705.5
Digitization
Fluid Flow and Transfer Processes
General Engineering
rudarstvo
lcsh:QC1-999
Computer Science Applications
korpusi
raw material
rečnik
020201 artificial intelligence & image processing
Hypertext
Natural language processing
linguistic linked open data
Domain (software engineering)
mobilna aplikacija
0101 mathematics
Structure (mathematical logic)
business.industry
sirovine
digitizacija
lexical data
lcsh:T
Process Chemistry and Technology
010102 general mathematics
mobile application
lcsh:Biology (General)
lcsh:QD1-999
lcsh:TA1-2040
digitization
terminology application
Artificial intelligence
business
lcsh:Engineering (General). Civil engineering (General)
computer
corpus data
lcsh:Physics
terminologija
dictionary
Source: Applied Sciences, Vol 11, Iss 7, p 2892 (2021)
Applied Sciences
Volume 11
Issue 7
ISSN: 2076-3417
Abstract: The research presented in this paper aims to create a bilingual (sr-en), easily searchable, hypertext, born-digital, corpus-based terminological database of raw material terminology for dictionary production. The approach links dictionaries related to the raw material domain, both born-digital and printed, into a lexicon structure, aligning terminology from the different dictionaries as far as possible. The paper presents the main features of this approach, the data used to compile the terminological database, the procedure by which it was generated, and a mobile application for its use. The available terminological resources are described: paper dictionaries and digital resources related to the raw material domain, as well as general-lexicon morphological dictionaries. Resource preparation started with dictionary (retro)digitisation and corpus enlargement, followed by the addition of new Serbian terms to the general-lexicon dictionaries and the addition of bilingual terms. Dictionary development relies on corpus analysis, details of which are also presented. Usage examples, collocations and concordances play an important role in raw material terminology and have also been included in this research. Related issues discussed include collocation extraction methods, the use of domain labels, lexical and semantic relations, definitions and subentries.
Database: OpenAIRE