A Data Driven Approach for Raw Material Terminology
Author: | Ivan Babić, Ranka Stanković, Ljiljana Kolonja, Mihailo Škorić, Aleksandra Tomašević, Olivera Kitanović |
---|---|
Language: | English |
Year of publication: | 2021 |
Subject: |
Computer science
leksički podaci; 02 engineering and technology; mining; Collocation extraction; Lexicon; computer.software_genre; otvoreni povezani podaci; 01 natural sciences; lcsh:Technology; law.invention; Terminology; lcsh:Chemistry; Resource (project management); law; terminološka aplikacija; terminology; 0202 electrical engineering, electronic engineering, information engineering; General Materials Science; Instrumentation; lcsh:QH301-705.5; Digitization; Fluid Flow and Transfer Processes; General Engineering; rudarstvo; lcsh:QC1-999; Computer Science Applications; korpusi; raw material; rečnik; 020201 artificial intelligence & image processing; Hypertext; Natural language processing; linguistic linked open data; Domain (software engineering); mobilna aplikacija; 0101 mathematics; Structure (mathematical logic); business.industry; sirovine; digitizacija; lexical data; lcsh:T; Process Chemistry and Technology; 010102 general mathematics; mobile application; lcsh:Biology (General); lcsh:QD1-999; lcsh:TA1-2040; digitization; terminology application; Artificial intelligence; business; lcsh:Engineering (General). Civil engineering (General); computer; corpus data; lcsh:Physics; terminologija; dictionary |
Source: | Applied Sciences, Vol 11, Iss 7, p 2892 (2021) |
ISSN: | 2076-3417 |
Description: | The research presented in this paper aims to create a bilingual (sr-en), easily searchable, hypertext, born-digital, corpus-based terminological database of raw material terminology for dictionary production. The approach links dictionaries related to the raw material domain, both born-digital and printed, into a lexicon structure, aligning terminology from the different dictionaries as far as possible. The paper presents the main features of this approach, the data used to compile the terminological database, the procedure by which it was generated, and a mobile application for its use. The available terminological resources are described: paper dictionaries and digital resources related to the raw material domain, as well as morphological dictionaries of the general lexicon. Resource preparation started with dictionary (retro)digitisation and corpus enlargement, followed by the addition of new Serbian terms to the general-lexicon dictionaries and the addition of bilingual terms. Dictionary development relies on corpus analysis, details of which are also presented. Usage examples, collocations and concordances play an important role in raw material terminology and have also been included in this research. Related issues discussed include collocation extraction methods, the use of domain labels, lexical and semantic relations, definitions and subentries. |
Database: | OpenAIRE |
External link: |