A Scientific Document Retrieval and Reordering Method by Incorporating HFS and LSD

Author: Ziyang Feng, Xuedong Tian
Language: English
Year of publication: 2023
Subject:
Source: Applied Sciences, Vol 13, Iss 20, p 11207 (2023)
Document type: article
ISSN: 2076-3417
DOI: 10.3390/app132011207
Description: Scientific document retrieval that accounts for both the mathematical expressions in a document and the semantics of the surrounding text has become an unavoidable trend. Current scientific document matching models focus solely on the textual features of expressions, and in pursuit of better performance they frequently run into growing parameter counts and slow inference. To address these problems, this paper proposes a scientific document retrieval method based on hesitant fuzzy sets (HFS) and local semantic distillation (LSD). First, to extract both spatial and semantic features for each symbol within a mathematical expression, the paper introduces an expression analysis module that uses HFS to build feature indices. Second, to improve contextual semantic alignment, knowledge distillation is used to refine the pretrained language model and to build a twin network for semantic matching. Finally, combining the mathematical expression features with the contextual semantic features makes the retrieval results more efficient and more reasonable. Experiments were conducted on the NTCIR dataset and on an expanded Chinese dataset. The average MAP for mathematical expression retrieval was 83.0%, and the average nDCG for ranking scientific documents was 85.8%.
Database: Directory of Open Access Journals
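
Note on the metrics: the MAP and nDCG figures quoted in the description are standard ranking measures. The sketch below shows how such scores are conventionally computed. It is not the authors' evaluation code; the function names and the sample relevance lists are illustrative assumptions, and average precision here assumes that every relevant document appears somewhere in the ranked list.

```python
import math


def average_precision(ranked_relevance):
    """Average precision for one query.

    ranked_relevance: list of 0/1 flags in rank order (1 = relevant).
    Assumption: all relevant documents appear in the list.
    """
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(ranked_relevance, start=1):
        if rel:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(queries):
    """Mean of average precision over a list of per-query relevance lists."""
    if not queries:
        return 0.0
    return sum(average_precision(q) for q in queries) / len(queries)


def dcg(gains):
    """Discounted cumulative gain with a log2(rank + 1) discount."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))


def ndcg(gains):
    """DCG normalized by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(gains, reverse=True))
    return dcg(gains) / ideal if ideal > 0 else 0.0


if __name__ == "__main__":
    # Hypothetical relevance judgments for two queries.
    print(mean_average_precision([[1, 0, 1], [0, 1]]))
    # Hypothetical graded relevance for one ranked result list.
    print(ndcg([2, 3, 1]))
```

MAP rewards placing relevant items early under binary relevance, while nDCG handles graded relevance and compares the produced ordering against the best possible ordering of the same items.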