Comparison and Evaluation of Different Methods for the Feature Extraction from Educational Contents

Authors:	Jose Aguilar, Camilo Salazar, Henry Velasco, Julian Monsalve-Pulido, Edwin Montoya
Language:	English
Publication year:	2020
Subject:	feature extraction; content analysis; educational contents; semantic representation; information retrieval; recommendation system; Electronic computers. Computer science; QA75.5-76.95
Source:	Computation, Vol 8, Iss 2, p 30 (2020)
Document type:	article
ISSN:	2079-3197
DOI:	10.3390/computation8020030
Description:	This paper analyses the capabilities of different techniques for building a semantic representation of educational digital resources. Educational digital resources are modeled using the Learning Object Metadata (LOM) standard, and their semantic representations can be obtained from different LOM fields, such as the title and description, in order to extract the features/characteristics of the digital resources. The feature extraction methods used in this paper are Best Matching 25 (BM25), Latent Semantic Analysis (LSA), Doc2Vec, and Latent Dirichlet Allocation (LDA). The features/descriptors they generate are tested on three types of educational digital resources (scientific publications, learning objects, and patents), on a paraphrase corpus, and in two use cases: an information retrieval context and an educational recommendation system. The analysis uses unsupervised metrics, namely two similarity functions and entropy, to determine the quality of the features produced by each method. In addition, the paper presents tests of the techniques for the classification of paraphrases. The experiments show that the performance of the feature extraction methods varies considerably with the type of content and the metric: in some cases a method outperforms the others, and in other cases the reverse holds.
Database:	Directory of Open Access Journals
External link:	https://doaj.org/article/dcc7199d9ab2475db7b4716925498ef8