i-Dataquest: A heterogeneous information retrieval tool using data graph for the manufacturing industry
Author: | Philippe Veron, Frédéric Segonds, Esma Yahia, Lise Kim, Antoine Mallet |
---|---|
Contributors: | Laboratoire d’Ingénierie des Systèmes Physiques et Numériques (LISPEN), Arts et Métiers Sciences et Technologies, HESAM Université (HESAM)-HESAM Université (HESAM), Laboratoire Conception de Produits et Innovation (LCPI), Capgemini [Toulouse], Capgemini |
Language: | English |
Year of publication: | 2023 |
Subject: |
Base de données [Informatique]; General Computer Science; Exploit; Process (engineering); Computer science; Context (language use); 02 engineering and technology; 03 medical and health sciences; Manufacturing Industry; 0202 electrical engineering, electronic engineering, information engineering; [INFO]Computer Science [cs]; 030304 developmental biology; 0303 health sciences; Information retrieval; [INFO.INFO-DB]Computer Science [cs]/Databases [cs.DB]; Graph Database System Generation; General Engineering; Query System; 020207 software engineering; Unstructured data; Manufacturing Data; Informatique; Data structure; [INFO.INFO-IR]Computer Science [cs]/Information Retrieval [cs.IR]; Information Retrieval; Graph (abstract data type); Recherche d'information [Informatique]; Precision and recall |
Source: | Computers in Industry, Elsevier, 2021, 132, pp. 103527. ⟨10.1016/j.compind.2021.103527⟩ |
ISSN: | 0166-3615 |
Description: | The manufacturing industry needs access to data in order to carry out its activities and to generate new value-added knowledge. However, it faces a large and growing volume of heterogeneous data, which limits its ability to exploit that data optimally. Moreover, the data are distributed across different heterogeneous information systems, which limits the exploration of relationships during the information retrieval process. The challenge is usually addressed by trying to manage and normalize the data structure so that the data can be searched and exploited faster in a manufacturing context. The authors instead present i-Dataquest, an information retrieval system supported by (i) a graph-oriented model built from the structured and unstructured data of the company and (ii) a query system that answers ‘what’ and ‘about what’ and (iii) generates three different kinds of results: a list of items, a list of property values and a list of sentences. The i-Dataquest prototype is built using Neo4j for the graph system generation, ConceptNet for lexical resource management and StanfordNLP for natural language processing. The prototype’s performance is evaluated on a data set representing a drone manufacturer. The results show that transforming specific content such as tables into the graph and semantically expanding queries significantly improve the recall and precision measures. The results also suggest that the filtering of less relevant results could be improved, particularly for queries looking for a specific value. |
Database: | OpenAIRE |