i-Dataquest: A heterogeneous information retrieval tool using data graph for the manufacturing industry

Authors: Philippe Veron, Frédéric Segonds, Esma Yahia, Lise Kim, Antoine Mallet
Contributors: Laboratoire d’Ingénierie des Systèmes Physiques et Numériques (LISPEN), Arts et Métiers Sciences et Technologies, HESAM Université (HESAM), Laboratoire Conception de Produits et Innovation (LCPI), Capgemini [Toulouse], Capgemini
Language: English
Publication year: 2023
Subject:
Source: Computers in Industry
Computers in Industry, Elsevier, 2021, 132, pp.103527. ⟨10.1016/j.compind.2021.103527⟩
ISSN: 0166-3615
Description: The manufacturing industry needs access to data in order to carry out its activities and also to generate new value-added knowledge. Nevertheless, it is confronted with a large and growing volume of heterogeneous data, which limits its ability to exploit them optimally. Moreover, the data are distributed across different heterogeneous information systems, which limits the exploration of relationships during the information retrieval process. Usually, the challenge is addressed by trying to manage and normalize the data structure so that the data can be searched and exploited faster in a manufacturing context. For their part, the authors present i-Dataquest, an information retrieval system supported by (i) a graph-oriented model built from the structured and unstructured data of the company, and (ii) a query system answering ‘what’ and ‘about what’ questions and (iii) generating three different types of results: a list of items, a list of property values and a list of sentences. The i-Dataquest prototype is built using Neo4j for the graph system generation, ConceptNet for lexical resource management and StanfordNLP for natural language processing. The prototype’s performance is evaluated on a data set representing a drone manufacturer. The results show that the transformation of specific content such as tables into the graph and the semantic expansion of queries significantly improve the recall and precision measures. The results also suggest that the filtering of less relevant results could be improved, particularly for queries looking for a specific value.
Database: OpenAIRE