Description: |
Declarative rules, such as Prolog and Datalog rules, are common formalisms for expressing expert knowledge and facts. They play an important role in Knowledge Graph (KG) construction and completion. Such rules not only encode expert background knowledge and the relational patterns in the data, but also allow new knowledge and insights to be inferred from them. Formalizing rules by hand is often laborious, and learning them automatically from data can ease this process. Current approaches resort to exhaustive search of the rule hypothesis space, guided by a number of heuristics and syntactic restrictions on the rule language, which impacts both the efficiency and the quality of the resulting rules. In this paper, we extend the rule hypothesis space from the usual path rules to the general Datalog rule space by proposing a novel Genetic Logic Programming algorithm named Evoda. It is an iterative process that learns high-quality rules over large-scale KGs in a matter of seconds. We have performed experiments over multiple real-world KGs with various evaluation metrics to show its ability to mine higher-quality rules and make more precise predictions. Additionally, we have applied it to KG completion tasks to illustrate its competitiveness with several state-of-the-art embedding-based or neural models. The experiments demonstrate the feasibility, effectiveness and efficiency of the Evoda algorithm. |
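
To make the distinction between path rules and general Datalog rules concrete, the following minimal Python sketch evaluates two rules over a toy KG and scores each by support and standard confidence. It is an illustration only, not the Evoda algorithm: the toy facts, the helper names (`match`, `body_bindings`, `evaluate`) and the choice of support and confidence as quality measures are assumptions made here, since the abstract does not name its metrics.

```python
# Toy knowledge graph as (subject, predicate, object) triples.
# All facts are invented for illustration.
KG = {
    ("anna", "bornIn", "paris"),
    ("anna", "livesIn", "paris"),
    ("anna", "citizenOf", "france"),
    ("ben", "bornIn", "rome"),
    ("ben", "livesIn", "berlin"),
    ("ben", "citizenOf", "germany"),
    ("paris", "locatedIn", "france"),
    ("rome", "locatedIn", "italy"),
}


def is_var(term):
    """Variables start with an uppercase letter, constants with lowercase."""
    return term[0].isupper()


def match(atom, bindings):
    """Yield every extension of `bindings` that makes `atom` a fact in KG."""
    pred, s, o = atom
    for fs, fp, fo in KG:
        if fp != pred:
            continue
        new = dict(bindings)
        consistent = True
        for term, value in ((s, fs), (o, fo)):
            if is_var(term):
                # Bind the variable, or check it against its earlier binding.
                if new.setdefault(term, value) != value:
                    consistent = False
                    break
            elif term != value:
                consistent = False
                break
        if consistent:
            yield new


def body_bindings(body):
    """Join the body atoms left to right (a naive nested-loop join)."""
    results = [{}]
    for atom in body:
        results = [ext for b in results for ext in match(atom, b)]
    return results


def evaluate(head, body):
    """Return (support, standard confidence) of the rule `head :- body`."""
    pred, s, o = head
    resolve = lambda term, b: b[term] if is_var(term) else term
    predictions = {(resolve(s, b), pred, resolve(o, b)) for b in body_bindings(body)}
    support = len(predictions & KG)  # predicted facts already in the KG
    confidence = support / len(predictions) if predictions else 0.0
    return support, confidence


# Path rule: livesIn(X, Y) :- bornIn(X, Y).
path_rule = (("livesIn", "X", "Y"), [("bornIn", "X", "Y")])

# General Datalog rule whose body is not a single chain from X to Y:
# livesIn(X, Y) :- bornIn(X, Y), locatedIn(Y, Z), citizenOf(X, Z).
general_rule = (
    ("livesIn", "X", "Y"),
    [("bornIn", "X", "Y"), ("locatedIn", "Y", "Z"), ("citizenOf", "X", "Z")],
)

path_score = evaluate(*path_rule)
general_score = evaluate(*general_rule)
```

On this toy data the extra body atoms of the general rule filter out the wrong prediction for `ben`, so its confidence is higher than that of the path rule at the same support. A rule learner that searches the general Datalog space can in principle find such rules, whereas a learner restricted to path rules cannot express them.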