Popis: |
We compute and evaluate relevance scores for knowledge-base triples from type-like relations. Such a score measures the degree to which an entity "belongs" to a type. For example, Quentin Tarantino has various professions, including Film Director, Screenwriter, and Actor. The first two would get a high score in our setting, because those are his main professions. The third would get a low score, because he mostly had cameo appearances in his own movies. Such scores are essential in the ranking for entity queries, e.g. "American actors" or "Quentin Tarantino professions". These scores are different from scores for "correctness" or "accuracy" (all three professions above are correct and accurate). We propose a variety of algorithms to compute these scores. For our evaluation we designed a new benchmark, which includes a ground truth based on about 14K human judgments obtained via crowdsourcing. Inter-judge agreement is slightly over 90%. Existing approaches from the literature give results far from the optimum. Our best algorithms achieve an agreement of about 80% with the ground truth. |