On approximation measures for functional dependencies

Authors: Chris Giannella, Edward L. Robertson
Year of publication: 2004
Subject:
Source: Information Systems. 29:483-507
ISSN: 0306-4379
Description: We examine the issue of how to measure the degree to which a functional dependency (FD) is approximate. The primary motivation lies in the fact that approximate FDs represent potentially interesting patterns existing in a table. Their discovery is a valuable data mining problem. However, before algorithms can be developed, a measure must be defined quantifying their approximation degree. First, we develop an approximation measure by axiomatizing the following intuition: the degree to which X → Y is approximate in a table T is the degree to which T determines a function from Π_X(T) to Π_Y(T). We prove that a unique unnormalized measure satisfies these axioms up to a multiplicative constant. Next, we compare the measure developed with two other measures from the literature. In all but one case, we show that the measures can be made to differ as much as possible within normalization. We examine these measures on several real datasets and observe that many of the theoretically possible extreme differences are not borne out. We offer some conclusions as to particular situations where certain measures are more appropriate than others.
Database: OpenAIRE
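
Note: The intuition stated in the description, that X → Y holds to the degree that T determines a function from Π_X(T) to Π_Y(T), can be made concrete with a small sketch. The code below does not implement the axiomatized measure from the paper, whose formula the record does not give. As an assumption for illustration only, it computes a simpler and widely used quantity: the minimum fraction of rows that would have to be removed for the FD to hold exactly. The function name `fd_violation_fraction` and the sample table are hypothetical.

```python
from collections import Counter, defaultdict


def fd_violation_fraction(rows, lhs, rhs):
    """Minimum fraction of rows to delete so that lhs -> rhs holds exactly.

    Illustrative stand-in only; NOT the measure derived in the paper.
    rows: list of dicts (one per table row); lhs, rhs: lists of column names.
    """
    if not rows:
        return 0.0  # an empty table satisfies every FD

    # For each X-value, count how often each Y-value occurs with it.
    groups = defaultdict(Counter)
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        groups[x][y] += 1

    # Keeping only the most frequent Y-value per X-value yields the
    # largest subtable in which X functionally determines Y.
    kept = sum(max(counts.values()) for counts in groups.values())
    return 1.0 - kept / len(rows)


# Hypothetical sample data: one misspelled city breaks zip -> city.
table = [
    {"zip": "47401", "city": "Bloomington"},
    {"zip": "47401", "city": "Bloomington"},
    {"zip": "47401", "city": "Bloomingtn"},
    {"zip": "10001", "city": "New York"},
]

print(fd_violation_fraction(table, ["zip"], ["city"]))  # 0.25
```

A value of 0 means the table determines an exact function from the X-values to the Y-values; larger values mean more rows conflict with any such function.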