Formulation of the Tree Approximation Problem as a Detection Problem and Relation between the AUC and Information Theory Divergences

Autor: Navid Tafaghodi Khajavi, Anthony Kuh
Rok vydání: 2015
Předmět:
Zdroj: INNS Conference on Big Data
ISSN: 1877-0509
DOI: 10.1016/j.procs.2015.07.302
Popis: This paper looks at graphical models and discusses the quality of tree approximations by ex- amining information measures and formulating the problem as a detection problem. One of the widely used algorithms for tree-structured approximation and modeling is the Chow-Liu algo- rithm. While this algorithm is optimal for Gaussian distributions in the sense of the Kullback- Leibler (KL) divergence, it is not optimal when compared with other information divergences and criteria such as Area Under the detection Curve (AUC) and reverse KL divergence. In this paper, we discuss the optimality of tree approximation methods. We show that different information theory divergences and criteria such as the KL divergence, the Jeffreys divergence and the AUC are all related of the correlation approximation matrix (CAM), Δ. We also show some explicit relations between these different information divergences and criteria and investigate the relation between quality of the tree approximation by formulating a detection problem and considering the AUC and the Jefferys divergence which is a distance between two conditional means, as alternative approaches for the tree approximation. The tree structure approximation algorithms have interesting applications. In general, the tree structured enables us to do distributed algorithms such as belief propagation and also to do inference. Because of computational complexity, it is important to consider simpler graphical models such as trees when modeling systems for many applications. We previously discuss the problem of convergence of the distributed state estimation algorithm for electric distribution grids, “mi- crogrids,” with distributed renewable energy generation. The correlations between distributed renewable energy generators was approximated by a simpler tree-structured graphical model. This paper carries the research further by looking at real spatial solar irradiation data and approximating correlation matrices by a tree structured graph. We also conduct simulations on synthetic data with larger number of nodes.
Databáze: OpenAIRE