Formulation of the Tree Approximation Problem as a Detection Problem and Relation between the AUC and Information Theory Divergences
Autor: | Navid Tafaghodi Khajavi, Anthony Kuh |
---|---|
Rok vydání: | 2015 |
Předmět: |
Big Data
Mathematical optimization AUC Kullback–Leibler divergence Computational complexity theory Computer science Inference Detection problem 02 engineering and technology Information theory Belief propagation 01 natural sciences 010104 statistics & probability 0202 electrical engineering electronic engineering information engineering Applied mathematics Graphical model Smart Grid 0101 mathematics Divergence (statistics) General Environmental Science Approximation algorithm 020206 networking & telecommunications Graph Tree (data structure) Tree structure Distributed algorithm Tree approximation General Earth and Planetary Sciences |
Zdroj: | INNS Conference on Big Data |
ISSN: | 1877-0509 |
DOI: | 10.1016/j.procs.2015.07.302 |
Popis: | This paper looks at graphical models and discusses the quality of tree approximations by ex- amining information measures and formulating the problem as a detection problem. One of the widely used algorithms for tree-structured approximation and modeling is the Chow-Liu algo- rithm. While this algorithm is optimal for Gaussian distributions in the sense of the Kullback- Leibler (KL) divergence, it is not optimal when compared with other information divergences and criteria such as Area Under the detection Curve (AUC) and reverse KL divergence. In this paper, we discuss the optimality of tree approximation methods. We show that different information theory divergences and criteria such as the KL divergence, the Jeffreys divergence and the AUC are all related of the correlation approximation matrix (CAM), Δ. We also show some explicit relations between these different information divergences and criteria and investigate the relation between quality of the tree approximation by formulating a detection problem and considering the AUC and the Jefferys divergence which is a distance between two conditional means, as alternative approaches for the tree approximation. The tree structure approximation algorithms have interesting applications. In general, the tree structured enables us to do distributed algorithms such as belief propagation and also to do inference. Because of computational complexity, it is important to consider simpler graphical models such as trees when modeling systems for many applications. We previously discuss the problem of convergence of the distributed state estimation algorithm for electric distribution grids, “mi- crogrids,” with distributed renewable energy generation. The correlations between distributed renewable energy generators was approximated by a simpler tree-structured graphical model. This paper carries the research further by looking at real spatial solar irradiation data and approximating correlation matrices by a tree structured graph. We also conduct simulations on synthetic data with larger number of nodes. |
Databáze: | OpenAIRE |
Externí odkaz: |