Community Recovery in Hypergraphs
Author: | Kwangjun Ahn, Kangwook Lee, Changho Suh |
---|---|
Year of publication: | 2019 |
Subject: |
FOS: Computer and information sciences
Theoretical computer science; Computer science; Computer Science - Information Theory; Machine Learning (stat.ML); 02 engineering and technology; 010501 environmental sciences; Library and Information Sciences; 01 natural sciences; Machine Learning (cs.LG); Bernoulli's principle; Statistics - Machine Learning; 0202 electrical engineering electronic engineering information engineering; Segmentation; Limit (mathematics); Cluster analysis; Scaling; 0105 earth and related environmental sciences; Social network; business.industry; Information Theory (cs.IT); Homogeneity (statistics); 020206 networking & telecommunications; Image segmentation; Computer Science Applications; Computer Science - Learning; Data point; Core (graph theory); Noise (video); business; Community recovery; MathematicsofComputing_DISCRETEMATHEMATICS; Information Systems |
Source: | Allerton |
ISSN: | 1557-9654 0018-9448 |
Description: | Community recovery is a central problem that arises in a wide variety of applications such as network clustering, motion segmentation, face clustering, and protein complex detection. The objective is to cluster data points into distinct communities based on a set of measurements, each of which is associated with the values of a certain number of data points. While most prior work focuses on a setting in which the number of data points involved in a measurement is two, this work explores a generalized setting in which the number can be more than two. Motivated by applications in machine learning and channel coding in particular, we consider two types of measurements: (1) a homogeneity measurement, which indicates whether or not the associated data points belong to the same community; (2) a parity measurement, which denotes the modulo-2 sum of the values of the data points. Such measurements are possibly corrupted by Bernoulli noise. We characterize the fundamental limits on the number of measurements required to reconstruct the communities for the considered models. 25 pages, 7 figures. Submitted to IEEE Transactions on Information Theory |
Database: | OpenAIRE |
External link: |
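
The two measurement types named in the description can be illustrated with a minimal Python sketch. It is not taken from the paper: the binary labels, the hyperedge size `d`, the sampling probability `p`, the flip probability `theta`, and all function names are assumptions chosen for illustration only.

```python
import random
from itertools import combinations

def homogeneity(labels, edge):
    """Return 1 if every data point in the hyperedge has the same label, else 0."""
    return int(len({labels[i] for i in edge}) == 1)

def parity(labels, edge):
    """Return the modulo-2 sum of the binary labels in the hyperedge."""
    return sum(labels[i] for i in edge) % 2

def observe(labels, d, p, theta, measure, rng):
    """Hypothetical observation model (an assumption, not the paper's exact setup):
    each size-d hyperedge is measured with probability p, and each measured
    value is flipped by Bernoulli(theta) noise."""
    observations = {}
    for edge in combinations(range(len(labels)), d):
        if rng.random() < p:
            noise = int(rng.random() < theta)
            observations[edge] = measure(labels, edge) ^ noise
    return observations

rng = random.Random(0)
labels = [0, 0, 1, 1, 0, 1]  # assumed ground-truth communities of six data points

# Noiseless, fully observed case: every triple is measured exactly.
hom = observe(labels, d=3, p=1.0, theta=0.0, measure=homogeneity, rng=rng)
par = observe(labels, d=3, p=1.0, theta=0.0, measure=parity, rng=rng)
```

Setting `theta` above zero flips some observed values, which is the regime in which the number of measurements needed for recovery becomes the quantity of interest.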