Community Recovery in Hypergraphs

Authors: Kwangjun Ahn, Kangwook Lee, Changho Suh
Year of publication: 2019
Subject:
FOS: Computer and information sciences
Theoretical computer science
Computer science
Computer Science - Information Theory
Machine Learning (stat.ML)
02 engineering and technology
010501 environmental sciences
Library and Information Sciences
01 natural sciences
Machine Learning (cs.LG)
Bernoulli's principle
Statistics - Machine Learning
0202 electrical engineering
electronic engineering
information engineering

Segmentation
Limit (mathematics)
Cluster analysis
Scaling
0105 earth and related environmental sciences
Social network
business.industry
Information Theory (cs.IT)
Homogeneity (statistics)
020206 networking & telecommunications
Image segmentation
Computer Science Applications
Computer Science - Learning
Data point
Core (graph theory)
Noise (video)
business
Community recovery
MathematicsofComputing_DISCRETEMATHEMATICS
Information Systems
Source: Allerton
ISSN: 1557-9654
0018-9448
Description: Community recovery is a central problem that arises in a wide variety of applications such as network clustering, motion segmentation, face clustering and protein complex detection. The objective is to cluster data points into distinct communities based on a set of measurements, each of which is associated with the values of a certain number of data points. While most prior work focuses on a setting in which each measurement involves two data points, this work explores a generalized setting in which a measurement can involve more than two. Motivated by applications in machine learning and channel coding in particular, we consider two types of measurements: (1) a homogeneity measurement, which indicates whether or not the associated data points belong to the same community; (2) a parity measurement, which denotes the modulo-2 sum of the values of the data points. Such measurements may be corrupted by Bernoulli noise. We characterize the fundamental limits on the number of measurements required to reconstruct the communities for the considered models.
25 pages, 7 figures. Submitted to IEEE Transactions on Information Theory
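The two measurement types named in the description can be illustrated with a minimal Python sketch. It is not the authors' code. The abstract does not specify several things, which are assumptions made here for illustration: binary labels, the measurement size d, sampling each size-d group independently with probability p, and a flip probability theta for the Bernoulli noise. All function names are hypothetical.

```python
import random
from itertools import combinations


def homogeneity(labels, edge):
    """Return 1 if every point in the group has the same label, else 0."""
    return int(len({labels[i] for i in edge}) == 1)


def parity(labels, edge):
    """Return the modulo-2 sum of the (binary) labels in the group."""
    return sum(labels[i] for i in edge) % 2


def observe(labels, d, p, theta, measure, rng):
    """Collect noisy measurements over groups of d data points.

    Assumed sampling model (not from the abstract): each size-d group
    is observed independently with probability p, and each observed
    value is flipped independently with probability theta.
    """
    observations = {}
    for edge in combinations(range(len(labels)), d):
        if rng.random() < p:
            noise = int(rng.random() < theta)  # Bernoulli(theta) flip
            observations[edge] = measure(labels, edge) ^ noise
    return observations


if __name__ == "__main__":
    rng = random.Random(0)
    labels = [0, 0, 1, 1, 0, 1]
    obs = observe(labels, d=3, p=0.5, theta=0.1, measure=parity, rng=rng)
    for edge, value in sorted(obs.items()):
        print(edge, value)
```

Setting d=2 gives the pairwise setting that the description says most prior work studies. Recovering the labels from such observations is the problem whose measurement requirements the paper characterizes, and this sketch does not attempt it.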
Database: OpenAIRE