Estimation of dense stochastic block models visited by random walks

Authors: Tran, Viet-Chi; Vo Thi Phuong, Thuy
Contributors: Laboratoire d'Analyse et de Mathématiques Appliquées (LAMA), Université Paris-Est Marne-la-Vallée (UPEM)-Fédération de Recherche Bézout-Université Paris-Est Créteil Val-de-Marne - Paris 12 (UPEC UP12)-Centre National de la Recherche Scientifique (CNRS), Laboratoire Analyse, Géométrie et Applications (LAGA), Université Paris 8 Vincennes-Saint-Denis (UP8)-Université Paris 13 (UP13)-Institut Galilée-Centre National de la Recherche Scientifique (CNRS), GdR GeoSto 3477, Chaire 'Modélisation Mathématique et Biodiversité' of Veolia Environnement-Ecole Polytechnique-Museum National d'Histoire Naturelle-Fondation X, ANR-18-CE02-0010,EcoNet,Modèles statistiques avancés pour les réseaux écologiques(2018), ANR-10-LABX-0058,Bézout,Models and algorithms: from the discrete to the continuous(2010), Centre National de la Recherche Scientifique (CNRS)-Université Paris-Est Créteil Val-de-Marne - Paris 12 (UPEC UP12)-Fédération de Recherche Bézout-Université Paris-Est Marne-la-Vallée (UPEM), Université Paris 8 Vincennes-Saint-Denis (UP8)-Centre National de la Recherche Scientifique (CNRS)-Institut Galilée-Université Paris 13 (UP13)
Language: English
Publication year: 2020
Subject:
Source: Electronic Journal of Statistics
Electronic Journal of Statistics, Shaker Heights, OH : Institute of Mathematical Statistics, 2021, 15 (2), pp.5855-5887. ⟨10.1214/21-EJS1899⟩
ISSN: 1935-7524
DOI: 10.1214/21-EJS1899
Description: International audience; We are interested in recovering information on a stochastic block model from the subgraph discovered by an exploring random walk. Stochastic block models correspond to populations structured into a finite number of types, where two individuals are connected by an edge independently of the other pairs and with a probability depending on their types. We consider here the dense case, where the random network can be approximated by a graphon. This problem is motivated by the study of chain-referral surveys, where each interviewee provides information on her/his contacts in the social network. First, we write the likelihood of the subgraph discovered by the random walk: biases appear, since hubs and majority types are more likely to be sampled. Even when the types are observed, the maximum likelihood estimator is no longer explicit. When the types of the vertices are unobserved, we use an SAEM algorithm to maximize the likelihood. Second, we propose a different estimation strategy using new results by Athreya and Roellin. It consists in de-biasing the maximum likelihood estimator proposed in Daudin et al., which ignores these biases.
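Illustration: the sampling scheme in the description above can be sketched in a few lines of Python. This is only a minimal simulation under stated assumptions, not the authors' estimation procedure: the function names (`sample_sbm`, `random_walk`), the two-type parameters, and the reading of the "discovered subgraph" as the subgraph induced on the visited vertices are all hypothetical choices made for this example.

```python
import random


def sample_sbm(n, proportions, connect, rng):
    """Draw vertex types and neighbour sets from a stochastic block model."""
    types = rng.choices(range(len(proportions)), weights=proportions, k=n)
    neighbours = [set() for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Each pair is connected independently of the other pairs,
            # with a probability that depends only on the two types.
            if rng.random() < connect[types[i]][types[j]]:
                neighbours[i].add(j)
                neighbours[j].add(i)
    return types, neighbours


def random_walk(neighbours, start, steps, rng):
    """Simple random walk; stops early if it reaches an isolated vertex."""
    path = [start]
    for _ in range(steps):
        current = sorted(neighbours[path[-1]])
        if not current:
            break
        path.append(rng.choice(current))
    return path


rng = random.Random(0)
proportions = [0.7, 0.3]            # assumed type proportions
connect = [[0.6, 0.1], [0.1, 0.3]]  # assumed dense connection probabilities
types, neighbours = sample_sbm(200, proportions, connect, rng)

path = random_walk(neighbours, start=0, steps=50, rng=rng)
visited = set(path)

# Assumption: the discovered subgraph is the one induced on visited vertices.
discovered_edges = {
    (i, j) for i in visited for j in neighbours[i] if j in visited and i < j
}
```

Comparing the types and degrees of the vertices in `visited` with those of the whole graph gives a quick empirical view of the sampling biases mentioned in the description.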
Database: OpenAIRE