Showing 1 - 10 of 125 for search: '"Ke, Zheng Tracy"'
We are interested in the problem of two-sample network hypothesis testing: given two networks with the same set of nodes, we wish to test whether the underlying Bernoulli probability matrices of the two networks are the same or not. We propose Interl
External link:
http://arxiv.org/abs/2408.06987
Author:
Ke, Zheng Tracy, Wang, Jingming
Published in:
MDPI Mathematics, 2024
Topic modeling is a widely utilized tool in text analysis. We investigate the optimal rate for estimating a topic model. Specifically, we consider a scenario with $n$ documents, a vocabulary of size $p$, and document lengths at the order $N$. When $N
External link:
http://arxiv.org/abs/2405.17806
Given a $K$-vertex simplex in a $d$-dimensional space, suppose we measure $n$ points on the simplex with noise (hence, some of the observed points fall outside the simplex). Vertex hunting is the problem of estimating the $K$ vertices of the simplex.
External link:
http://arxiv.org/abs/2403.11013
Published in:
Annual Review of Statistics and Its Application 2024 11:1
Text analysis is an interesting research area in data science and has various applications, such as in artificial intelligence, biomedical research, and engineering. We review popular methods for text analysis, ranging from topic modeling to the rece
External link:
http://arxiv.org/abs/2401.00775
Subject clustering (i.e., the use of measured features to cluster subjects, such as patients or cells, into multiple groups) is a problem of great interest. In recent years, many approaches were proposed, among which unsupervised deep learning (UDL)
External link:
http://arxiv.org/abs/2306.05363
How to detect a small community in a large network is an interesting problem, including clique detection as a special case, where a naive degree-based $\chi^2$-test was shown to be powerful in the presence of an Erd\H{o}s-R\'enyi background. Using Sink
External link:
http://arxiv.org/abs/2303.05024
Motivated by applications in text mining and discrete distribution inference, we investigate the testing for equality of probability mass functions of $K$ groups of high-dimensional multinomial distributions. A test statistic, which is shown to have
External link:
http://arxiv.org/abs/2301.01381
Author:
Ke, Zheng Tracy, Wang, Jingming
Real networks often have severe degree heterogeneity, with the maximum, average, and minimum node degrees differing significantly. This paper examines the impact of degree heterogeneity on statistical limits of network data analysis. Introducing the
External link:
http://arxiv.org/abs/2204.12087
We collected and cleaned a large data set on publications in statistics. The data set consists of the coauthor relationships and citation relationships of 83,331 papers published in 36 representative journals in statistics, probability, and machine
External link:
http://arxiv.org/abs/2204.11194
Author:
Cammarata, Louis, Ke, Zheng Tracy
The mixed-membership stochastic block model (MMSBM) is a common model for social networks. Given an $n$-node symmetric network generated from a $K$-community MMSBM, we would like to test $K=1$ versus $K>1$. We first study the degree-based $\chi^2$ te
External link:
http://arxiv.org/abs/2204.11109