Using maximum entry-wise deviation to test the goodness-of-fit for stochastic block models

The stochastic block model is widely used for detecting community structures in network data. How to test the goodness-of-fit of the model is one of the fundamental problems and has gained growing interests in recent years. In this paper, we propose a new goodness-of-fit test based on the maximum entry of the centered and re-scaled observed adjacency matrix for the stochastic block model in which the number of communities can be allowed to grow linearly with the number of nodes ignoring a logarithm factor. We demonstrate that its asymptotic null distribution is the Gumbel distribution. Our results can also be extended to the degree corrected block model. Numerical studies indicate the proposed method works well.

Comments: 23 pages

Similar Publications

Dynamic treatment regimes (DTRs) aim to formalize personalized medicine by tailoring treatment decisions to individual patient characteristics. G-estimation for DTR identification targets the parameters of a structural nested mean model known as the blip function from which the optimal DTR is derived. Despite considerable work deriving such estimation methods, there has been little focus on extending G-estimation to the case of non-additive effects, non-continuous outcomes or on model selection. Read More


Missing data are a common problem for both the construction and implementation of a prediction algorithm. Pattern mixture kernel submodels (PMKS) - a series of submodels for every missing data pattern that are fit using only data from that pattern - are a computationally efficient remedy for both stages. Here we show that PMKS yield the most predictive algorithm among all standard missing data strategies. Read More


The areas of model selection and model evaluation for predictive modeling have received extensive treatment in the statistics literature, leading to both theoretical advances and practical methods based on covariance penalties and other approaches. However, the majority of this work, and especially the practical approaches, are based on the "Fixed-X assumption", where covariate values are assumed to be non-random and known. By contrast, in most modern predictive modeling applications, it is more reasonable to take the "Random-X" view, where future prediction points are random and new. Read More


There is a growing demand for nonparametric conditional density estimators (CDEs) in fields such as astronomy and economics. In astronomy, for example, one can dramatically improve estimates of the parameters that dictate the evolution of the Universe by working with full conditional densities instead of regression (i.e. Read More


This note proposes a consistent bootstrap-based distributional approximation for cube root consistent estimators such as the maximum score estimator of Manski (1975) and the isotonic density estimator of Grenander (1956). In both cases, the standard nonparametric bootstrap is known to be inconsistent. Our method restores consistency of the nonparametric bootstrap by altering the shape of the criterion function defining the estimator whose distribution we seek to approximate. Read More


Hypothesis testing in the linear regression model is a fundamental statistical problem. We consider linear regression in the high-dimensional regime where the number of parameters exceeds the number of samples ($p> n$) and assume that the high-dimensional parameters vector is $s_0$ sparse. We develop a general and flexible $\ell_\infty$ projection statistic for hypothesis testing in this model. Read More


The missing phase problem in X-ray crystallography is commonly solved using the technique of molecular replacement, which borrows phases from a previously solved homologous structure, and appends them to the measured Fourier magnitudes of the diffraction patterns of the unknown structure. More recently, molecular replacement has been proposed for solving the missing orthogonal matrices problem arising in Kam's autocorrelation analysis for single particle reconstruction using X-ray free electron lasers and cryo-EM. In classical molecular replacement, it is common to estimate the magnitudes of the unknown structure as twice the measured magnitudes minus the magnitudes of the homologous structure, a procedure known as `twicing'. Read More


This paper develops a new family of estimators, MDPDEs, as a robust generalization of maximum likelihood estimator for the polytomous logistic regression model (PLRM) by using the DPD measure. Based on these estimators, the family of Wald-type test statistics for linear hypotheses is introduced and their robust properties are theoretically studied through the classical influence function analysis. Some numerical examples are presented to justify the requirement of a suitable robust statistical procedure of estimation in place of the MLE. Read More


This paper develops a new family of estimators, the minimum density power divergence estimators (MDPDEs), for the parameters of the one-shot device model as well as a new family of test statistics, Z-type test statistics based on MDPDEs, for testing the corresponding model parameters. The family of MDPDEs contains as a particular case the maximum likelihood estimator (MLE) considered in Balakrishnan and Ling (2012). Through a simulation study, it is shown that some MDPDEs have a better behavior than the MLE in relation to robustness. Read More


Hierarchical models for regionally aggregated disease incidence data commonly involve region specific latent random effects which are modelled jointly as having a multivariate Gaussian distribution. The covariance or precision matrix incorporates the spatial dependence between the regions. Common choices for the precision matrix include the widely used intrinsic conditional autoregressive model which is singular, and its nonsingular extension which lacks interpretability. Read More