Statistics - Theory Publications (50)


Statistics - Theory Publications

This paper is the first chapter of three of the author's undergraduate thesis. We study the random matrix ensemble of covariance matrices arising from random $(d_b, d_w)$-regular bipartite graphs on a set of $M$ black vertices and $N$ white vertices, for $d_b \gg \log^4 N$. We simultaneously prove that the Green's functions of these covariance matrices and the adjacency matrices of the underlying graphs agree with the corresponding limiting law (e. Read More

We introduce Lipschitz-Killing curvature (LKC) regression, a new method to produce $(1-\alpha)$ thresholds for signal detection in random fields that does not require knowledge of the spatial correlation structure. The idea is to fit observed empirical Euler characteristics to the Gaussian kinematic formula via generalized least squares, which quickly and easily provides statistical estimates of the LKCs --- complex topological quantities that can be extremely challenging to compute, both theoretically and numerically. With these estimates, we can then make use of a powerful parametric approximation via Euler characteristics for Gaussian random fields to generate accurate $(1-\alpha)$ thresholds and $p$-values. Read More

There is widespread sentiment that it is not possible to effectively utilize fast gradient methods (e.g. Nesterov's acceleration, conjugate gradient, heavy ball) for the purposes of stochastic optimization due to their instability and error accumulation, a notion made precise in d'Aspremont 2008 and Devolder, Glineur, and Nesterov 2014. Read More

This note proposes a consistent bootstrap-based distributional approximation for cube root consistent estimators such as the maximum score estimator of Manski (1975) and the isotonic density estimator of Grenander (1956). In both cases, the standard nonparametric bootstrap is known to be inconsistent. Our method restores consistency of the nonparametric bootstrap by altering the shape of the criterion function defining the estimator whose distribution we seek to approximate. Read More

In this paper we propose a new method of joint nonparametric estimation of probability density and its support. As is well known, nonparametric kernel density estimator has "boundary bias problem" when the support of the population density is not the whole real line. To avoid the unknown boundary effects, our estimator detects the boundary, and eliminates the boundary-bias of the estimator simultaneously. Read More

We propose new smoothed median and the Wilcoxon's rank sum test. As is pointed out by Maesono et al.(2016), some nonparametric discrete tests have a problem with their significance probability. Read More

Hypothesis testing in the linear regression model is a fundamental statistical problem. We consider linear regression in the high-dimensional regime where the number of parameters exceeds the number of samples ($p> n$) and assume that the high-dimensional parameters vector is $s_0$ sparse. We develop a general and flexible $\ell_\infty$ projection statistic for hypothesis testing in this model. Read More

In this article we explore an algorithm for diffeomorphic random sampling of nonuniform probability distributions on Riemannian manifolds. The algorithm is based on optimal information transport (OIT)---an analogue of optimal mass transport (OMT). Our framework uses the deep geometric connections between the Fisher-Rao metric on the space of probability densities and the right-invariant information metric on the group of diffeomorphisms. Read More

We offer an umbrella type result which extends the convergence of classical empirical process on the line to more general processes indexed by functions of bounded variation. This extension is not contingent on the type of dependence of the underlying sequence of random variables. As a consequence we establish the weak convergence for stationary empirical processes indexed by general classes of functions under alpha mixing conditions. Read More

Advances in mobile computing technologies have made it possible to monitor and apply data-driven interventions across complex systems in real time. Markov decision processes (MDPs) are the primary model for sequential decision problems with a large or indefinite time horizon. Choosing a representation of the underlying decision process that is both Markov and low-dimensional is non-trivial. Read More

We offer a general Bayes theoretic framework to tackle the model selection problem under a two-step prior design: the first-step prior serves to assess the model selection uncertainty, and the second-step prior quantifies the prior belief on the strength of the signals within the model chosen from the first step. We establish non-asymptotic oracle posterior contraction rates under (i) a new Bernstein-inequality condition on the log likelihood ratio of the statistical experiment, (ii) a local entropy condition on the dimensionality of the models, and (iii) a sufficient mass condition on the second-step prior near the best approximating signal for each model. The first-step prior can be designed generically. Read More

The multivariate linear regression model with shuffled data and additive Gaussian noise arises in various correspondence estimation and matching problems. Focusing on the denoising aspect of this problem, we provide a characterization the minimax error rate that is sharp up to logarithmic factors. We also analyze the performance of two versions of a computationally efficient estimator, and establish their consistency for a large range of input parameters. Read More

Additive regression provides an extension of linear regression by modeling the signal of a response as a sum of functions of covariates of relatively low complexity. We study penalized estimation in high-dimensional nonparametric additive regression where functional semi-norms are used to induce smoothness of component functions and the empirical $L_2$ norm is used to induce sparsity. The functional semi-norms can be of Sobolev or bounded variation types and are allowed to be different amongst individual component functions. Read More

Complex computer codes are often too time expensive to be directly used to perform uncertainty, sensitivity, optimization and robustness analyses. A widely accepted method to circumvent this problem consists in replacing cpu-time expensive computer models by cpu inexpensive mathematical functions, called metamodels. For example, the Gaussian process (Gp) model has shown strong capabilities to solve practical problems , often involving several interlinked issues. Read More

The multivariate linear regression model is an important tool for investigating relationships between several response variables and several predictor variables. The primary interest is in inference about the unknown regression coefficient matrix. We propose multivariate bootstrap techniques as a means for making inferences about the unknown regression coefficient matrix. Read More

This paper discusses direct sampling methods (i.e., non MCMC methods) from discrete target distributions. Read More

For linear inverse problem with Gaussian random noise we show that Tikhonov regularization algorithm is minimax in the class of linear estimators and is asymptotically minimax in the sense of sharp asymptotic in the class of all estimators. The results are valid if some a priori information on a Fourier coefficients of solution is provided. For trigonometric basis this a priori information implies that the solution belongs to a ball in Besov space $B^r_{2\infty}$. Read More

Variable clustering is one of the most important unsupervised learning methods, ubiquitous in most research areas. In the statistics and computer science literature, most of the clustering methods lead to non-overlapping partitions of the variables. However, in many applications, some variables may belong to multiple groups, yielding clusters with overlap. Read More

In this paper, a new goodness-of-fit test for a location-scale family based on progressively Type-II censored order statistics is proposed. Using Monte Carlo simulation studies, the present researchers have observed that the proposed test for normality is consistent and quite powerful in comparison with existing goodness-of-fit tests based on progressively Type-II censored data. Also, the new test statistic for a real data set is used and the results show that our new test statistic performs well. Read More

The models considered in this paper are a special subclass of Relational models which may be appropriate when a collection of independence statements must hold even after probabilities are re-scaled to sum to 1. After reviewing the basic properties of these models and deriving some new ones, two algorithms for computing maximum likelihood estimates are presented. Some new light is also thrown on the underlying geometry. Read More

We study the problem of testing for structure in networks using relations between the observed frequencies of small subgraphs. We consider the statistics \begin{align*} T_3 & =(\text{edge frequency})^3 - \text{triangle frequency}\\ T_2 & =3(\text{edge frequency})^2(1-\text{edge frequency}) - \text{V-shape frequency} \end{align*} and prove a central limit theorem for $(T_2, T_3)$ under an Erd\H{o}s-R\'{e}nyi null model. We then analyze the power of the associated $\chi^2$ test statistic under a general class of alternative models. Read More

In this paper, we propose several statistics for testing uniformity under progressive Type-I interval censoring. We obtain the critical points of these statistics and study the power of the proposed tests against a representative set of alternatives via simulation. Finally, we generalize our methods for continuous and completely specified distributions. Read More

Latent Block Model (LBM) is a model-based method to cluster simultaneously the $d$ columns and $n$ rows of a data matrix. Parameter estimation in LBM is a difficult and multifaceted problem. Although various estimation strategies have been proposed and are now well understood empirically, theoretical guarantees about their asymptotic behavior is rather sparse. Read More

An extensive empirical literature documents a generally negative correlation, named the "leverage effect" between asset returns and changes of volatility. It is more challenging to establish such a return-volatility relationship for jumps in high-frequency data. We propose new nonparametric methods to assess and test for a discontinuous leverage effect --- that is, a relation between contemporaneous jumps in prices and volatility --- in high-frequency data with market microstructure noise. Read More

We present a functional form of the Erd\"os-Renyi law of large numbers for Levy processes. Read More

This article improves the existing proven rates of regret decay in optimal policy estimation. We give a margin-free result showing that the regret decay for estimating a within-class optimal policy is second-order for empirical risk minimizers over Donsker classes, with regret decaying at a faster rate than the standard error of an efficient estimator of the value of an optimal policy. We also give a result from the classification literature that shows that faster regret decay is possible via plug-in estimation provided a margin condition holds. Read More

This paper deals with asymptotics for multiple-set linear canonical analysis (MSLCA). A definition of this analysis, that adapts the classical one to the context of Euclidean random variables, is given and properties of the related canonical coefficients are derived. Then, estimators of the MSLCA's elements, based on empirical covariance operators, are proposed and asymptotics for these estimators are obtained. Read More

Consider the classical Gaussian unitary ensemble of size $N$ and the real Wishart ensemble $W_N(n,I)$. In the limits as $N \to \infty$ and $N/n \to \gamma > 0$, the expected number of eigenvalues that exit the upper bulk edge is less than one, 0.031 and 0. Read More

In this paper, we discuss stochastic comparisons of parallel systems with independent heterogeneous exponentiated Nadarajah-Haghighi (ENH) components in terms of the usual stochastic order, dispersive order, convex transform order and the likelihood ratio order. In the presence of the Archimedean copula, we study stochastic comparison of series dependent systems in terms of the usual stochastic order. Read More

Distributional approximations of (bi--) linear functions of sample variance-covariance matrices play a critical role to analyze vector time series, as they are needed for various purposes, especially to draw inference on the dependence structure in terms of second moments and to analyze projections onto lower dimensional spaces as those generated by principal components. This particularly applies to the high-dimensional case, where the dimension $d$ is allowed to grow with the sample size $n$ and may even be larger than $n$. We establish large-sample approximations for such bilinear forms related to the sample variance-covariance matrix of a high-dimensional vector time series in terms of strong approximations by Brownian motions. Read More

We propose halfspace depth concepts for scatter, concentration and shape matrices. For scatter matrices, our concept extends the one from Chen, Gao and Ren (2015) to the non-centered case, and is in the same spirit as the one in Zhang (2002). Rather than focusing, as in these earlier works, on deepest scatter matrices, we thoroughly investigate the properties of the proposed depth and of the corresponding depth regions. Read More

Although there is no shortage of clustering algorithms proposed in the literature, the question of the most relevant strategy for clustering compositional data (i.e., data made up of profiles, whose rows belong to the simplex) remains largely unexplored in cases where the observed value of an observation is equal or close to zero for one or more samples. Read More

We consider the "searching for a trail in a maze" composite hypothesis testing problem, in which one attempts to detect an anomalous directed path in a lattice 2D box of side n based on observations on the nodes of the box. Under the signal hypothesis, one observes independent Gaussian variables of unit variance at all nodes, with zero, mean off the anomalous path and mean \mu_n on it. Under the null hypothesis, one observes i. Read More

The adaptive classification of the interference covariance matrix structure for radar signal processing applications is addressed in this paper. This represents a key issue because many detection architectures are synthesized assuming a specific covariance structure which may not necessarily coincide with the actual one due to the joint action of the system and environment uncertainties. The considered classification problem is cast in terms of a multiple hypotheses test with some nested alternatives and the theory of Model Order Selection (MOS) is exploited to devise suitable decision rules. Read More

Complex Ornstein-Uhlenbeck (OU) processes have various applications in statistical modelling. They play role e.g. Read More

Regularly varying stochastic processes model extreme dependence between process values at different locations and/or time points. For such processes we propose a two-step parameter estimation of the extremogram, when some part of the domain of interest is fixed and another increasing. We provide conditions for consistency and asymptotic normality of the empirical extremogram centred by a pre-asymptotic version for such observation schemes. Read More

A special class of standard Gaussian Autoregressive Hilbertian processes of order one (Gaussian ARH(1) processes), with bounded linear autocorrelation operator, which does not satisfy the usual Hilbert-Schmidt assumption, is considered. To compensate the slow decay of the diagonal coefficients of the autocorrelation operator, a faster decay velocity of the eigenvalues of the trace autocovariance operator of the innovation process is assumed. As usual, the eigenvectors of the autocovariance operator of the ARH(1) process are considered for projection, since, here, they are assumed to be known. Read More

We propose a new mathematical model for $n-k$-dimensional non-linear correlations with intrinsic scatter in $n$-dimensional data. The model is based on Riemannian geometry, and is naturally invariant under coordinate transformations. We combine the model with a Bayesian approach for estimating the parameters of the correlation relation and the intrinsic scatter. Read More

Speckle reduction is a longstanding topic in synthetic aperture radar (SAR) imaging. Since most current and planned SAR imaging satellites operate in polarimetric, interferometric or tomographic modes, SAR images are multi-channel and speckle reduction techniques must jointly process all channels to recover polarimetric and interferometric information. The distinctive nature of SAR signal (complex-valued, corrupted by multiplicative fluctuations) calls for the development of specialized methods for speckle reduction. Read More

We consider a robust analog of the planted clique problem. In this analog, a set $S$ of vertices is chosen and all edges in $S$ are included; then, edges between $S$ and the rest of the graph are included with probability $\frac{1}{2}$, while edges not touching $S$ are allowed to vary arbitrarily. For this semi-random model, we show that the information-theoretic threshold for recovery is $\tilde{\Theta}(\sqrt{n})$, in sharp contrast to the classical information-theoretic threshold of $\Theta(\log(n))$. Read More

In this paper, we propose a new method for estimation and constructing confidence intervals for low-dimensional components in a high-dimensional model. The proposed estimator, called Constrained Lasso (CLasso) estimator, is obtained by simultaneously solving two estimating equations---one imposing a zero-bias constraint for the low-dimensional parameter and the other forming an $\ell_1$-penalized procedure for the high-dimensional nuisance parameter. By carefully choosing the zero-bias constraint, the resulting estimator of the low dimensional parameter is shown to admit an asymptotically normal limit attaining the Cram\'{e}r-Rao lower bound in a semiparametric sense. Read More

In this work we define log-linear models to compare several square contingency tables under the quasi-independence or the quasi-symmetry model, and the relevant Markov bases are theoretically characterized. Through Markov bases, an exact test to evaluate if two or more tables fit a common model is introduced. Two real-data examples illustrate the use of these models in different fields of applications. Read More

In this paper, we revisit the recently established theoretical guarantees for the convergence of the Langevin Monte Carlo algorithm of sampling from a smooth and (strongly) log-concave density. We improve the existing results when the convergence is measured in the Wasserstein distance and provide further insights on the very tight relations between, on the one hand, the Langevin Monte Carlo for sampling and, on the other hand, the gradient descent for optimization. Finally, we also establish guarantees for the convergence of a version of the Langevin Monte Carlo algorithm that is based on noisy evaluations of the gradient. Read More

Loosely speaking, the Shannon entropy rate is used to gauge a stochastic process' intrinsic randomness; the statistical complexity gives the cost of predicting the process. We calculate, for the first time, the entropy rate and statistical complexity of stochastic processes generated by finite unifilar hidden semi-Markov models---memoryful, state-dependent versions of renewal processes. Calculating these quantities requires introducing novel mathematical objects ({\epsilon}-machines of hidden semi-Markov processes) and new information-theoretic methods to stochastic processes. Read More

In this paper, we focus on the COM-type negative binomial distribution with three parameters, which belongs to COM-type $(a,b,0)$ class distributions and family of equilibrium distributions of arbitrary birth-death process. Besides, we show abundant distributional properties such as overdispersion and underdispersion, log-concavity, log-convexity (infinite divisibility), pseudo compound Poisson, stochastic ordering and asymptotic approximation. Some characterizations including sum of equicorrelated geometrically distributed r. Read More

This paper investigates the problem of detecting relevant change points in the mean vector, say $\mu_t =(\mu_{1,t},\ldots ,\mu_{d,t})^T$ of a high dimensional time series $(Z_t)_{t\in \Z}$. While the recent literature on testing for change points in this context considers hypotheses for the equality of the means $\mu_h^{(1)}$ and $\mu_h^{(2)}$ before and after the change points in the different components, we are interested in a null hypothesis of the form $$ H_0: |\mu^{(1)}_{h} - \mu^{(2)}_{h} | \leq \Delta_h ~~~\mbox{ for all } ~~h=1,\ldots ,d $$ where $\Delta_1, \ldots , \Delta_d$ are given thresholds for which a smaller difference of the means in the $h$-th component is considered to be non-relevant. We propose a new test for this problem based on the maximum of squared and integrated CUSUM statistics and investigate its properties as the sample size $n$ and the dimension $d$ both converge to infinity. Read More

This paper continues the research started in \cite{LW16}. In the framework of the convolution structure density model on $\bR^d$, we address the problem of adaptive minimax estimation with $\bL_p$--loss over the scale of anisotropic Nikol'skii classes. We fully characterize the behavior of the minimax risk for different relationships between regularity parameters and norm indexes in the definitions of the functional class and of the risk. Read More

We study the problem of nonparametric estimation under $\bL_p$-loss, $p\in [1,\infty)$, in the framework of the convolution structure density model on $\bR^d$. This observation scheme is a generalization of two classical statistical models, namely density estimation under direct and indirect observations. In Part I the original pointwise selection rule from a family of "kernel-type" estimators is proposed. Read More

Binomial random intersection graphs can be used as parsimonious statistical models of large and sparse networks, with one parameter for the average degree and another for transitivity, the tendency of neighbours of a node to be connected. This paper discusses the estimation of these parameters from a single observed instance of the graph, using moment estimators based on observed degrees and frequencies of 2-stars and triangles. The observed data set is assumed to be a subgraph induced by a set of $n_0$ nodes sampled from the full set of $n$ nodes. Read More