# Guaranteed Tensor PCA with Optimality in Statistics and Computation

Tensors, or high-order arrays, attract much attention in recent research. In this paper, we propose a general framework for tensor principal component analysis (tensor PCA), which focuses on the methodology and theory for extracting the hidden low-rank structure from the high-dimensional tensor data. A unified solution is provided for tensor PCA with considerations in both statistical limits and computational costs. The problem exhibits three different phases according to the signal-noise-ratio (SNR). In particular, with strong SNR, we propose a fast spectral power iteration method that achieves the minimax optimal rate of convergence in estimation; with weak SNR, the information-theoretical lower bound shows that it is impossible to have consistent estimation in general; with moderate SNR, we show that the non-convex maximum likelihood estimation provides optimal solution, but with NP-hard computational cost; moreover, under the hardness hypothesis of hypergraphic planted clique detection, there are no polynomial-time algorithms performing consistently in general. Simulation studies show that the proposed spectral power iteration method have good performance under a variety of settings.

## Similar Publications

There is widespread sentiment that it is not possible to effectively utilize fast gradient methods (e.g. Nesterov's acceleration, conjugate gradient, heavy ball) for the purposes of stochastic optimization due to their instability and error accumulation, a notion made precise in d'Aspremont 2008 and Devolder, Glineur, and Nesterov 2014. Read More

This note proposes a consistent bootstrap-based distributional approximation for cube root consistent estimators such as the maximum score estimator of Manski (1975) and the isotonic density estimator of Grenander (1956). In both cases, the standard nonparametric bootstrap is known to be inconsistent. Our method restores consistency of the nonparametric bootstrap by altering the shape of the criterion function defining the estimator whose distribution we seek to approximate. Read More

In this paper we propose a new method of joint nonparametric estimation of probability density and its support. As is well known, nonparametric kernel density estimator has "boundary bias problem" when the support of the population density is not the whole real line. To avoid the unknown boundary effects, our estimator detects the boundary, and eliminates the boundary-bias of the estimator simultaneously. Read More

We propose new smoothed median and the Wilcoxon's rank sum test. As is pointed out by Maesono et al.(2016), some nonparametric discrete tests have a problem with their significance probability. Read More

Hypothesis testing in the linear regression model is a fundamental statistical problem. We consider linear regression in the high-dimensional regime where the number of parameters exceeds the number of samples ($p> n$) and assume that the high-dimensional parameters vector is $s_0$ sparse. We develop a general and flexible $\ell_\infty$ projection statistic for hypothesis testing in this model. Read More

In this article we explore an algorithm for diffeomorphic random sampling of nonuniform probability distributions on Riemannian manifolds. The algorithm is based on optimal information transport (OIT)---an analogue of optimal mass transport (OMT). Our framework uses the deep geometric connections between the Fisher-Rao metric on the space of probability densities and the right-invariant information metric on the group of diffeomorphisms. Read More

We offer an umbrella type result which extends the convergence of classical empirical process on the line to more general processes indexed by functions of bounded variation. This extension is not contingent on the type of dependence of the underlying sequence of random variables. As a consequence we establish the weak convergence for stationary empirical processes indexed by general classes of functions under alpha mixing conditions. Read More

Advances in mobile computing technologies have made it possible to monitor and apply data-driven interventions across complex systems in real time. Markov decision processes (MDPs) are the primary model for sequential decision problems with a large or indefinite time horizon. Choosing a representation of the underlying decision process that is both Markov and low-dimensional is non-trivial. Read More

We offer a general Bayes theoretic framework to tackle the model selection problem under a two-step prior design: the first-step prior serves to assess the model selection uncertainty, and the second-step prior quantifies the prior belief on the strength of the signals within the model chosen from the first step. We establish non-asymptotic oracle posterior contraction rates under (i) a new Bernstein-inequality condition on the log likelihood ratio of the statistical experiment, (ii) a local entropy condition on the dimensionality of the models, and (iii) a sufficient mass condition on the second-step prior near the best approximating signal for each model. The first-step prior can be designed generically. Read More

The multivariate linear regression model with shuffled data and additive Gaussian noise arises in various correspondence estimation and matching problems. Focusing on the denoising aspect of this problem, we provide a characterization the minimax error rate that is sharp up to logarithmic factors. We also analyze the performance of two versions of a computationally efficient estimator, and establish their consistency for a large range of input parameters. Read More