# Zhaoran Wang

## Contact Details

Name: Zhaoran Wang

## Pub Categories

- Statistics - Machine Learning (12)
- Mathematics - Optimization and Control (3)
- Computer Science - Learning (3)
- Computer Science - Information Theory (2)
- Mathematics - Information Theory (2)
- Statistics - Methodology (1)

## Publications Authored By Zhaoran Wang

We propose a general theory for studying the geometry of nonconvex objective functions with underlying symmetric structures. Specifically, we characterize the locations of stationary points and the null space of the associated Hessian matrices through the lens of invariant groups. As a major motivating example, we apply the proposed general theory to characterize the global geometry of the low-rank matrix factorization problem.
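
The symmetry mentioned in this abstract can be illustrated numerically. The sketch below assumes the symmetric factorization objective $\|XX^\top - M\|_F^2$ (a form chosen here for illustration, not quoted from the abstract) and checks that the loss is unchanged when $X$ is multiplied by an orthogonal matrix $R$. The function names are hypothetical.

```python
import numpy as np

def factorization_loss(X, M):
    """Squared Frobenius loss ||X X^T - M||_F^2 (assumed objective form)."""
    return np.linalg.norm(X @ X.T - M, "fro") ** 2

def random_rotation(r, rng):
    """Draw an r x r orthogonal matrix from the QR factorization of a Gaussian matrix."""
    Q, _ = np.linalg.qr(rng.standard_normal((r, r)))
    return Q

rng = np.random.default_rng(0)
n, r = 6, 2
X_star = rng.standard_normal((n, r))
M = X_star @ X_star.T             # rank-r target matrix
X = rng.standard_normal((n, r))   # an arbitrary point in parameter space
R = random_rotation(r, rng)

# Because (X R)(X R)^T = X R R^T X^T = X X^T, the loss cannot tell X from X R.
loss_before = factorization_loss(X, M)
loss_after = factorization_loss(X @ R, M)
print(loss_before, loss_after)
```

Under this assumption every minimizer $X^*$ comes with a whole orbit $\{X^* R\}$ of minimizers, which is why the Hessian at such points has a nontrivial null space.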

We consider the estimation and inference of sparse graphical models that characterize the dependency structure of high-dimensional tensor-valued data. To facilitate the estimation of the precision matrix corresponding to each way of the tensor, we assume the data follow a tensor normal distribution whose covariance has a Kronecker product structure. A critical challenge in the estimation and inference of this model is that its penalized maximum likelihood estimation involves minimizing a non-convex objective function.

We study a stochastic and distributed algorithm for nonconvex problems whose objective consists of a sum of $N$ nonconvex $L_i/N$-smooth functions, plus a nonsmooth regularizer. The proposed NonconvEx primal-dual SpliTTing (NESTT) algorithm splits the problem into $N$ subproblems, and utilizes an augmented Lagrangian based primal-dual scheme to solve it in a distributed and stochastic manner. With a special non-uniform sampling, a version of NESTT achieves an $\epsilon$-stationary solution using $\mathcal{O}((\sum_{i=1}^N\sqrt{L_i/N})^2/\epsilon)$ gradient evaluations, which can be up to $\mathcal{O}(N)$ times better than the (proximal) gradient descent methods.
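
To see where a factor of $N$ could come from, the sketch below compares the constant $(\sum_i \sqrt{L_i/N})^2$ from the stated bound with a baseline constant $\sum_i L_i$. The baseline is an assumption made here, not a figure from the abstract: it supposes that gradient descent takes on the order of $(\sum_i L_i/N)/\epsilon$ iterations with $N$ gradient evaluations each. Function names are hypothetical.

```python
import math

def nestt_constant(L):
    """Constant in the stated bound: (sum_i sqrt(L_i / N))^2."""
    N = len(L)
    return sum(math.sqrt(Li / N) for Li in L) ** 2

def gradient_descent_constant(L):
    """Assumed baseline: N evaluations per iteration times (sum_i L_i / N) iterations."""
    return sum(L)

def speedup(L):
    """Ratio of the assumed baseline constant to the NESTT constant."""
    return gradient_descent_constant(L) / nestt_constant(L)

# Equal smoothness constants: no gain over the baseline.
print(speedup([1.0, 1.0, 1.0, 1.0]))
# One dominant component: the ratio reaches N.
print(speedup([1.0, 0.0, 0.0, 0.0]))
```

By the Cauchy-Schwarz inequality the NESTT constant never exceeds $\sum_i L_i$, and the gap is largest when the $L_i$ are very uneven.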

The sparse generalized eigenvalue problem plays a pivotal role in a large family of high-dimensional learning tasks, including sparse Fisher's discriminant analysis, canonical correlation analysis, and sufficient dimension reduction. However, the theory of the sparse generalized eigenvalue problem remains largely unexplored. In this paper, we exploit a non-convex optimization perspective to study this problem.

We study the fundamental tradeoffs between computational tractability and statistical accuracy for a general family of hypothesis testing problems with combinatorial structures. Based upon an oracle model of computation, which captures the interactions between algorithms and data, we establish a general lower bound that explicitly connects the minimum testing risk under computational budget constraints with the intrinsic probabilistic and combinatorial structures of statistical problems. This lower bound mirrors the classical statistical lower bound by Le Cam (1986) and allows us to quantify the optimal statistical performance achievable given limited computational budgets in a systematic fashion.

We study parameter estimation and asymptotic inference for sparse nonlinear regression. More specifically, we assume the data are given by $y = f( x^\top \beta^* ) + \epsilon$, where $f$ is nonlinear. To recover $\beta^*$, we propose an $\ell_1$-regularized least-squares estimator.
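
An $\ell_1$-regularized least-squares fit for the model $y = f(x^\top \beta^*) + \epsilon$ can be sketched with proximal gradient descent. The solver, the choice $f(z) = z + \tanh(z)$, the step size, and all function names below are illustrative assumptions, not the authors' algorithm. The sketch also assumes $f$ is known and differentiable.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1: shrink each entry toward zero by t."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def l1_nonlinear_least_squares(X, y, f, f_prime, lam, step=0.05, n_iter=500):
    """Proximal gradient on (1/2n) * sum_i (y_i - f(x_i^T beta))^2 + lam * ||beta||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_iter):
        z = X @ beta
        # Chain rule: gradient of the smooth part with respect to beta.
        grad = X.T @ ((f(z) - y) * f_prime(z)) / n
        beta = soft_threshold(beta - step * grad, step * lam)
    return beta

def f(z):
    """Assumed link function for the demo (monotone, nonlinear)."""
    return z + np.tanh(z)

def f_prime(z):
    """Derivative of f: 1 + (1 - tanh(z)^2)."""
    return 2.0 - np.tanh(z) ** 2

rng = np.random.default_rng(0)
n, p = 200, 20
beta_star = np.zeros(p)
beta_star[:3] = [1.0, -1.0, 0.5]   # sparse ground truth (synthetic)
X = rng.standard_normal((n, p))
y = f(X @ beta_star) + 0.1 * rng.standard_normal(n)

beta_hat = l1_nonlinear_least_squares(X, y, f, f_prime, lam=0.05)
print(np.round(beta_hat, 2))
```

Because the objective is nonconvex in general, a fixed-step scheme like this one is only guaranteed to approach a stationary point, so the step size and the initial point may need tuning for other choices of $f$.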

Linear regression studies the problem of estimating a model parameter $\beta^* \in \mathbb{R}^p$ from $n$ observations $\{(y_i,\mathbf{x}_i)\}_{i=1}^n$ drawn from the linear model $y_i = \langle \mathbf{x}_i,\beta^* \rangle + \epsilon_i$. We consider a significant generalization in which the relationship between $\langle \mathbf{x}_i,\beta^* \rangle$ and $y_i$ is noisy, quantized to a single bit, potentially nonlinear, noninvertible, as well as unknown. This model is known as the single-index model in statistics, and, among other things, it represents a significant generalization of one-bit compressed sensing.

Many high dimensional sparse learning problems are formulated as nonconvex optimization. A popular approach to solving these nonconvex optimization problems is through convex relaxations such as linear and semidefinite programming. In this paper, we study the statistical limits of convex relaxations.

We provide a general theory of the expectation-maximization (EM) algorithm for inferring high dimensional latent variable models. In particular, we make two contributions: (i) For parameter estimation, we propose a novel high dimensional EM algorithm which naturally incorporates sparsity structure into parameter estimation. With an appropriate initialization, this algorithm converges at a geometric rate and attains an estimator with the (near-)optimal statistical rate of convergence.

Sparse principal component analysis (PCA) involves nonconvex optimization for which the global solution is hard to obtain. To address this issue, one popular approach is convex relaxation. However, such an approach may produce suboptimal estimators due to the relaxation effect.

We study sparse principal component analysis for high dimensional vector autoregressive time series under a doubly asymptotic framework, which allows the dimension $d$ to scale with the series length $T$. We treat the transition matrix of the time series as a nuisance parameter and directly apply sparse principal component analysis to the multivariate time series as if the data were independent. We provide explicit non-asymptotic rates of convergence for leading eigenvector estimation and extend this result to principal subspace estimation.

We provide a theoretical analysis of the statistical and computational properties of penalized $M$-estimators that can be formulated as the solution to a possibly nonconvex optimization problem. Many important estimators fall in this category, including least squares regression with nonconvex regularization, generalized linear models with nonconvex regularization, and sparse elliptical random design regression. For these problems, it is intractable to calculate the global solution due to the nonconvex formulation.