Inference via low-dimensional couplings

Integration against an intractable probability measure is among the fundamental challenges of statistical inference, particularly in the Bayesian setting. A principled approach to this problem seeks a deterministic coupling of the measure of interest with a tractable "reference" measure (e.g., a standard Gaussian). This coupling is induced by a transport map, and enables direct simulation from the desired measure simply by evaluating the transport map at samples from the reference. Yet characterizing such a map---e.g., representing and evaluating it---grows challenging in high dimensions. The central contribution of this paper is to establish a link between the Markov properties of the target measure and the existence of certain low-dimensional couplings, induced by transport maps that are sparse or decomposable. Our analysis not only facilitates the construction of couplings in high-dimensional settings, but also suggests new inference methodologies. For instance, in the context of nonlinear and non-Gaussian state space models, we describe new online and single-pass variational algorithms that characterize the full posterior distribution of the sequential inference problem using operations only slightly more complex than regular filtering.
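The coupling idea above can be sketched in its simplest instance, where both the reference and target measures are Gaussian and the transport map is the linear lower-triangular (Cholesky) map; the target mean and covariance below are illustrative choices, not taken from the paper.

```python
import numpy as np

# Minimal sketch of a deterministic coupling: push samples from a
# standard Gaussian reference through a lower-triangular transport map
# T(z) = mu + L z, whose image is a correlated Gaussian target.
# The target mean and covariance are illustrative, not from the paper.
rng = np.random.default_rng(0)

mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])
L = np.linalg.cholesky(Sigma)            # lower triangular, Sigma = L @ L.T

def transport(z):
    """Triangular (Knothe-Rosenblatt-style) map from reference to target."""
    return mu + z @ L.T

z = rng.standard_normal((100_000, 2))    # samples from the reference
x = transport(z)                         # samples from the target, no MCMC needed
```

In this linear-Gaussian case the map is exact; the nonlinear, non-Gaussian settings the paper treats replace `transport` with a richer (e.g., sparse or decomposable triangular) parameterization.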

Similar Publications

Random-effects meta-analyses are very commonly used in medical statistics. Recent methodological developments include multivariate (multiple outcomes) and network (multiple treatments) meta-analysis. Here we provide a new model and corresponding estimation procedure for multivariate network meta-analysis, so that multiple outcomes and treatments can be included in a single analysis.

Many real datasets contain values missing not at random (MNAR). In this scenario, investigators often perform list-wise deletion, i.e., delete samples with any missing values, before applying causal discovery algorithms. List-wise deletion is a sound and general strategy when paired with algorithms such as FCI and RFCI, but the deletion procedure also eliminates otherwise good samples that contain only a few missing values.
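List-wise deletion as described above amounts to dropping every sample (row) that contains any missing value; a minimal sketch with a toy data matrix (the values are illustrative, not from the paper):

```python
import numpy as np

# List-wise deletion: drop every sample (row) containing any missing
# value before handing the data to a causal discovery algorithm such as
# FCI. The toy data matrix is illustrative, not from the paper.
data = np.array([
    [1.0,    0.5, 3.0],
    [2.0, np.nan, 1.0],   # deleted: a single missing value is enough
    [np.nan, 1.5, 2.0],   # deleted
    [4.0,    2.0, 0.0],
])
complete = data[~np.isnan(data).any(axis=1)]   # keeps only fully observed rows
```

The sketch makes the snippet's point concrete: two of the four rows are discarded even though each is missing only one value.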

This paper presents a detailed theoretical analysis of the Langevin Monte Carlo sampling algorithm recently introduced in Durmus et al. (Efficient Bayesian computation by proximal Markov chain Monte Carlo: when Langevin meets Moreau, 2016) when applied to log-concave probability distributions that are restricted to a convex body $\mathsf{K}$. This method relies on a regularisation procedure involving the Moreau-Yosida envelope of the indicator function associated with $\mathsf{K}$.
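As a rough illustration of the regularisation idea, here is a minimal unadjusted Langevin sketch that uses the gradient of the Moreau-Yosida envelope of the indicator of $\mathsf{K} = [0, \infty)$ for a standard Gaussian target; the step size, smoothing parameter, and chain length are illustrative choices, not taken from the paper.

```python
import numpy as np

# Moreau-Yosida-regularised unadjusted Langevin sampling for a standard
# Gaussian restricted to the convex body K = [0, inf). The gradient of
# the Moreau-Yosida envelope of the indicator of K is (x - proj_K(x))/lam,
# which penalises excursions outside K. All tuning parameters are
# illustrative choices, not the paper's.
rng = np.random.default_rng(2)

gamma, lam = 0.005, 0.01          # step size, Moreau-Yosida smoothing parameter
n_steps, burn = 200_000, 20_000

def proj_K(x):
    """Euclidean projection onto K = [0, inf)."""
    return x if x > 0.0 else 0.0

x, samples = 1.0, []
for t in range(n_steps):
    grad_U = x                           # gradient of -log N(0,1) density
    grad_MY = (x - proj_K(x)) / lam      # gradient of the M-Y envelope of the indicator
    x = x - gamma * (grad_U + grad_MY) + np.sqrt(2.0 * gamma) * rng.standard_normal()
    if t >= burn:
        samples.append(x)
samples = np.array(samples)
# The chain concentrates on K; its mean is near E[Z | Z > 0] for Z ~ N(0,1)
```

Smaller values of `lam` concentrate the smoothed target more sharply on $\mathsf{K}$, at the price of a stiffer drift that forces a smaller step size.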

Consider the partially linear model (PLM) with random design: $Y=X^T\beta^*+g(W)+u$, where $g(\cdot)$ is an unknown real-valued function, $X$ is $p$-dimensional, $W$ is one-dimensional, and $\beta^*$ is $s$-sparse. Our aim is to efficiently estimate $\beta^*$ based on $n$ i.i.d. samples.

We propose a novel continuous testing framework to test the intensities of Poisson processes. This framework allows a rigorous definition of the complete testing procedure, from an infinite number of hypotheses to joint error rates. Our work extends traditional procedures based on scanning windows by controlling the family-wise error rate and the false discovery rate in a non-asymptotic and continuous manner.

There has recently been a growing interest in the development of statistical methods to compare medical costs between treatment groups. When cumulative cost is the outcome of interest, right-censoring poses the challenge of informative missingness due to heterogeneity in the rates of cost accumulation across subjects. Existing approaches seeking to address the challenge of informative cost trajectories typically rely on inverse probability weighting and target a net "intent-to-treat" effect.

The varying-coefficient model is a powerful tool for modelling interactions in generalized regression. It is easy to apply if both the variables that are modified and the effect modifiers are known. However, in general one has a set of explanatory variables and it is unknown which variables are modified by which covariates.

The use of sparse precision (inverse covariance) matrices has become popular because they allow for efficient algorithms for joint inference in high-dimensional models. Many applications require the computation of certain elements of the covariance matrix, such as the marginal variances, which may be non-trivial to obtain when the dimension is large. This paper introduces a fast Rao-Blackwellized Monte Carlo sampling-based method for efficiently approximating selected elements of the covariance matrix.
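As a point of reference for the quantity being approximated, here is a plain (not Rao-Blackwellized) Monte Carlo sketch of the marginal variances, i.e., the diagonal of $Q^{-1}$, given a precision matrix $Q$; the tridiagonal structure and all sizes are illustrative, and the paper's estimator is a faster refinement of this baseline.

```python
import numpy as np

# Plain Monte Carlo baseline for selected elements of the covariance
# matrix (here the marginal variances, the diagonal of Q^{-1}) given a
# precision matrix Q. The tridiagonal Q and all sizes are illustrative;
# the paper's Rao-Blackwellized estimator refines this idea.
rng = np.random.default_rng(1)

n = 50
Q = (np.diag(np.full(n, 2.0))
     + np.diag(np.full(n - 1, -0.9), 1)
     + np.diag(np.full(n - 1, -0.9), -1))

L = np.linalg.cholesky(Q)                # Q = L @ L.T
z = rng.standard_normal((20_000, n))
x = np.linalg.solve(L.T, z.T).T          # x ~ N(0, Q^{-1}): Cov(x) = (L L^T)^{-1}

var_mc = x.var(axis=0)                   # Monte Carlo marginal variances
var_exact = np.diag(np.linalg.inv(Q))    # exact answer, feasible at this small n
```

The appeal of sampling-based estimators is that they need only sparse triangular solves with the Cholesky factor of $Q$, never the dense inverse, so they remain feasible when the dimension makes `np.linalg.inv(Q)` impractical.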

We study multiply robust (MR) estimators of the longitudinal g-computation formula of Robins (1986). In the first part of this paper we review and extend the recently proposed parametric multiply robust estimators of Tchetgen-Tchetgen (2009) and Molina, Rotnitzky, Sued and Robins (2017). In the second part of the paper we derive multiply and doubly robust estimators that use non-parametric machine-learning (ML) estimators of nuisance functions in lieu of parametric models.

Community detection is a fundamental unsupervised learning problem for unlabeled networks, with a broad range of applications. Typically, community detection algorithms assume that the number of clusters $r$ is known a priori. While provable algorithms for finding $r$ have recently garnered much attention from the theoretical statistics community, existing methods often make strong model assumptions about the separation between clusters or communities.