Inference via low-dimensional couplings

Integration against an intractable probability measure is among the fundamental challenges of statistical inference, particularly in the Bayesian setting. A principled approach to this problem seeks a deterministic coupling of the measure of interest with a tractable "reference" measure (e.g., a standard Gaussian). This coupling is induced by a transport map, and enables direct simulation from the desired measure simply by evaluating the transport map at samples from the reference. Yet characterizing such a map---e.g., representing and evaluating it---grows challenging in high dimensions. The central contribution of this paper is to establish a link between the Markov properties of the target measure and the existence of certain low-dimensional couplings, induced by transport maps that are sparse or decomposable. Our analysis not only facilitates the construction of couplings in high-dimensional settings, but also suggests new inference methodologies. For instance, in the context of nonlinear and non-Gaussian state space models, we describe new online and single-pass variational algorithms that characterize the full posterior distribution of the sequential inference problem using operations only slightly more complex than regular filtering.


Similar Publications

Thermodynamic integration (TI) for computing marginal likelihoods is based on an inverse annealing path from the prior to the posterior distribution. In many cases, the resulting estimator suffers from high variability, which particularly stems from the prior regime. When comparing complex models with differences in a comparatively small number of parameters, intrinsic errors from sampling fluctuations may outweigh the differences in the log marginal likelihood estimates. Read More


Sufficient dimension reduction (SDR) is continuing an active research field nowadays for high dimensional data. It aims to estimate the central subspace (CS) without making distributional assumption. To overcome the large-$p$-small-$n$ problem we propose a new approach for SDR. Read More


It is generally accepted that all models are wrong -- the difficulty is determining which are useful. Here, a useful model is considered as one that is capable of combining data and expert knowledge, through an inversion or calibration process, to adequately characterize the uncertainty in predictions of interest. This paper derives conditions that specify which simplified models are useful and how they should be calibrated. Read More


Variational inference methods for latent variable statistical models have gained popularity because they are relatively fast, can handle large data sets, and have deterministic convergence guarantees. However, in practice it is unclear whether the fixed point identified by the variational inference algorithm is a local or a global optimum. Here, we propose a method for constructing iterative optimization algorithms for variational inference problems that are guaranteed to converge to the $\epsilon$-global variational lower bound on the log-likelihood. Read More


The problems of computational data processing involving regression, interpolation, reconstruction and imputation for multidimensional big datasets are becoming more important these days, because of the availability of data and their widely spread usage in business, technological, scientific and other applications. The existing methods often have limitations, which either do not allow, or make it difficult to accomplish many data processing tasks. The problems usually relate to algorithm accuracy, applicability, performance (computational and algorithmic), demands for computational resources, both in terms of power and memory, and difficulty working with high dimensions. Read More


Conditional density estimation (density regression) estimates the distribution of a response variable y conditional on covariates x. Utilizing a partition model framework, a conditional density estimation method is proposed using logistic Gaussian processes. The partition is created using a Voronoi tessellation and is learned from the data using a reversible jump Markov chain Monte Carlo algorithm. Read More


The popularity of online surveys has increased the prominence of sampling weights in claims of representativeness. Yet, much uncertainty remains regarding how these weights should be employed in the analysis of survey experiments: Should they be used or ignored? If they are used, which estimators are preferred? We offer practical advice, rooted in the Neyman-Rubin model, for researchers producing and working with survey experimental data. We examine simple, efficient estimators (Horvitz-Thompson, H\`ajek, "double-H\`ajek", and post-stratification) for analyzing these data, along with formulae for biases and variances. Read More


Many application domains such as ecology or genomics have to deal with multivariate non Gaussian observations. A typical example is the joint observation of the respective abundances of a set of species in a series of sites, aiming to understand the co-variations between these species. The Gaussian setting provides a canonical way to model such dependencies, but does not apply in general. Read More


We describe a way to construct hypothesis tests and confidence intervals after having used the Lasso for feature selection, allowing the regularization parameter to be chosen via an estimate of prediction error. Our estimate of prediction error is a slight variation on cross-validation. Using this variation, we are able to describe an appropriate selection event for choosing a parameter by cross-validation. Read More


The stochastic block model is widely used for detecting community structures in network data. How to test the goodness-of-fit of the model is one of the fundamental problems and has gained growing interests in recent years. In this paper, we propose a new goodness-of-fit test based on the maximum entry of the centered and re-scaled observed adjacency matrix for the stochastic block model in which the number of communities can be allowed to grow linearly with the number of nodes ignoring a logarithm factor. Read More