Transport map accelerated Markov chain Monte Carlo

We introduce a new framework for efficient sampling from complex probability distributions, using a combination of optimal transport maps and the Metropolis-Hastings rule. The core idea is to use continuous transportation to transform typical Metropolis proposal mechanisms (e.g., random walks, Langevin methods) into non-Gaussian proposal distributions that can more effectively explore the target density. Our approach adaptively constructs a lower triangular transport map (an approximation of the Knothe-Rosenblatt rearrangement) using information from previous MCMC states, via the solution of an optimization problem. This optimization problem is convex regardless of the form of the target distribution. It is solved efficiently using a Newton method that requires no gradient information from the target probability distribution; the target distribution is instead represented via samples. Sequential updates enable efficient and parallelizable adaptation of the map even for large numbers of samples. We show that, even with inexact or truncated maps, this approach yields an adaptive MCMC algorithm that is ergodic for the exact target distribution. Numerical demonstrations on a range of parameter inference problems show order-of-magnitude speedups over standard MCMC techniques, measured by the number of effectively independent samples produced per target density evaluation and per unit of wall-clock time.
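The proposal mechanism described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes an affine lower-triangular map fitted once from pilot samples via a Cholesky factor of the sample covariance (the simplest member of the triangular family), whereas the abstract describes richer maps that are updated adaptively. The target density, function names, step sizes, and chain lengths are all illustrative assumptions.

```python
import numpy as np


def log_target(theta):
    """Unnormalized log-density of a strongly correlated 2-D Gaussian (illustrative target)."""
    cov = np.array([[1.0, 0.95], [0.95, 1.0]])
    return -0.5 * theta @ np.linalg.solve(cov, theta)


def fit_linear_triangular_map(samples):
    """Fit T(theta) = L^{-1} (theta - m) from samples, with sample covariance = L L^T.

    Assumption: an affine map only. L is lower triangular, so T is too.
    """
    m = samples.mean(axis=0)
    L = np.linalg.cholesky(np.cov(samples, rowvar=False))
    return m, L


def to_reference(theta, m, L):
    """Apply T: target space -> reference space."""
    return np.linalg.solve(L, theta - m)


def from_reference(r, m, L):
    """Apply T^{-1}: reference space -> target space."""
    return m + L @ r


def map_mh(log_pi, theta0, m, L, n_steps, step, rng):
    """Random-walk Metropolis in reference space, mapped back through T^{-1}."""
    theta = np.asarray(theta0, dtype=float)
    lp = log_pi(theta)
    # log|det dT^{-1}/dr|; constant for an affine map, so it cancels below.
    # A nonlinear map would need it evaluated at both r and r_prop.
    log_det = np.sum(np.log(np.diag(L)))
    chain = np.empty((n_steps, theta.size))
    accepted = 0
    for k in range(n_steps):
        r = to_reference(theta, m, L)
        r_prop = r + step * rng.standard_normal(r.size)  # symmetric proposal
        theta_prop = from_reference(r_prop, m, L)
        lp_prop = log_pi(theta_prop)
        log_alpha = (lp_prop + log_det) - (lp + log_det)
        if np.log(rng.uniform()) < log_alpha:
            theta, lp = theta_prop, lp_prop
            accepted += 1
        chain[k] = theta
    return chain, accepted / n_steps


rng = np.random.default_rng(0)
d = 2
# Pilot run with the identity map (plain random-walk Metropolis).
pilot, _ = map_mh(log_target, np.zeros(d), np.zeros(d), np.eye(d), 2000, 0.5, rng)
# Fit the map from pilot samples, then sample with the map-induced proposal.
m, L = fit_linear_triangular_map(pilot)
chain, acc_rate = map_mh(log_target, pilot[-1], m, L, 5000, 1.0, rng)
```

Because the map is held fixed after the pilot run, the second chain is an ordinary Metropolis-Hastings chain for the same target. For an affine map the Jacobian terms cancel; they are kept in the code only to show where a nonlinear map would enter the acceptance ratio.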

Similar Publications

Convex sparsity-promoting regularizations are ubiquitous in modern statistical learning. By construction, they yield solutions with few non-zero coefficients, which correspond to saturated constraints in the dual optimization formulation. Working set (WS) strategies are generic optimization techniques that consist in solving simpler problems that only consider a subset of constraints, whose indices form the WS.

In many modern settings, data are acquired iteratively over time, rather than all at once. Such settings are known as online, as opposed to offline or batch. We introduce a simple technique for online parameter estimation, which can operate in low memory settings, settings where data are correlated, and only requires a single inspection of the available data at each time period.

This article proposes a new graphical tool, the magnitude-shape (MS) plot, for visualizing both the magnitude and shape outlyingness of multivariate functional data. The proposed tool builds on the recent notion of functional directional outlyingness, which measures the centrality of functional data by simultaneously considering the level and the direction of their deviation from the central region. The MS-plot intuitively presents not only levels but also directions of magnitude outlyingness on the horizontal axis or plane, and demonstrates shape outlyingness on the vertical axis.

Kernel quadratures and other kernel-based approximation methods typically suffer from prohibitive cubic time and quadratic space complexity in the number of function evaluations. The problem arises because a system of linear equations needs to be solved. In this article we show that the weights of a kernel quadrature rule can be computed efficiently and exactly for up to tens of millions of nodes if the kernel, integration domain, and measure are fully symmetric and the node set is a union of fully symmetric sets.

nimble is an R package for constructing algorithms and conducting inference on hierarchical models. The nimble package provides a unique combination of flexible model specification and the ability to program model-generic algorithms -- specifically, the package allows users to code models in the BUGS language, and it allows users to write algorithms that can be applied to any appropriately-specified BUGS model. In this paper, we introduce nimble's capabilities for state-space model analysis using Sequential Monte Carlo (SMC) techniques.

Integration against an intractable probability measure is among the fundamental challenges of statistical inference, particularly in the Bayesian setting. A principled approach to this problem seeks a deterministic coupling of the measure of interest with a tractable "reference" measure (e.g.

We study the convergence properties of the Gibbs Sampler in the context of posterior distributions arising from Bayesian analysis of Gaussian hierarchical models. We consider centred and non-centred parameterizations as well as their hybrids including the full family of partially non-centred parameterizations. We develop a novel methodology based on multi-grid decompositions to derive analytic expressions for the convergence rates of the algorithm for an arbitrary number of layers in the hierarchy, while previous work was typically limited to the two-level case.

The marginal likelihood plays an important role in many areas of Bayesian statistics such as parameter estimation, model comparison, and model averaging. In most applications, however, the marginal likelihood is not analytically tractable and must be approximated using numerical methods. Here we provide a tutorial on bridge sampling (Bennett, 1976; Meng & Wong, 1996), a reliable and relatively straightforward sampling method that allows researchers to obtain the marginal likelihood for models of varying complexity.

This study presents an innovative method for reducing the number of rating scale items without predictability loss. The "area under the receiver operator curve method" (AUC ROC) is implemented in the RatingScaleReduction package posted on CRAN. Several cases have been used to illustrate how the stepwise method has reduced the number of rating scale items (variables).

Bayesian optimal experimental design has immense potential to inform the collection of data, so as to subsequently enhance our understanding of a variety of processes. However, a major impediment is the difficulty in evaluating optimal designs for problems with large, or high-dimensional, design spaces. We propose an efficient search heuristic suitable for general optimisation problems, with a particular focus on optimal Bayesian experimental design problems.