Rina Foygel Barber

Rina Foygel Barber
Are you Rina Foygel Barber?

Claim your profile, edit publications, add additional information:

Contact Details

Name
Rina Foygel Barber
Affiliation
Location

Pubs By Year

Pub Categories

 
Statistics - Methodology (10)
 
Mathematics - Statistics (9)
 
Statistics - Theory (9)
 
Statistics - Machine Learning (2)
 
Mathematics - Optimization and Control (2)
 
Mathematics - Information Theory (1)
 
Computer Science - Information Theory (1)
 
Physics - Medical Physics (1)
 
Computer Science - Learning (1)

Publications Authored By Rina Foygel Barber

Many problems in high-dimensional statistics and optimization involve minimization over nonconvex constraints-for instance, a rank constraint for a matrix estimation problem-but little is known about the theoretical properties of such optimization problems for a general nonconvex constraint set. In this paper we study the interplay between the geometric properties of the constraint set and the convergence behavior of gradient descent for minimization over this set. We develop the notion of local concavity coefficients of the constraint set, measuring the extent to which convexity is violated, which govern the behavior of projected gradient descent over this set. Read More

A significant literature has arisen to study ways to employing prior knowledge to improve power and precision of multiple testing procedures. Some common forms of prior knowledge may include (a) a priori beliefs about which hypotheses are null, modeled by non-uniform prior weights; (b) differing importances of hypotheses, modeled by differing penalties for false discoveries; (c) partitions of the hypotheses into known groups, indicating (dis)similarity of hypotheses; and (d) knowledge of independence, positive dependence or arbitrary dependence between hypotheses or groups, allowing for more aggressive or conservative procedures. We present a general framework for global null testing and false discovery rate (FDR) control that allows the scientist to incorporate all four types of prior knowledge (a)-(d) simultaneously. Read More

In regression, conformal prediction is a general methodology to construct prediction intervals in a distribution-free manner. Although conformal prediction guarantees strong statistical property for predictive inference, its inherent computational challenge has attracted the attention of researchers in the community. In this paper, we propose a new framework, called Trimmed Conformal Prediction (TCP), based on two stage procedure, a trimming step and a prediction step. Read More

We present a new methodology for simultaneous variable selection and parameter estimation in function-on-scalar regression with an ultra-high dimensional predictor vector. We extend the LASSO to functional data in both the $\textit{dense}$ functional setting and the $\textit{sparse}$ functional setting. We provide theoretical guarantees which allow for an exponential number of predictor variables. Read More

We develop tools for selective inference in the setting of group sparsity, including the construction of confidence intervals and p-values for testing selected groups of variables. Our main technical result gives the precise distribution of the magnitude of the projection of the data onto a given subspace, and enables us to develop inference procedures for a broad class of group-sparse selection methods, including the group lasso, iterative hard thresholding, and forward stepwise regression. We give numerical results to illustrate these tools on simulated data and on health record data. Read More

In multiple testing problems, where a large number of hypotheses are tested simultaneously, false discovery rate (FDR) control can be achieved with the well-known Benjamini-Hochberg procedure, which adapts to the amount of signal present in the data. Many modifications of this procedure have been proposed to improve power in scenarios where the hypotheses are organized into groups or into a hierarchy, as well as other structured settings. Here we introduce SABHA, the "structure-adaptive Benjamini-Hochberg algorithm", as a generalization of these adaptive testing methods. Read More

This paper develops a framework for testing for associations in a possibly high-dimensional linear model where the number of features/variables may far exceed the number of observational units. In this framework, the observations are split into two groups, where the first group is used to screen for a set of potentially relevant variables, whereas the second is used for inference over this reduced set of variables; we also develop strategies for leveraging information from the first part of the data at the inference step for greater accuracy. In our work, the inferential step is carried out by applying the recently introduced knockoff filter, which creates a knockoff copy-a fake variable serving as a control-for each screened variable. Read More

We propose the group knockoff filter, a method for false discovery rate control in a linear regression setting where the features are grouped, and we would like to select a set of relevant groups which have a nonzero effect on the response. By considering the set of true and false discoveries at the group level, this method gains power relative to sparse regression methods. We also apply our method to the multitask regression problem where multiple response variables share similar sparsity patterns across the set of possible features. Read More

In many practical applications of multiple hypothesis testing using the False Discovery Rate (FDR), the given hypotheses can be naturally partitioned into groups, and one may not only want to control the number of false discoveries (wrongly rejected null hypotheses), but also the number of falsely discovered groups of hypotheses (we say a group is falsely discovered if at least one hypothesis within that group is rejected, when in reality the group contains only nulls). In this paper, we introduce the p-filter, a procedure which unifies and generalizes the standard FDR procedure by Benjamini and Hochberg and global null testing procedure by Simes. We first prove that our proposed method can simultaneously control the overall FDR at the finest level (individual hypotheses treated separately) and the group FDR at coarser levels (when such groups are user-specified). Read More

We develop a primal-dual algorithm that allows for one-step inversion of spectral CT transmission photon counts data to a basis map decomposition. The algorithm allows for image constraints to be enforced on the basis maps during the inversion. The derivation of the algorithm makes use of a local upper bounding quadratic approximation to generate descent steps for non-convex spectral CT data discrepancy terms, combined with a new convex-concave optimization algorithm. Read More

Many optimization problems arising in high-dimensional statistics decompose naturally into a sum of several terms, where the individual terms are relatively simple but the composite objective function can only be optimized with iterative algorithms. In this paper, we are interested in optimization problems of the form F(Kx) + G(x), where K is a fixed linear transformation, while F and G are functions that may be nonconvex and/or nondifferentiable. In particular, if either of the terms are nonconvex, existing alternating minimization techniques may fail to converge; other types of existing approaches may instead be unable to handle nondifferentiability. Read More

Multiple testing problems arising in modern scientific applications can involve simultaneously testing thousands or even millions of hypotheses, with relatively few true signals. In this paper, we consider the multiple testing problem where prior information is available (for instance, from an earlier study under different experimental conditions), that can allow us to test the hypotheses as a ranked list in order to increase the number of discoveries. Given an ordered list of n hypotheses, the aim is to select a data-dependent cutoff k and declare the first k hypotheses to be statistically significant while bounding the false discovery rate (FDR). Read More

Consider the following three important problems in statistical inference, namely, constructing confidence intervals for (1) the error of a high-dimensional ($p>n$) regression estimator, (2) the linear regression noise level, and (3) the genetic signal-to-noise ratio of a continuous-valued trait (related to the heritability). All three problems turn out to be closely related to the little-studied problem of performing inference on the $\ell_2$-norm of the signal in high-dimensional linear regression. We derive a novel procedure for this, which is asymptotically correct when the covariates are multivariate Gaussian and produces valid confidence intervals in finite samples as well. Read More

We consider Bayesian variable selection in sparse high-dimensional regression, where the number of covariates $p$ may be large relative to the samples size $n$, but at most a moderate number $q$ of covariates are active. Specifically, we treat generalized linear models. For a single fixed sparse model with well-behaved prior distribution, classical theory proves that the Laplace approximation to the marginal likelihood of the model is accurate for sufficiently large sample size $n$. Read More

Undirected graphical models are used extensively in the biological and social sciences to encode a pattern of conditional independences between variables, where the absence of an edge between two nodes $a$ and $b$ indicates that the corresponding two variables $X_a$ and $X_b$ are believed to be conditionally independent, after controlling for all other measured variables. In the Gaussian case, conditional independence corresponds to a zero entry in the precision matrix $\Omega$ (the inverse of the covariance matrix $\Sigma$). Real data often exhibits heavy tail dependence between variables, which cannot be captured by the commonly-used Gaussian or nonparanormal (Gaussian copula) graphical models. Read More

We explore and compare a variety of definitions for privacy and disclosure limitation in statistical estimation and data analysis, including (approximate) differential privacy, testing-based definitions of privacy, and posterior guarantees on disclosure risk. We give equivalence results between the definitions, shedding light on the relationships between different formalisms for privacy. We also take an inferential perspective, where---building off of these definitions---we provide minimax risk bounds for several estimation problems, including mean estimation, estimation of the support of a distribution, and nonparametric density estimation. Read More

Sparse Gaussian graphical models characterize sparse dependence relationships between random variables in a network. To estimate multiple related Gaussian graphical models on the same set of variables, we formulate a hierarchical model, which leads to an optimization problem with a nonconvex log-shift penalty function. We show that under mild conditions the optimization problem is convex despite the inclusion of a nonconvex penalty, and derive an efficient optimization algorithm. Read More

In many fields of science, we observe a response variable together with a large number of potential explanatory variables, and would like to be able to discover which variables are truly associated with the response. At the same time, we need to know that the false discovery rate (FDR) - the expected fraction of false discoveries among all discoveries - is not too high, in order to assure the scientist that most of the discoveries are indeed true and replicable. This paper introduces the knockoff filter, a new variable selection procedure controlling the FDR in the statistical linear model whenever there are at least as many observations as variables. Read More

We consider the use of Bayesian information criteria for selection of the graph underlying an Ising model. In an Ising model, the full conditional distributions of each variable form logistic regression models, and variable selection techniques for regression allow one to identify the neighborhood of each node and, thus, the entire graph. We prove high-dimensional consistency results for this pseudo-likelihood approach to graph selection when using Bayesian information criteria for the variable selection problems in the logistic regressions. Read More