Sourabh Bhattacharya - Bayesian and Interdisciplinary Research Unit, Indian Statistical Institute

Sourabh Bhattacharya

Contact Details

Sourabh Bhattacharya
Bayesian and Interdisciplinary Research Unit, Indian Statistical Institute

Pub Categories

Statistics - Methodology (13)
Statistics - Theory (12)
Mathematics - Statistics (12)
Statistics - Applications (9)
Mathematics - Optimization and Control (5)
Statistics - Computation (5)
Computer Science - Computer Science and Game Theory (2)
Computer Science - Information Theory (2)
Computer Science - Robotics (2)
Mathematics - Information Theory (2)
Cosmology and Nongalactic Astrophysics (1)
Computer Science - Cryptography and Security (1)
Solar and Stellar Astrophysics (1)
Computer Science - Computer Vision and Pattern Recognition (1)
Astrophysics of Galaxies (1)
Computer Science - Computational Geometry (1)

Publications Authored By Sourabh Bhattacharya

It is becoming increasingly clear that complex interactions among genes and environmental factors play crucial roles in triggering complex diseases. Thus, understanding such interactions is vital, which is possible only through statistical models that adequately account for such intricate, albeit unknown, dependence structures. Bhattacharya & Bhattacharya (2016b) attempt such modeling, relating finite mixtures composed of Dirichlet processes that represent an unknown number of genetic sub-populations through a hierarchical matrix-normal structure that incorporates gene-gene interactions, and possible mutations, induced by environmental variables.

In this article we derive the almost sure convergence theory of the Bayes factor in a general set-up that includes even dependent data and misspecified models, as a simple application of a result of Shalizi (2009) to a well-known identity satisfied by the Bayes factor.

In this paper, we investigate a pursuit-evasion game in which a mobile observer tries to track a target in an environment containing obstacles. We formulate the game as an optimal control problem with state inequality constraint in a simple environment. We show that for some initial conditions, there are two different regimes in the optimal strategy of the pursuer depending on whether the state-constraint is activated.

This paper presents an algorithm to deploy a team of "free" guards equipped with omni-directional cameras for tracking a bounded-speed intruder inside a simply-connected polygonal environment. The proposed algorithm partitions the environment into smaller polygons, and assigns a guard to each partition so that the intruder is visible to at least one guard at all times. Based on the concept of "dynamic zones" introduced in this paper, we propose event-triggered strategies for the guards to track the intruder.

We investigate a variation of the art gallery problem in which a team of mobile guards tries to track an unpredictable intruder in a simply-connected polygonal environment. In this work, we use the deployment strategy for diagonal guards originally proposed in [1]. The guards are confined to move along the diagonals of a polygon and the intruder can move freely within the environment.

Recently, Chandra and Bhattacharya (2016) proposed a novel and general Bayesian multiple comparison method such that the decision on any hypothesis depends upon the joint posterior probability of the hypotheses on which the current hypothesis is strongly dependent. Here we investigate the asymptotic properties of their methodology, establishing in particular rates of convergence to zero of several versions of Bayesian false discovery rate and Bayesian false non-discovery rate associated with the non-marginal approach, as the sample size tends to infinity. We also establish convergence properties of several other established multiple testing methods, each representing a class of methodologies, and show that the non-marginal method is as good as the existing ones in that the associated versions of Bayesian false non-discovery rate converge to zero at the same rate, compared to the other methods, when versions of Bayesian false discovery rate are asymptotically controlled.

Circular time series have received relatively little attention in statistics, and modeling complex circular time series using the state space approach is non-existent in the literature. In this article we introduce a flexible Bayesian nonparametric approach to state space modeling of observed circular time series where even the latent states are circular random variables. Crucially, we assume that the forms of both observational and evolutionary functions, both of which are circular in nature, are unknown and time-varying.

Delattre et al. (2013) considered n independent stochastic differential equations (SDEs), where in each case the drift term is modeled by a random effect times a known function free of parameters. The distribution of the random effects is assumed to depend upon unknown parameters which are to be learned.

Delattre et al. (2013) considered a system of stochastic differential equations (SDEs) in a random effects set-up. Under the independent and identical (iid) situation, and assuming normal distribution of the random effects, they established weak consistency and asymptotic normality of the maximum likelihood estimators (MLEs) of the population parameters of the random effects.

Cognizance of gene-environment interactions may help prevent or delay the onset of complex diseases like cardiovascular disease, cancer, type 2 diabetes, autism or asthma by adjustments to lifestyle. In this regard, we extend the Bayesian semiparametric gene-gene interaction model of Bhattacharya & Bhattacharya (2015) to include the possibility of influencing gene-gene interactions by environmental variables and possible mutations caused by the environment. Our model accounts for the unknown number of genetic sub-populations via finite mixtures composed of Dirichlet processes, which are related to each other through a hierarchical matrix normal structure responsible for inducing gene-gene interactions and possible mutations in association with environmental variables.

In the classical literature on infinite series there are various tests to determine if a given infinite series converges, diverges, or oscillates. But unfortunately, for very many infinite series all the existing tests can fail to provide definitive answers. In this article we propose a novel Bayesian theory for assessment of convergence properties of any given infinite series.

In this article we investigate consistency and asymptotic normality of the maximum likelihood estimator and the posterior distribution of the parameters in the context of state space stochastic differential equations (SDEs). We then extend our asymptotic theory to random effects models based on systems of state space SDEs, covering both independent and identical and independent but non-identical collections of state space SDEs. We also address asymptotic inference in the case of multidimensional linear random effects, and in situations where the data are available in discretized forms.

Research on asymptotic model selection in the context of stochastic differential equations (SDEs) is almost non-existent in the literature. In particular, when a collection of SDEs is considered, the problem of asymptotic model selection has not been hitherto investigated. Indeed, even though the diffusion coefficients may be considered known, questions on appropriate choice of the drift functions constitute a non-trivial model selection problem.

The problem of model selection in the context of a system of stochastic differential equations (SDEs) has not been touched upon in the literature. Indeed, properties of Bayes factors have not been studied even in single SDE based model comparison problems. In this article, we first develop an asymptotic theory of Bayes factors when two SDEs are compared, assuming the time domain expands.

Signal source seeking using autonomous vehicles is a complex problem. The complexity increases manifold when signal intensities captured by physical sensors onboard are noisy and unreliable. Added to the fact that signal strength decays with distance, noisy environments make it extremely difficult to describe and model a decay function.

Gene-gene interactions are often regarded as playing significant roles in influencing variabilities of complex traits. Although much research has been devoted to this area, to date a comprehensive statistical model that adequately addresses the highly dependent structures associated with the interactions between the genes, the multiple loci of every gene, and the various and unknown number of sub-populations that the subjects arise from seems to be lacking. In this paper, we propose and develop a novel Bayesian semiparametric approach composed of finite mixtures based on Dirichlet processes and a hierarchical matrix-normal distribution that can comprehensively account for the unknown number of sub-populations and gene-gene interactions.

The Random Walk Metropolis-Hastings (RWMH) algorithm is quite inefficient in high dimensions because of its abysmally low acceptance rate. The low acceptance rate results from the fact that RWMH separately updates each coordinate of the chain at every step. Dutta and Bhattacharya (2013) proposed a new technique called Transformation based Markov Chain Monte Carlo (TMCMC) aimed at overcoming these problems.

State space models are well-known for their versatility in modeling dynamic systems that arise in various scientific disciplines. Although parametric state space models are well studied, nonparametric approaches are much less explored in comparison. In this article we propose a novel Bayesian nonparametric approach to state space modeling assuming that both the observational and evolutionary functions are unknown and are varying with time; crucially, we assume that the unknown evolutionary equation describes dynamic evolution of some latent circular random variable.

Delattre et al. (2013) considered n independent stochastic differential equations (SDEs), where in each case the drift term is associated with a random effect, the distribution of which depends upon unknown parameters. Assuming the independent and identical (iid) situation, the authors provide independent proofs of weak consistency and asymptotic normality of the maximum likelihood estimators (MLEs) of the hyper-parameters of their random effects parameters.

Delattre et al. (2013) investigated asymptotic properties of the maximum likelihood estimator of the population parameters of the random effects associated with n independent stochastic differential equations (SDEs) assuming that the SDEs are independent and identical (iid). In this article, we consider the Bayesian approach to learning about the population parameters, and prove consistency and asymptotic normality of the corresponding posterior distribution in the iid set-up as well as when the SDEs are independent but non-identical.

The erroneous assumption "for all distributions for which the theoretical variance can be computed independently from parameters estimated by any method different from the method of moments" was used by Mooley (1973) when fitting the gamma distribution to rainfall data, and was subsequently followed by several researchers. We show that the asymptotic distribution of the test statistic is generally not even comparable to any central chi-square distribution. We also describe a method for checking the validity of the asymptotic distribution for a class of distributions.

Discrete time spatial time series data arise routinely in meteorological and environmental studies. Inference and prediction associated with them are mostly carried out using any of the several variants of the linear state space model that are collectively called linear dynamic spatio-temporal models (LDSTMs). However, real world environmental processes are highly complex and are seldom representable by models with such simple linear structure.

In this paper, using kernel convolution of the order based dependent Dirichlet process (Griffin & Steel (2006)), we construct a nonstationary, nonseparable, nonparametric space-time process, which, as we show, satisfies desirable properties, and includes the stationary, separable, parametric processes as special cases. We also investigate the smoothness properties of our proposed model. Since our model entails an infinite random series, for the purpose of Bayesian model fitting we must either truncate the series or, more appropriately, consider a random number of summands, which renders the model dimension a random variable.

Very recently, Transformation based Markov Chain Monte Carlo (TMCMC) was proposed by Dutta and Bhattacharya (2013) as a much more efficient alternative to the Metropolis-Hastings algorithm, in particular the Random Walk Metropolis (RWM) algorithm, especially in high dimensions. The main advantage of this algorithm is that it simultaneously updates all components of a high dimensional parameter by some appropriate deterministic transformation of a single random variable, thereby reducing time complexity and enhancing the acceptance rate. The optimal scaling of the additive TMCMC approach has already been studied for the Gaussian proposal density by Dey and Bhattacharya (2013).

In this article, we propose a novel and general dimension-hopping MCMC methodology that can update all the parameters as well as the number of parameters simultaneously using simple deterministic transformations of some low-dimensional (often one-dimensional) random variable. This methodology, which has been inspired by the recent Transformation based MCMC (TMCMC) for updating all the parameters simultaneously in general fixed-dimensional set-ups using low-dimensional random variables, facilitates great speed in terms of computation time and provides high acceptance rates, thanks to the low-dimensional random variables which effectively reduce the dimension dramatically. Quite importantly, our transformation based approach provides a natural way to automate the move-types in the variable dimensional problems.

We provide two novel adaptive-rate compressive sensing (CS) strategies for sparse, time-varying signals using side information. Our first method utilizes extra cross-validation measurements, and the second one exploits extra low-resolution measurements. Unlike the majority of current CS techniques, we do not assume that we know an upper bound on the number of significant coefficients that comprise the images in the video sequence.

Recently Dutta and Bhattacharya (2013) introduced a novel Markov Chain Monte Carlo methodology that can simultaneously update all the components of high dimensional parameters using simple deterministic transformations of a one-dimensional random variable drawn from any arbitrary distribution defined on a relevant support. The methodology, which the authors refer to as Transformation-based Markov Chain Monte Carlo (TMCMC), greatly enhances computational speed and acceptance rate in high-dimensional problems. Two significant transformations associated with TMCMC are additive and multiplicative transformations.

Pamminger and Frühwirth-Schnatter (2010) considered a Bayesian approach to model-based clustering of categorical time series assuming a fixed number of clusters. But the popular methods for selecting the number of clusters, for example, the Bayes Information Criterion (BIC), turned out to have severe problems in the categorical time series context. In this paper, we circumvent the difficulties of choosing the number of clusters by adopting the Bayesian semiparametric mixture model approach introduced by Bhattacharya (2008), who assumes that the number of clusters is a random quantity, but is bounded above by a (possibly large) number of clusters.

Fossil-based palaeoclimate reconstruction is an important area of ecological science that has gained momentum against the backdrop of the global climate change debate. The hierarchical Bayesian paradigm provides an interesting platform for studying such an important scientific issue. However, our cross-validation based assessment of the existing Bayesian hierarchical models with respect to two modern proxy data sets based on chironomid and pollen, respectively, revealed that the models are inadequate for the data sets.

In the multiple testing literature, either Bayesian or non-Bayesian, the decision rules are usually functions of the marginal probabilities of the corresponding individual hypotheses. However, in realistic situations, the hypotheses are usually dependent, and hence it is desirable that the decisions regarding the dependent hypotheses are taken jointly. In this article we develop a novel Bayesian multiple testing procedure that coherently takes this requirement into consideration.

We propose a method, based on the Hellinger distance, to estimate the location of the Sun in the disk of the Milky Way, and construct confidence sets on our estimate of the unknown location using a bootstrap-based method. Assuming the Galactic disk to be two-dimensional, the sought solar location then reduces to the radial distance separating the Sun from the Galactic center and the angular separation of the Galactic center to Sun line from a pre-fixed line on the disk. On astronomical scales, the unknown solar location is equivalent to the location of us earthlings who observe the velocities of a sample of stars in the neighborhood of the Sun.
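
For readers unfamiliar with the distance named in the abstract above, the following is a minimal sketch of the Hellinger distance between two discrete distributions (for example, normalized histograms). It is only an illustration of the standard definition; the function name `hellinger` and the histogram inputs are assumptions made here and are not taken from the paper, whose actual estimation procedure is not described in the truncated abstract.

```python
import numpy as np

def hellinger(p, q):
    """Hellinger distance between two discrete distributions.

    p and q are non-negative weights over the same bins; they are
    normalized to sum to one before the distance is computed.
    The result lies in [0, 1].
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    # H(p, q) = sqrt( 1/2 * sum_i (sqrt(p_i) - sqrt(q_i))^2 )
    return float(np.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2)))

# Hypothetical usage: compare two histograms over the same bins.
observed = [10, 30, 40, 20]
simulated = [12, 28, 35, 25]
distance = hellinger(observed, simulated)
```

A distance of 0 means the two normalized histograms coincide, and a distance of 1 means they have disjoint support.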

We consider the recently introduced Transformation-based Markov Chain Monte Carlo (TMCMC) (Dutta and Bhattacharya (2014)), a methodology that is designed to update all the parameters simultaneously using some simple deterministic transformation of a one-dimensional random variable drawn from some arbitrary distribution on a relevant support. The additive transformation based TMCMC is similar in spirit to random walk Metropolis, except that, unlike the latter, additive TMCMC uses a single draw from a one-dimensional proposal distribution to update the high-dimensional parameter. In this paper, we first provide a brief tutorial on TMCMC, exploring its connections and contrasts with various available MCMC methods.
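
The additive update described in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the single positive draw is the absolute value of a normal variate, that each coordinate moves forward or backward with probability 1/2, and that all coordinates share one scale. Under those assumptions the proposal is symmetric, so the acceptance probability reduces to the ratio of target densities. The names `additive_tmcmc`, `log_target`, and `scale` are hypothetical.

```python
import numpy as np

def additive_tmcmc(log_target, x0, n_iter, scale=1.0, seed=0):
    """Sketch of an additive transformation-based sampler.

    At each step ONE positive scalar eps is drawn, and every coordinate
    is moved by +eps or -eps (signs chosen independently with
    probability 1/2). Returns the chain and the acceptance rate.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    d = x.size
    samples = np.empty((n_iter, d))
    log_p = log_target(x)
    accepted = 0
    for t in range(n_iter):
        eps = abs(rng.normal(0.0, scale))        # single one-dimensional draw
        signs = rng.choice([-1.0, 1.0], size=d)  # direction for each coordinate
        proposal = x + signs * eps               # additive transformation
        log_p_new = log_target(proposal)
        # Symmetric proposal: accept with probability min(1, pi(x') / pi(x)).
        if np.log(rng.uniform()) < log_p_new - log_p:
            x, log_p = proposal, log_p_new
            accepted += 1
        samples[t] = x
    return samples, accepted / n_iter

def log_std_normal(x):
    """Log density (up to a constant) of a standard normal target."""
    return -0.5 * float(np.dot(x, x))

# Hypothetical usage on a 5-dimensional standard normal target.
chain, acc_rate = additive_tmcmc(log_std_normal, np.zeros(5), 5000, scale=1.0, seed=1)
```

Note that every accepted move changes all coordinates by the same absolute amount, which is what distinguishes this update from a random walk proposal that draws a separate increment for each coordinate.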

This is a supplement to the article "Markov Chain Monte Carlo Based on Deterministic Transformations" available at Read More

In this paper we develop an inverse Bayesian approach to find the value of the unknown model parameter vector that supports the real (or test) data, where the data comprises measurements of a matrix-variate variable. The method is illustrated via the estimation of the unknown Milky Way feature parameter vector, using available test and simulated (training) stellar velocity data matrices. The data is represented as an unknown function of the model parameters, where this high-dimensional function is modelled using a high-dimensional Gaussian Process (${\cal GP}$).

Mixture models are well-known for their versatility, and the Bayesian paradigm is a suitable platform for mixture analysis, particularly when the number of components is unknown. Bhattacharya (2008) introduced a mixture model based on the Dirichlet process, where an upper bound on the unknown number of components is to be specified. Here we consider a Bayesian asymptotic framework for objectively specifying the upper bound, which we assume to depend on the sample size.

We consider the problem of assessing goodness of fit of a single Bayesian model to the observed data in the inverse problem context. A novel procedure of goodness of fit test is proposed, based on construction of reference distributions using the 'inverse' part of the given model. This is motivated by an example from palaeoclimatology in which it is of interest to reconstruct past climates using information obtained from fossils deposited in lake sediment.

Landscape classification of the well-known biodiversity hotspot, Western Ghats (mountains), on the west coast of India, is an important part of a world-wide program of monitoring biodiversity. To this end, a massive vegetation data set, consisting of 51,834 4-variate observations has been clustered into different landscapes by Nagendra and Gadgil [Current Sci. 75 (1998) 264-271].

In this paper, we address the problem of stabilization in continuous time linear dynamical systems using state feedback when compressive sampling techniques are used for state measurement and reconstruction. In [5], we had introduced the concept of using l1 reconstruction technique, commonly used in sparse data reconstruction, for state measurement and estimation in a discrete time linear system. In this work, we extend the previous scenario to analyse continuous time linear systems.

We introduce state-space models where the functionals of the observational and the evolutionary equations are unknown, and treated as random functions evolving with time. Thus, our model is nonparametric and generalizes the traditional parametric state-space models. This random function approach also frees us from the restrictive assumption that the functional forms, although time-dependent, are of fixed forms.

Effective connectivity analysis provides an understanding of the functional organization of the brain by studying how activated regions influence one another. We propose a nonparametric Bayesian approach to model effective connectivity assuming a dynamic nonstationary neuronal system. Our approach uses the Dirichlet process to specify an appropriate (most plausible according to our prior beliefs) dynamic model as the "expectation" of a set of plausible models upon which we assign a probability distribution.

In this article we propose a novel MCMC method based on deterministic transformations T: X × D → X where X is the state-space and D is some set which may or may not be a subset of X. We refer to our new methodology as Transformation-based Markov chain Monte Carlo (TMCMC). One of the remarkable advantages of our proposal is that even if the underlying target distribution is very high-dimensional, deterministic transformation of a one-dimensional random variable is sufficient to generate an appropriate Markov chain that is guaranteed to converge to the high-dimensional target distribution.

Affiliations: (1) Bayesian and Interdisciplinary Research Unit, Indian Statistical Institute, (2) Bayesian and Interdisciplinary Research Unit, Indian Statistical Institute

We propose and develop a novel and effective perfect sampling methodology for simulating from posteriors corresponding to mixtures with either known (fixed) or unknown number of components. For the latter we consider the Dirichlet process-based mixture model developed by these authors, and show that our ideas are applicable to conjugate, and importantly, to non-conjugate cases. As is to be expected, and as we show, perfect sampling for mixtures with known number of components can be achieved with much less effort with a simplified version of our general methodology, whether conjugate or non-conjugate priors are used.

In this work, we study the problem of power allocation and adaptive modulation in teams of decision makers. We consider the special case of two teams with each team consisting of two mobile agents. Agents belonging to the same team communicate over wireless ad hoc networks, and they try to split their available power between the tasks of communication and jamming the nodes of the other team.

In this work, we study the problem of power allocation in teams. Each team consists of two agents who try to split their available power between the tasks of communication and jamming the nodes of the other team. The agents have constraints on their total energy and instantaneous power usage.

Recent technological advances have led to a flood of new data on cosmology rich in information about the formation and evolution of the universe, e.g., the data collected in the Sloan Digital Sky Survey (SDSS) for more than 200 million objects.