The use of spatial information in entropy measures

The concept of entropy, first introduced in information theory, rapidly became popular in many applied sciences via Shannon's formula to measure the degree of heterogeneity among observations. A fairly recent research field aims at accounting for space in entropy measures, as a generalization for cases where the spatial location of occurrences matters. The main limitation of these developments is that all indices are computed conditional on a chosen distance. This work follows and extends the route for including spatial components in entropy measures. Starting from the probabilistic properties of Shannon's entropy for categorical variables, it investigates the characteristics of the quantities known as residual entropy and mutual information when space is included as a second dimension. In this way, the proposal of entropy measures based on univariate distributions is extended to bivariate distributions, in a setting where the probabilistic meaning of all components is well defined. As a direct consequence, a spatial entropy measure satisfying the additivity property is obtained, since global residual entropy is a sum of partial entropies based on different distance classes. Moreover, the quantity known as mutual information measures the information brought by the inclusion of space, and it is also additive. A thorough comparative study illustrates the superiority of the proposed indices.
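The additivity described in the abstract can be illustrated with the standard information-theoretic identity H(Z) = I(Z; W) + H(Z | W), where both terms on the right split into one contribution per distance class. The following is a minimal sketch, not the authors' implementation: the symbols Z (a categorical outcome) and W (a distance class), the function names, and the toy joint probabilities are all assumptions made here for illustration.

```python
import math


def entropy(probs):
    """Shannon entropy (natural log); zero-probability cells are skipped."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def decompose(joint):
    """Split H(Z) into per-distance-class terms.

    joint[z][k] is the (hypothetical) joint probability of categorical
    outcome z and distance class k; all entries must sum to 1.
    """
    n_w = len(joint[0])
    p_z = [sum(row) for row in joint]                         # marginal of Z
    p_w = [sum(row[k] for row in joint) for k in range(n_w)]  # marginal of W

    partial_residual = []  # p(w_k) * H(Z | w_k), one term per distance class
    partial_mutual = []    # sum_z p(z, w_k) * log(p(z, w_k) / (p(z) p(w_k)))
    for k in range(n_w):
        if p_w[k] == 0:
            partial_residual.append(0.0)
            partial_mutual.append(0.0)
            continue
        conditional = [row[k] / p_w[k] for row in joint]
        partial_residual.append(p_w[k] * entropy(conditional))
        partial_mutual.append(sum(
            row[k] * math.log(row[k] / (p_z[z] * p_w[k]))
            for z, row in enumerate(joint) if row[k] > 0
        ))
    return entropy(p_z), partial_residual, partial_mutual


# Toy joint distribution (assumed values): 3 outcomes x 2 distance classes.
joint = [[0.30, 0.10],
         [0.05, 0.15],
         [0.15, 0.25]]

h_z, partial_residual, partial_mutual = decompose(joint)
residual_entropy = sum(partial_residual)    # global residual entropy H(Z | W)
mutual_information = sum(partial_mutual)    # information brought by space

print("H(Z)               =", h_z)
print("residual entropy   =", residual_entropy)
print("mutual information =", mutual_information)
```

In this sketch the two global quantities are plain sums of the per-class lists, which is the sense of additivity used here; when Z and W are independent, the mutual information term vanishes and the residual entropy equals H(Z).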

Comments: 33 pages, 13 figures

Similar Publications

Thermodynamic integration (TI) for computing marginal likelihoods is based on an inverse annealing path from the prior to the posterior distribution. In many cases, the resulting estimator suffers from high variability, which particularly stems from the prior regime. When comparing complex models with differences in a comparatively small number of parameters, intrinsic errors from sampling fluctuations may outweigh the differences in the log marginal likelihood estimates.

Sufficient dimension reduction (SDR) remains an active research field for high-dimensional data. It aims to estimate the central subspace (CS) without making distributional assumptions. To overcome the large-$p$-small-$n$ problem, we propose a new approach for SDR.

It is generally accepted that all models are wrong -- the difficulty is determining which are useful. Here, a useful model is considered as one that is capable of combining data and expert knowledge, through an inversion or calibration process, to adequately characterize the uncertainty in predictions of interest. This paper derives conditions that specify which simplified models are useful and how they should be calibrated.

Variational inference methods for latent variable statistical models have gained popularity because they are relatively fast, can handle large data sets, and have deterministic convergence guarantees. However, in practice it is unclear whether the fixed point identified by the variational inference algorithm is a local or a global optimum. Here, we propose a method for constructing iterative optimization algorithms for variational inference problems that are guaranteed to converge to the $\epsilon$-global variational lower bound on the log-likelihood.

The problems of computational data processing involving regression, interpolation, reconstruction and imputation for multidimensional big datasets are becoming more important because of the availability of data and their widespread use in business, technological, scientific and other applications. Existing methods often have limitations that either prevent many data processing tasks or make them difficult to accomplish. The problems usually relate to algorithm accuracy, applicability, performance (computational and algorithmic), demands for computational resources (both processing power and memory), and difficulty working with high dimensions.

Conditional density estimation (density regression) estimates the distribution of a response variable y conditional on covariates x. Utilizing a partition model framework, a conditional density estimation method is proposed using logistic Gaussian processes. The partition is created using a Voronoi tessellation and is learned from the data using a reversible jump Markov chain Monte Carlo algorithm.

The popularity of online surveys has increased the prominence of sampling weights in claims of representativeness. Yet, much uncertainty remains regarding how these weights should be employed in the analysis of survey experiments: Should they be used or ignored? If they are used, which estimators are preferred? We offer practical advice, rooted in the Neyman-Rubin model, for researchers producing and working with survey experimental data. We examine simple, efficient estimators (Horvitz-Thompson, Hájek, "double-Hájek", and post-stratification) for analyzing these data, along with formulae for biases and variances.

Many application domains, such as ecology or genomics, have to deal with multivariate non-Gaussian observations. A typical example is the joint observation of the respective abundances of a set of species in a series of sites, aiming to understand the co-variations between these species. The Gaussian setting provides a canonical way to model such dependencies, but does not apply in general.

We describe a way to construct hypothesis tests and confidence intervals after having used the Lasso for feature selection, allowing the regularization parameter to be chosen via an estimate of prediction error. Our estimate of prediction error is a slight variation on cross-validation. Using this variation, we are able to describe an appropriate selection event for choosing a parameter by cross-validation.

The stochastic block model is widely used for detecting community structures in network data. How to test the goodness-of-fit of the model is one of the fundamental problems and has attracted growing interest in recent years. In this paper, we propose a new goodness-of-fit test for the stochastic block model, based on the maximum entry of the centered and re-scaled observed adjacency matrix, in which the number of communities is allowed to grow linearly with the number of nodes, up to a logarithmic factor.