MEDL and MEDLA: Methods for Assessment of Scaling by Medians of Log-Squared Nondecimated Wavelet Coefficients

High-frequency measurements and images acquired from various real-world sources often possess a degree of self-similarity and inherent regular scaling. When data look like noise, the scaling exponent may be the only informative feature that summarizes such data. Methods for assessing self-similarity by estimating the Hurst exponent often involve analysis of the rate of decay in a spectrum defined in various multiresolution domains. When this spectrum is calculated using discrete non-decimated wavelet transforms, the increased autocorrelation in the wavelet coefficients causes the estimators of $H$ to show more bias than estimators based on traditional orthogonal transforms. At the same time, non-decimated transforms have a number of advantages when employed for calculating wavelet spectra and estimating Hurst exponents: the variance of the estimator is smaller, input signals and images can be of arbitrary size, and, owing to shift-invariance, local scaling can be assessed as well. We propose two methods based on robust estimation and resampling that alleviate the effect of increased autocorrelation while maintaining all advantages of non-decimated wavelet transforms. The proposed methods extend approaches in the existing literature in which a logarithmic transformation and pairing of wavelet coefficients are used to lower the bias. In a simulation study we use fractional Brownian motions with a range of theoretical Hurst exponents. For such signals, for which the "true" $H$ is known, we demonstrate bias reduction and an overall reduction of the mean-squared error by the two proposed estimators. For fractional Brownian motions, both proposed methods yield estimators of $H$ that are asymptotically normal and unbiased.
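The general idea of reading $H$ off the slope of a level-wise median of log-squared non-decimated wavelet coefficients can be illustrated with a short Python sketch. This is a simplified illustration under stated assumptions, not the MEDL or MEDLA procedure itself: the pairing and resampling of coefficients mentioned in the abstract are omitted, a Haar wavelet without boundary extension is assumed, and the function names `ndwt_haar_details` and `hurst_median_log` are hypothetical. Levels are indexed so that level 1 is the finest; under that convention the log2 energy of a fractional Brownian motion grows with slope roughly $2H + 1$ across levels.

```python
import numpy as np

def ndwt_haar_details(x, level):
    """Non-decimated Haar detail coefficients at one level (1 = finest).

    Assumption: no periodic or symmetric boundary extension is applied,
    so only shifts whose support lies fully inside the signal are kept.
    """
    m = 2 ** (level - 1)                      # half-width of the Haar support
    c = np.concatenate(([0.0], np.cumsum(x)))
    block = c[m:] - c[:-m]                    # block[k] = sum of x[k:k+m]
    # Difference of adjacent half-blocks at every shift, orthonormal scaling.
    return (block[:-m] - block[m:]) / 2 ** (level / 2)

def hurst_median_log(x, levels=range(3, 8)):
    """Estimate H from the slope of median(log2(d^2)) across levels.

    Simplified sketch: uses all coefficients at each level rather than
    sampled pairs, so it is not the estimator proposed in the paper.
    """
    levels = np.asarray(list(levels))
    y = np.array([np.median(np.log2(ndwt_haar_details(x, j) ** 2))
                  for j in levels])
    slope = np.polyfit(levels, y, 1)[0]       # ordinary least-squares slope
    return (slope - 1.0) / 2.0                # slope is about 2H + 1

# Usage: ordinary Brownian motion is a fractional Brownian motion with H = 0.5.
rng = np.random.default_rng(0)
bm = np.cumsum(rng.standard_normal(2 ** 14))
h_hat = hurst_median_log(bm)
```

Because the median commutes with the logarithm, the additive offset it introduces is the same at every level and does not affect the fitted slope. The choice of levels 3 to 7 is an assumption made here to avoid the finest scales, where discretization distorts the scaling law.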

Similar Publications

Continuous-time multi-state survival models can be used to describe health-related processes over time. In the presence of interval-censored times for transitions between the living states, the likelihood is constructed using transition probabilities. Models can be specified using parametric or semi-parametric shapes for the hazards.

This paper focuses on the multivariate linear mixed-effects model, including all the correlations between the random effects when the marginal residual terms are assumed uncorrelated and homoscedastic with possibly different standard deviations. The random effects covariance matrix is Cholesky factorized to directly estimate the variance components of these random effects. This strategy enables a consistent estimate of the random effects covariance matrix, which is generally estimated poorly when it is estimated grossly (or directly) with methods such as the EM algorithm.

In this paper, we study a novel approach for the estimation of quantiles when facing potential right censoring of the responses. Contrary to the existing literature on the subject, the adopted strategy of this paper is to tackle censoring at the very level of the loss function usually employed for the computation of quantiles, the so-called "check" function. For interpretation purposes, a simple comparison with the latter reveals how censoring is accounted for in the newly proposed loss function.

Cross-validation is one of the most popular model selection methods in statistics and machine learning. Despite its wide applicability, traditional cross-validation methods tend to select overfitting models, unless the ratio between the training and testing sample sizes is much smaller than conventional choices. We argue that this overfitting tendency of cross-validation arises because the uncertainty in the testing sample is ignored.

Particle filters are a popular and flexible class of numerical algorithms to solve a large class of nonlinear filtering problems. However, standard particle filters with importance weights have been shown to require a sample size that increases exponentially with the dimension D of the state space in order to achieve a certain performance, which precludes their use in very high-dimensional filtering problems. Here, we focus on the dynamic aspect of this curse of dimensionality (COD) in continuous time filtering, which is caused by the degeneracy of importance weights over time.

Energy statistics are estimators of the energy distance that depend on the distances between observations. The idea behind energy statistics is to consider a statistical potential energy that would parallel Newton's gravitational potential energy. This statistical potential energy is zero if and only if a certain null hypothesis relating two distributions holds true.

Recent advances in bioinformatics have made high-throughput microbiome data widely available, and new statistical tools are required to maximize the information gained from these data. For example, analysis of high-dimensional microbiome data from designed experiments remains an open area in microbiome research. Contemporary analyses work on metrics that summarize collective properties of the microbiome, but such reductions preclude inference on the fine-scale effects of environmental stimuli on individual microbial taxa.

In social and economic studies many of the collected variables are measured on a nominal scale, often with a large number of categories. The definition of categories is usually not unambiguous and different classification schemes using either a finer or a coarser grid are possible. Categorisation has an impact when such a variable is included as covariate in a regression model: a too fine grid will result in imprecise estimates of the corresponding effects, whereas with a too coarse grid important effects will be missed, resulting in biased effect estimates and poor predictive performance.

Thermodynamic integration (TI) for computing marginal likelihoods is based on an inverse annealing path from the prior to the posterior distribution. In many cases, the resulting estimator suffers from high variability, which particularly stems from the prior regime. When comparing complex models with differences in a comparatively small number of parameters, intrinsic errors from sampling fluctuations may outweigh the differences in the log marginal likelihood estimates.

Sufficient dimension reduction (SDR) remains an active research field for high-dimensional data. It aims to estimate the central subspace (CS) without making distributional assumptions. To overcome the large-$p$-small-$n$ problem, we propose a new approach for SDR.