Physics - Data Analysis; Statistics and Probability Publications (50)


Microscopy is the workhorse of the physical and life sciences, producing crisp images of everything from atoms to cells well beyond the capabilities of the human eye. However, the analysis of these images is frequently little better than automated manual marking. Here, we revolutionize the analysis of microscopy images, extracting all the information theoretically contained in a complex microscope image. Read More

We derive formulas for the efficiency correction of cumulants with many efficiency bins. The derivation of the formulas is simpler than that of the previously suggested method, and the numerical cost is drastically reduced compared with the naive method. From analytical and numerical analyses in simple toy models, we show that using the averaged efficiency in the efficiency correction can lead to wrong corrected values, with larger deviations for higher-order cumulants. Read More
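
For orientation, the single-bin case can be worked out in two lines. This is the standard binomial result, given as a sketch and not as the multi-bin formulas the abstract refers to; the symbols $N$ (true multiplicity), $n$ (observed multiplicity) and $\varepsilon$ (detection efficiency) are notation introduced here. Assuming each of the $N$ particles is detected independently with probability $\varepsilon$, the law of total variance gives

$$\langle n \rangle = \varepsilon \langle N \rangle, \qquad \langle n^2 \rangle_c = \varepsilon^2 \langle N^2 \rangle_c + \varepsilon (1-\varepsilon) \langle N \rangle,$$

so the corrected first and second cumulants are

$$\langle N \rangle = \frac{\langle n \rangle}{\varepsilon}, \qquad \langle N^2 \rangle_c = \frac{\langle n^2 \rangle_c - (1-\varepsilon) \langle n \rangle}{\varepsilon^2}.$$

The efficiency enters nonlinearly from second order onward, which is consistent with the observation above that an averaged efficiency can bias the corrected values.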

The GooFit Framework is designed to perform maximum-likelihood fits for arbitrary functions on various parallel back ends, for example a GPU. We present an extension to GooFit which adds the functionality to perform time-dependent amplitude analyses of pseudoscalar mesons decaying into four pseudoscalar final states. Benchmarks of this functionality show a significant performance increase when utilizing a GPU compared to a CPU. Read More

As HPC facilities grow their resources, adapting classic HEP/NP workflows becomes a necessity. Linux containers may very well offer a way to lower the bar to exploiting such resources and, at the same time, help collaborations reach vast elastic resources on such facilities and address their massive current and future data processing challenges. In these proceedings, we showcase the STAR data reconstruction workflow on the Cori HPC system at NERSC. Read More

Given a network, the statistical ensemble of its graph-Voronoi diagrams with randomly chosen cell centers exhibits properties convertible into information on the network's large scale structures. We define a node-pair level measure called Voronoi cohesion, which describes the probability of two nodes sharing the same Voronoi cell when $g$ centers are chosen at random in the network. This measure provides information based on the global context (the network in its entirety), a type of information that is not carried by other similarity measures. Read More
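
The definition above can be illustrated with a small Monte Carlo estimate. This is a minimal sketch under stated assumptions, not the authors' implementation: the graph is unweighted and given as an adjacency dictionary, distances are hop counts, and a node equidistant from several centers is assigned to one of them at random. The function names `bfs_distances` and `voronoi_cohesion` are hypothetical.

```python
import random
from collections import deque


def bfs_distances(adj, source):
    """Hop-count distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def voronoi_cohesion(adj, u, v, g, trials=2000, seed=0):
    """Estimate the probability that u and v fall in the same Voronoi cell
    when g centers are drawn uniformly at random (assumed tie rule: random)."""
    rng = random.Random(seed)
    nodes = sorted(adj)
    dist = {s: bfs_distances(adj, s) for s in nodes}

    def cell(x, centers):
        # Nearest center to x; ties are broken uniformly at random.
        best = min(dist[c].get(x, float("inf")) for c in centers)
        ties = [c for c in centers if dist[c].get(x, float("inf")) == best]
        return rng.choice(ties)

    same = 0
    for _ in range(trials):
        centers = rng.sample(nodes, g)
        if cell(u, centers) == cell(v, centers):
            same += 1
    return same / trials


# Two triangles joined by a single bridge edge (2-3).
barbell = {
    0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
    3: [2, 4, 5], 4: [3, 5], 5: [3, 4],
}
inside = voronoi_cohesion(barbell, 0, 1, g=2)   # same triangle
across = voronoi_cohesion(barbell, 0, 5, g=2)   # opposite triangles
print(inside, across)
```

On this toy graph the pair inside one triangle should share a cell far more often than the pair separated by the bridge, which is the kind of large-scale structure the measure is meant to pick up.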

Limits on power dissipation have pushed CPUs to grow in parallel processing capabilities rather than clock rate, leading to the rise of "manycore" or GPU-like processors. In order to achieve the best performance, applications must be able to take full advantage of vector units across multiple cores, or some analogous arrangement on an accelerator card. Such parallel performance is becoming a critical requirement for methods to reconstruct the tracks of charged particles at the Large Hadron Collider and, in the future, at the High Luminosity LHC. Read More

Consider the problem of modeling hysteresis for finite-state random walks using higher-order Markov chains. This Letter introduces a Bayesian framework to determine, from data, the number of prior states of recent history upon which a trajectory is statistically dependent. The general recommendation is to use leave-one-out cross validation, using an easily-computable formula that is provided in closed form. Read More

Various hypotheses exist about the paths used for communication between the nodes of complex networks. Most studies simply suppose that communication goes via shortest paths, while others have more explicit assumptions about how routing (alternatively navigation or search) works or should work in real networks. However, these assumptions are rarely checked against real data. Read More

Restricted Boltzmann Machines are described by the Gibbs measure of a bipartite spin glass, which in turn corresponds to the one of a generalised Hopfield network. This equivalence allows us to characterise the state of these systems in terms of retrieval capabilities, at both low and high load. We study the paramagnetic-spin glass and the spin glass-retrieval phase transitions, as the pattern (i. Read More

Electron ptychography has seen a recent surge of interest for phase sensitive imaging at atomic or near-atomic resolution. However, applications are so far mainly limited to radiation-hard samples because the required doses are too high for imaging biological samples at high resolution. We propose the use of non-convex, Bayesian optimization to overcome this problem and reduce the dose required for successful reconstruction by two orders of magnitude compared to previous experiments. Read More

The quantitative evaluation of combustion models against experimental data remains a major challenge. This is a consequence of the data complexity, often involving velocity, temperature, and chemical composition; the data acquisition, consisting of intrusive, non-intrusive, direct, and inferred measurement methods; and the data preparation in the form of instantaneous scatter data, statistical results, or conditional information. To address this issue, the Wasserstein metric is introduced as a probabilistic measure to enable quantitative evaluations of LES combustion models. Read More

In this paper, a Bayesian method for piecewise regression is adapted to handle Poisson-distributed counting-process data. A numerical code in Mathematica is developed and tested on simulated data. The resulting method is valuable for detecting break points in the count rate of time series for Poisson processes. Read More

Background: Measures of spike train synchrony are widely used in both experimental and computational neuroscience. Time-scale independent and parameter-free measures, such as the ISI-distance, the SPIKE-distance and SPIKE-synchronization, are preferable to time-scale parametric measures, since by adapting to the local firing rate they take into account all the time-scales of a given dataset. New Method: In data containing multiple time-scales (e. Read More

We introduce a nonparametric approach for estimating drift and diffusion functions in systems of stochastic differential equations from observations of the state vector. Gaussian processes are used as flexible models for these functions and estimates are calculated directly from dense data sets using Gaussian process regression. We also develop an approximate expectation maximization algorithm to deal with the unobserved, latent dynamics between sparse observations. Read More

Methods for detecting structural changes, or change points, in time series data are widely used in many fields of science and engineering. This chapter sketches some basic methods for the analysis of structural changes in time series data. The exposition is confined to retrospective methods for univariate time series. Read More

Hydro-meteorological variables, such as precipitation and streamflow, are significantly influenced by various climatic factors and large-scale atmospheric circulation patterns. Efficient water resources management requires an understanding of the effects of climate indices on the accurate predictability of precipitation. This study aims at understanding the standalone teleconnection between precipitation across India and the four climate indices, namely, Niño 3. Read More

We review the concept of Support Vector Machines (SVMs) and discuss examples of their use in a number of scenarios. Several SVM implementations have been used in HEP and we exemplify this algorithm using the Toolkit for Multivariate Analysis (TMVA) implementation. We discuss examples relevant to HEP including background suppression for $H\to\tau^+\tau^-$ at the LHC with several different kernel functions. Read More

This work aimed to determine the characteristics of activity series by applying concepts from fractal geometry, and to evaluate the possibility of identifying individuals with fibromyalgia. Activity level data were collected from 27 healthy subjects and 27 fibromyalgia patients, using clock-like devices equipped with accelerometers, for about four weeks, all day long. The activity series were evaluated through fractal and multifractal methods. Read More

We present the first world-wide inter-laboratory comparison of small-angle X-ray scattering (SAXS) for nanoparticle sizing. The measurands in this comparison are the mean particle radius, the width of the size distribution and the particle concentration. The investigated sample consists of dispersed silver nanoparticles, surrounded by a stabilizing polymeric shell of poly(acrylic acid). Read More

In this study, we propose a mechanical model of a plurality election based on a mean field voter model. We assume that there are three candidates in each electoral district, i.e. Read More

Authors: M. Ablikim, M. N. Achasov, S. Ahmed, X. C. Ai, O. Albayrak, M. Albrecht, D. J. Ambrose, A. Amoroso, F. F. An, Q. An, J. Z. Bai, O. Bakina, R. Baldini Ferroli, Y. Ban, D. W. Bennett, J. V. Bennett, N. Berger, M. Bertani, D. Bettoni, J. M. Bian, F. Bianchi, E. Boger, I. Boyko, R. A. Briere, H. Cai, X. Cai, O. Cakir, A. Calcaterra, G. F. Cao, S. A. Cetin, J. Chai, J. F. Chang, G. Chelkov, G. Chen, H. S. Chen, J. C. Chen, M. L. Chen, S. Chen, S. J. Chen, X. Chen, X. R. Chen, Y. B. Chen, X. K. Chu, G. Cibinetto, H. L. Dai, J. P. Dai, A. Dbeyssi, D. Dedovich, Z. Y. Deng, A. Denig, I. Denysenko, M. Destefanis, F. DeMori, Y. Ding, C. Dong, J. Dong, L. Y. Dong, M. Y. Dong, Z. L. Dou, S. X. Du, P. F. Duan, J. Z. Fan, J. Fang, S. S. Fang, X. Fang, Y. Fang, R. Farinelli, L. Fava, F. Feldbauer, G. Felici, C. Q. Feng, E. Fioravanti, M. Fritsch, C. D. Fu, Q. Gao, X. L. Gao, Y. Gao, Z. Gao, I. Garzia, K. Goetzen, L. Gong, W. X. Gong, W. Gradl, M. Greco, M. H. Gu, Y. T. Gu, Y. H. Guan, A. Q. Guo, L. B. Guo, R. P. Guo, Y. Guo, Y. P. Guo, Z. Haddadi, A. Hafner, S. Han, X. Q. Hao, F. A. Harris, K. L. He, F. H. Heinsius, T. Held, Y. K. Heng, T. Holtmann, Z. L. Hou, C. Hu, H. M. Hu, J. F. Hu, T. Hu, Y. Hu, G. S. Huang, J. S. Huang, X. T. Huang, X. Z. Huang, Z. L. Huang, T. Hussain, W. Ikegami Andersson, Q. Ji, Q. P. Ji, X. B. Ji, X. L. Ji, L. W. Jiang, X. S. Jiang, X. Y. Jiang, J. B. Jiao, Z. Jiao, D. P. Jin, S. Jin, T. Johansson, A. Julin, N. Kalantar-Nayestanaki, X. L. Kang, X. S. Kang, M. Kavatsyuk, B. C. Ke, P. Kiese, R. Kliemt, B. Kloss, O. B. Kolcu, B. Kopf, M. Kornicer, A. Kupsc, W. Kuhn, J. S. Lange, M. Lara, P. Larin, H. Leithoff, C. Leng, C. Li, Cheng Li, D. M. Li, F. Li, F. Y. Li, G. Li, H. B. Li, H. J. Li, J. C. Li, Jin Li, K. Li, K. Li, Lei Li, P. R. Li, Q. Y. Li, T. Li, W. D. Li, W. G. Li, X. L. Li, X. N. Li, X. Q. Li, Y. B. Li, Z. B. Li, H. Liang, Y. F. Liang, Y. T. Liang, G. R. Liao, D. X. Lin, B. Liu, B. J. Liu, C. X. Liu, D. Liu, F. H. 
Liu, Fang Liu, Feng Liu, H. B. Liu, H. H. Liu, H. H. Liu, H. M. Liu, J. Liu, J. B. Liu, J. P. Liu, J. Y. Liu, K. Liu, K. Y. Liu, L. D. Liu, P. L. Liu, Q. Liu, S. B. Liu, X. Liu, Y. B. Liu, Y. Y. Liu, Z. A. Liu, Zhiqing Liu, H. Loehner, X. C. Lou, H. J. Lu, J. G. Lu, Y. Lu, Y. P. Lu, C. L. Luo, M. X. Luo, T. Luo, X. L. Luo, X. R. Lyu, F. C. Ma, H. L. Ma, L. L. Ma, M. M. Ma, Q. M. Ma, T. Ma, X. N. Ma, X. Y. Ma, Y. M. Ma, F. E. Maas, M. Maggiora, Q. A. Malik, Y. J. Mao, Z. P. Mao, S. Marcello, J. G. Messchendorp, G. Mezzadri, J. Min, T. J. Min, R. E. Mitchell, X. H. Mo, Y. J. Mo, C. Morales Morales, N. Yu. Muchnoi, H. Muramatsu, P. Musiol, Y. Nefedov, F. Nerling, I. B. Nikolaev, Z. Ning, S. Nisar, S. L. Niu, X. Y. Niu, S. L. Olsen, Q. Ouyang, S. Pacetti, Y. Pan, P. Patteri, M. Pelizaeus, H. P. Peng, K. Peters, J. Pettersson, J. L. Ping, R. G. Ping, R. Poling, V. Prasad, H. R. Qi, M. Qi, S. Qian, C. F. Qiao, L. Q. Qin, N. Qin, X. S. Qin, Z. H. Qin, J. F. Qiu, K. H. Rashid, C. F. Redmer, M. Ripka, G. Rong, Ch. Rosner, X. D. Ruan, A. Sarantsev, M. Savrie, C. Schnier, K. Schoenning, W. Shan, M. Shao, C. P. Shen, P. X. Shen, X. Y. Shen, H. Y. Sheng, W. M. Song, X. Y. Song, S. Sosio, S. Spataro, G. X. Sun, J. F. Sun, S. S. Sun, X. H. Sun, Y. J. Sun, Y. Z. Sun, Z. J. Sun, Z. T. Sun, C. J. Tang, X. Tang, I. Tapan, E. H. Thorndike, M. Tiemens, I. Uman, G. S. Varner, B. Wang, B. L. Wang, D. Wang, D. Y. Wang, K. Wang, L. L. Wang, L. S. Wang, M. Wang, P. Wang, P. L. Wang, W. Wang, W. P. Wang, X. F. Wang, Y. Wang, Y. D. Wang, Y. F. Wang, Y. Q. Wang, Z. Wang, Z. G. Wang, Z. H. Wang, Z. Y. Wang, Z. Y. Wang, T. Weber, D. H. Wei, P. Weidenkaff, S. P. Wen, U. Wiedner, M. Wolke, L. H. Wu, L. J. Wu, Z. Wu, L. Xia, L. G. Xia, Y. Xia, D. Xiao, H. Xiao, Z. J. Xiao, Y. G. Xie, Y. H. Xie, Q. L. Xiu, G. F. Xu, J. J. Xu, L. Xu, Q. J. Xu, Q. N. Xu, X. P. Xu, L. Yan, W. B. Yan, W. C. Yan, Y. H. Yan, H. J. Yang, H. X. Yang, L. Yang, Y. X. Yang, M. Ye, M. H. Ye, J. H. Yin, Z. Y. You, B. X. Yu, C. 
X. Yu, J. S. Yu, C. Z. Yuan, Y. Yuan, A. Yuncu, A. A. Zafar, Y. Zeng, Z. Zeng, B. X. Zhang, B. Y. Zhang, C. C. Zhang, D. H. Zhang, H. H. Zhang, H. Y. Zhang, J. Zhang, J. J. Zhang, J. L. Zhang, J. Q. Zhang, J. W. Zhang, J. Y. Zhang, J. Z. Zhang, K. Zhang, L. Zhang, S. Q. Zhang, X. Y. Zhang, Y. Zhang, Y. Zhang, Y. H. Zhang, Y. N. Zhang, Y. T. Zhang, Yu Zhang, Z. H. Zhang, Z. P. Zhang, Z. Y. Zhang, G. Zhao, J. W. Zhao, J. Y. Zhao, J. Z. Zhao, Lei Zhao, Ling Zhao, M. G. Zhao, Q. Zhao, Q. W. Zhao, S. J. Zhao, T. C. Zhao, Y. B. Zhao, Z. G. Zhao, A. Zhemchugov, B. Zheng, J. P. Zheng, W. J. Zheng, Y. H. Zheng, B. Zhong, L. Zhou, X. Zhou, X. K. Zhou, X. R. Zhou, X. Y. Zhou, K. Zhu, K. J. Zhu, S. Zhu, S. H. Zhu, X. L. Zhu, Y. C. Zhu, Y. S. Zhu, Z. A. Zhu, J. Zhuang, L. Zotti, B. S. Zou, J. H. Zou

By analyzing the large-angle Bhabha scattering events $e^{+}e^{-} \to (\gamma)e^{+}e^{-}$ and diphoton events $e^{+}e^{-} \to \gamma\gamma$ for the data sets collected at center-of-mass (c.m.) energies between 2. Read More

The distribution of the geometric distances of connected neurons is a practical factor underlying neural networks in the brain. It can affect the brain's dynamic properties at the ground level. Karbowski derived a power-law decay distribution that has not yet been verified by experiment. Read More

Many observational records critically rely on our ability to merge different (and not necessarily overlapping) observations into a single composite. We provide a novel and fully-traceable approach for doing so, which relies on a multi-scale maximum likelihood estimator. This approach overcomes the problem of data gaps in a natural way and uses data-driven estimates of the uncertainties. Read More

We investigate both analytically and by numerical simulation the relaxation of an overdamped Brownian particle in a 1D multiwell potential. We show that the mean relaxation time from an injection point inside the well down to its bottom is dominated by statistically rare trajectories that sample the potential profile outside the well. As a consequence, also the hopping time between two degenerate wells can depend on the detailed multiwell structure of the entire potential. Read More

There is a need for affordable, widely deployable maternal-fetal ECG monitors to improve maternal and fetal health during pregnancy and delivery. Based on diffusion-based channel selection, we present the mathematical formalism and clinical validation of an algorithm capable of accurately separating maternal and fetal ECG from a two-channel signal acquired over the maternal abdomen. Read More

We explore the effect of noise on small ballistic graphene-based Josephson junctions in the framework of the resistively and capacitively shunted model. We use the non-sinusoidal current-phase relation specific to graphene layers partially covered by superconducting electrodes. The noise-induced escapes from the metastable states, when the external bias current is ramped, give the switching current distribution, i. Read More

Gravitational Sound clips produced by the Laser Interferometer Gravitational-Wave Observatory (LIGO) and the Massachusetts Institute of Technology (MIT) are considered within the particular context of data reduction. It is shown that these types of signals can be approximated at high quality using far fewer elementary components than those required within the standard orthogonal basis framework. Furthermore, a measure of local sparsity is shown to render meaningful information about the variation of a signal along time, by generating a set of local sparsity values which is much smaller than the dimension of the signal. Read More

The soundscape in the eastern Arctic was studied from April to September 2013 using a 22-element vertical hydrophone array as it drifted from near the North Pole (89$^{\circ}$23'N, 62$^{\circ}$35'W) to north of Fram Strait (83$^{\circ}$45'N, 4$^{\circ}$28'W). The hydrophones recorded for 108 minutes on six days per week with a sampling rate of 1953.125 Hz. Read More

Computational models in chemistry rely on a number of approximations. The effect of such approximations on observables derived from them is often unpredictable. Therefore, it is challenging to quantify the uncertainty of a computational result, which, however, is necessary to assess the suitability of a computational model. Read More

We investigate scaling properties of human brain functional networks in the resting state. Analyzing network degree distributions, we statistically test whether or not their tails scale as a power law. Initial studies, based on least-squares fitting, were shown to be inadequate for precise estimation of power-law distributions. Read More

Recent progress in applying machine learning for jet physics has been built upon an analogy between calorimeters and images. In this work, we present a novel class of recursive neural networks built instead upon an analogy between QCD and natural languages. In the analogy, four-momenta are like words and the clustering history of sequential recombination jet algorithms is like the parsing of a sentence. Read More

We review recent advances on the record statistics of strongly correlated time series, whose entries denote the positions of a random walk or a Lévy flight on a line. After a brief survey of the theory of records for independent and identically distributed random variables, we focus on random walks. During the last few years, it was indeed realized that random walks are a very useful "laboratory" to test the effects of correlations on the record statistics. Read More
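
To make the central quantity concrete, a record can be counted as an entry that exceeds every earlier entry of the series. The sketch below is illustrative only: it assumes the first entry counts as a record, uses a strict inequality, and draws Gaussian steps for the walk; the function names are hypothetical.

```python
import random


def count_records(series):
    """Number of entries strictly larger than all previous entries
    (the first entry is counted as a record by convention)."""
    records = 0
    best = None
    for x in series:
        if best is None or x > best:
            records += 1
            best = x
    return records


def random_walk(n_steps, rng):
    """Positions of a walk on a line with Gaussian steps, starting at 0."""
    position = 0.0
    path = [position]
    for _ in range(n_steps):
        position += rng.gauss(0.0, 1.0)
        path.append(position)
    return path


rng = random.Random(1)
walk = random_walk(100, rng)
print(count_records(walk))
```

Averaging `count_records` over many independent walks, and comparing with the same average for shuffled (uncorrelated) entries, is one simple way to see the effect of correlations on the record count.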

We study the efficacy of learning neural networks with neural networks by the (stochastic) gradient descent method. While gradient descent enjoys empirical success in a variety of applications, there is a lack of theoretical guarantees that explain the practical utility of deep learning. We focus on two-layer neural networks with a linear activation on the output node. Read More

As machine learning algorithms become increasingly sophisticated to exploit subtle features of the data, they often become more dependent on simulations. This paper presents a new approach called weakly supervised classification in which class proportions are the only input into the machine learning algorithm. Using one of the most challenging binary classification tasks in high energy physics - quark versus gluon tagging - we show that weakly supervised classification can match the performance of fully supervised algorithms. Read More

Interactions in nature can be described by their coupling strength, direction of coupling and coupling function. The coupling strength and directionality are relatively well understood and studied, at least for two interacting systems; however, there can be a complexity in the interactions that depends uniquely on the coupling functions. Such a special case is studied here: a synchronization transition occurs only due to the time-variability of the coupling functions, while the net coupling strength is constant throughout the observation time. Read More

In many contexts it is extremely costly to perform enough high quality experimental measurements to accurately parameterize a predictive quantitative model. However, it is often much easier to carry out experiments that indicate whether a particular sample is above or below a given threshold. Can many such binary or "coarse" measurements be combined with a much smaller number of higher resolution or "fine" measurements to yield accurate models? Here, we demonstrate an intuitive strategy, inspired by statistical physics, wherein the coarse measurements identify the salient features of the data, while fine measurements determine the relative importance of these features. Read More

A scalar Langevin-type process $X(t)$ that is driven by Ornstein-Uhlenbeck noise $\eta(t)$ is non-Markovian. However, the joint dynamics of $X$ and $\eta$ is described by a Markov process in two dimensions. But even though there exists a variety of techniques for the analysis of Markov processes, it is still a challenge to estimate the process parameters solely based on a given time series of $X$. Read More

The Huang-Hilbert transform is applied to Seismic Electric Signal (SES) activities in order to decompose them into a number of Intrinsic Mode Functions (IMFs) and study which of these functions better represent the SES. The results are compared to those obtained from the analysis in a new time domain termed natural time after having subtracted the magnetotelluric background from the original signal. It is shown that the instantaneous amplitudes of the IMFs can be used for the distinction of SES from artificial noises when combined with the natural time analysis. Read More

Spatial phenomena are subject to scale effects, but studies addressing such effects on spatially embedded contact networks are rare. There are two types of structure in these networks: network structure and spatial structure. The network structure has been actively studied. Read More

This is a photographic dataset collected for testing image processing algorithms. The idea is to have images that can exploit the properties of total variation, therefore a set of playing cards was distributed on the scene. The dataset is made available at www. Read More

Community structure is an important structural property that is widespread in various complex networks. In the past decade, much attention has been paid to the design of community-detection methods, but analyzing the behavior of these methods is also of interest in both theoretical research and real applications. Here, we focus on an important measure for community structure, significance [Sci. Read More

An individual's social group may be represented by his ego-network, formed by the links between the individual and his acquaintances. Ego-networks present an internal structure of increasingly large nested layers of decreasing relationship intensity, whose size exhibits a precise scaling ratio. Starting from the notion of limited social bandwidth, and assuming fixed costs for the links in each layer, we propose two statistical models that generate the observed hierarchical social structure. Read More

The time-domain technique for impedance spectroscopy consists in computing the Fourier images of the excitation voltage and the current response by fast or discrete Fourier transform and calculating their ratio. Here we propose an alternative method of processing the excitation voltage and current response to derive the system impedance spectrum, based on a fast and flexible adaptive filtering method. We show the equivalence between the problem of adaptive filter learning and that of deriving the system impedance spectrum. Read More
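
The conventional Fourier approach mentioned in the first sentence can be sketched in a few lines. This illustrates the baseline technique only, not the adaptive-filter method the abstract proposes. It assumes the impedance at each frequency is the ratio of the voltage and current Fourier coefficients, uses a plain discrete Fourier transform from the standard library, and skips bins where the current carries no appreciable signal; the function names and the `rel_tol` parameter are hypothetical.

```python
import cmath
import math


def dft(x):
    """Plain O(n^2) discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def impedance_spectrum(voltage, current, sample_rate, rel_tol=1e-6):
    """Return {frequency: Z} with Z = V(f) / I(f), for positive-frequency
    bins in which the current amplitude is not negligible."""
    n = len(voltage)
    v_hat = dft(voltage)
    i_hat = dft(current)
    floor = rel_tol * max(abs(c) for c in i_hat)
    spectrum = {}
    for k in range(1, n // 2):
        if abs(i_hat[k]) > floor:
            spectrum[k * sample_rate / n] = v_hat[k] / i_hat[k]
    return spectrum


# A 10-ohm resistor excited by two sine components (4 Hz and 9 Hz).
n, fs = 64, 64.0
current = [
    math.sin(2 * math.pi * 4 * t / n) + 0.5 * math.sin(2 * math.pi * 9 * t / n)
    for t in range(n)
]
voltage = [10.0 * i for i in current]
z_resistor = impedance_spectrum(voltage, current, fs)

# A voltage leading the current by 90 degrees gives a purely imaginary Z.
current_sin = [math.sin(2 * math.pi * 4 * t / n) for t in range(n)]
voltage_cos = [math.cos(2 * math.pi * 4 * t / n) for t in range(n)]
z_reactive = impedance_spectrum(voltage_cos, current_sin, fs)
```

Only the excited frequencies appear in the result, since the ratio is undefined wherever the current spectrum is zero.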

Security-Constrained Unit Commitment (SCUC) is one of the most significant problems in the secure and optimal operation of modern electricity markets. New sources of uncertainty, such as wind speed volatility and price-sensitive loads, impose additional challenges on this large-scale problem. This paper proposes a new Stochastic SCUC that uses the point estimation method to model the power system uncertainties more efficiently. Read More

We provide a bridge between generative modeling in the Machine Learning community and simulated physical processes in High Energy Particle Physics by applying a novel Generative Adversarial Network (GAN) architecture to the production of jet images -- 2D representations of energy depositions from particles interacting with a calorimeter. We propose a simple architecture, the Location-Aware Generative Adversarial Network, that learns to produce realistic radiation patterns from simulated high energy particle collisions. The pixel intensities of GAN-generated images faithfully span many orders of magnitude and exhibit the desired low-dimensional physical properties (i. Read More

Data processing pipelines are among the most common kinds of astronomical software. Programs of this kind are chains of processes that transform raw data into valuable information. In this work, a Python framework for astronomical pipeline generation is presented. Read More

A Markov Decision Process (MDP) framework is adopted to represent ensemble control of devices whose energy consumption pattern is of a cycling type, e.g. thermostatically controlled loads. Read More

Thermostatically Controlled Loads (TCL), e.g. air-conditioners and heaters, are by far the most widespread consumers of electricity. Read More

"Double edge swaps" transform one graph into another while preserving the graph's degree sequence, and have thus been used in a number of popular Markov chain Monte Carlo (MCMC) sampling techniques. However, while double edge-swap MCMC sampling can, for any fixed degree sequence, sample simple graphs, multigraphs, and pseudographs uniformly, this is not true for graphs which allow self-loops but not multiedges (loopy graphs). Indeed, we exactly characterize the degree sequences where double edge swaps cannot reach every valid loopy graph and develop an efficient algorithm to determine such degree sequences. Read More
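
A single double edge swap can be sketched as follows. This is a minimal illustration for simple graphs only (no self-loops, no multiedges), which is not the loopy-graph case the abstract analyzes. It assumes an undirected graph stored as a list of node pairs and simply rejects any proposal that would create a self-loop or a duplicate edge; the names `double_edge_swap` and `degrees` are hypothetical.

```python
import random
from collections import Counter


def double_edge_swap(edges, rng):
    """Try to replace edges (u, v), (x, y) by (u, x), (v, y), in place.
    Returns True if the swap was applied, False if it was rejected."""
    i, j = rng.sample(range(len(edges)), 2)
    u, v = edges[i]
    x, y = edges[j]
    if rng.random() < 0.5:
        x, y = y, x  # choose one of the two possible rewirings
    if u == x or v == y:
        return False  # would create a self-loop
    present = {frozenset(e) for e in edges}
    if frozenset((u, x)) in present or frozenset((v, y)) in present:
        return False  # would create a multiedge
    edges[i] = (u, x)
    edges[j] = (v, y)
    return True


def degrees(edges):
    """Degree of every node that appears in the edge list."""
    count = Counter()
    for a, b in edges:
        count[a] += 1
        count[b] += 1
    return count


# Start from a 6-cycle and attempt 200 swaps.
rng = random.Random(42)
cycle = [(k, (k + 1) % 6) for k in range(6)]
before = degrees(cycle)
applied = sum(double_edge_swap(cycle, rng) for _ in range(200))
after = degrees(cycle)
```

Each accepted swap removes one edge end and adds one edge end at each of the four nodes involved, so every degree is unchanged. In an MCMC sampler a rejected proposal would typically count as a step that leaves the graph as it is.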