R. Tripiccione - Univ. and INFN Ferrara

Pub Categories

Physics - Disordered Systems and Neural Networks (16)
Nonlinear Sciences - Chaotic Dynamics (13)
High Energy Physics - Lattice (12)
Physics - Statistical Mechanics (9)
Physics - Computational Physics (5)
Computer Science - Architecture (5)
Physics - Fluid Dynamics (4)
Computer Science - Distributed, Parallel, and Cluster Computing (4)
Physics - Other (1)
General Relativity and Quantum Cosmology (1)
Physics - Instrumentation and Detectors (1)
Cosmology and Nongalactic Astrophysics (1)
Computer Science - Performance (1)
High Energy Physics - Phenomenology (1)
High Energy Astrophysical Phenomena (1)

Publications Authored By R. Tripiccione

High-performance computing systems are more and more often based on accelerators. Computing applications targeting those systems often follow a host-driven approach in which hosts offload almost all compute-intensive sections of the code onto accelerators; this approach only marginally exploits the computational resources available on the host CPUs, limiting performance and energy efficiency. The obvious step forward is to run compute-intensive kernels in a concurrent and balanced way on both hosts and accelerators. Read More
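
As a rough illustration of what "balanced" can mean here, the sketch below splits a workload so that host and accelerator finish at about the same time. The function name, the throughput figures, and the assumption of negligible communication and launch overhead are all hypothetical and are not taken from the paper.

```python
def balanced_split(total_items, host_rate, accel_rate):
    """Split total_items between host and accelerator.

    host_rate and accel_rate are measured throughputs (items per second).
    Assumption: the two devices run concurrently and overheads are ignored,
    so both finish together when each gets work proportional to its rate.
    """
    if host_rate < 0 or accel_rate <= 0:
        raise ValueError("rates must be non-negative (host) and positive (accelerator)")
    host_items = round(total_items * host_rate / (host_rate + accel_rate))
    return host_items, total_items - host_items


if __name__ == "__main__":
    # Hypothetical rates: the accelerator is nine times faster than the host.
    print(balanced_split(1000, host_rate=1.0, accel_rate=9.0))
```

Under these assumptions, a host-only-for-control approach leaves the host share of the work unused, while the proportional split shortens the run by the factor accel_rate / (host_rate + accel_rate).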

We present a systematic derivation of relativistic lattice kinetic equations for finite-mass particles, reaching close to the zero-mass ultra-relativistic regime treated in the previous literature. Starting from an expansion of the Maxwell-Juettner distribution on orthogonal polynomials, we perform a Gauss-type quadrature procedure and discretize the relativistic Boltzmann equation on space-filling Cartesian lattices. The model is validated through numerical comparison with standard benchmark tests and solvers in relativistic fluid dynamics such as Boltzmann approach multiparton scattering (BAMPS) and previous relativistic lattice Boltzmann models. Read More

Energy efficiency is becoming increasingly important for computing systems, in particular for large-scale HPC facilities. In this work we evaluate, from a user perspective, the use of Dynamic Voltage and Frequency Scaling (DVFS) techniques, assisted by the power and energy monitoring capabilities of modern processors, in order to tune applications for energy efficiency. We run selected kernels and a full HPC application on two high-end processors widely used in the HPC context, namely an NVIDIA K80 GPU and an Intel Haswell CPU. Read More

This paper describes a massively parallel code for a state-of-the-art thermal lattice-Boltzmann method. Our code has been carefully optimized for performance on one GPU and for good scaling behavior extending to a large number of GPUs. Versions of this code have already been used for large-scale studies of convective turbulence. Read More

An increasingly large number of HPC systems rely on heterogeneous architectures combining traditional multi-core CPUs with power-efficient accelerators. Designing efficient applications for these systems has been troublesome in the past, as accelerators could usually be programmed only with specific programming languages, threatening maintainability, portability and correctness. Several new programming environments try to tackle this problem. Read More

The present panorama of HPC architectures is extremely heterogeneous, ranging from traditional multi-core CPU processors, supporting a wide class of applications but delivering moderate computing performance, to many-core GPUs, exploiting aggressive data-parallelism and delivering higher performance for streaming computing applications. In this scenario, code portability (and performance portability) becomes necessary for easy maintainability of applications; this is very relevant in scientific computing, where code changes are very frequent, making it tedious and error-prone to keep different code versions aligned. In this work we present the design and optimization of a state-of-the-art production-level LQCD Monte Carlo application, using the directive-based OpenACC programming model. Read More

We study the turbulent evolution originating from a system subjected to a Rayleigh-Taylor instability with a double density at high resolution in a two-dimensional geometry using a highly optimized thermal Lattice Boltzmann code for GPUs. The novelty of our investigation stems from the initial condition, given by the superposition of three layers with three different densities, leading to the development of two Rayleigh-Taylor fronts that expand upward and downward and collide in the middle of the cell. By using high-resolution numerical data we highlight the effects induced by the collision of the two turbulent fronts in the long-time asymptotic regime. Read More

We perform equilibrium parallel-tempering simulations of the 3D Ising Edwards-Anderson spin glass in a field. A traditional analysis shows no signs of a phase transition. Yet, we encounter dramatic fluctuations in the behaviour of the model: Averages over all the data only describe the behaviour of a small fraction of it. Read More

We present a new numerical Monte Carlo approach to determine the scaling behavior of lattice field theories far from equilibrium. The presented methods are generally applicable to systems where classical-statistical fluctuations dominate the dynamics. As an example, these methods are applied to the random-force-driven one-dimensional Burgers' equation - a model for hydrodynamic turbulence. Read More

We report a high-precision finite-size scaling study of the critical behavior of the three-dimensional Ising Edwards-Anderson model (the Ising spin glass). We have thermalized lattices up to L=40 using the Janus dedicated computer. Our analysis takes into account leading-order corrections to scaling. Read More

This paper describes the architecture, the development and the implementation of Janus II, a new generation application-driven number cruncher optimized for Monte Carlo simulations of spin systems (mainly spin glasses). This domain of computational physics is a recognized grand challenge of high-performance computing: the resources necessary to study in detail theoretical models that can make contact with experimental data are by far beyond those available using commodity computer systems. On the other hand, several specific features of the associated algorithms suggest that unconventional computer architectures, which can be implemented with available electronics technologies, may lead to order of magnitude increases in performance, reducing to acceptable values on human scales the time needed to carry out simulation campaigns that would take centuries on commercially available machines. Read More

We study the off-equilibrium dynamics of the three-dimensional Ising spin glass in the presence of an external magnetic field. We have performed simulations both at fixed temperature and with an annealing protocol. Thanks to the Janus special-purpose computer, based on FPGAs, we have been able to reach times equivalent to 0. Read More

We describe Janus, a massively parallel FPGA-based computer optimized for the simulation of spin glasses, theoretical models for the behavior of glassy materials. FPGAs (as compared to GPUs or many-core processors) provide a complementary approach to massively parallel computing. In particular, our model problem is formulated in terms of binary variables, and floating-point operations can be (almost) completely avoided. Read More
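
To illustrate why binary variables let the inner loop avoid floating point, here is a minimal sketch of a Metropolis sweep for a three-dimensional Edwards-Anderson model with couplings J = +1 or -1. It is not the Janus firmware or its algorithm: the function names, the lattice size, and the choice of Metropolis updates with a 32-bit integer acceptance table are assumptions made for illustration. Floating point is used once, to build the table; the update loop uses only integers.

```python
import math
import random


def build_bonds(L, rng):
    """Return bonds[i] = list of (neighbor, J) on an L^3 periodic lattice."""
    bonds = [[] for _ in range(L ** 3)]
    for z in range(L):
        for y in range(L):
            for x in range(L):
                i = x + L * (y + L * z)
                for dx, dy, dz in ((1, 0, 0), (0, 1, 0), (0, 0, 1)):
                    j = (x + dx) % L + L * ((y + dy) % L + L * ((z + dz) % L))
                    J = rng.choice((-1, 1))  # quenched random coupling
                    bonds[i].append((j, J))
                    bonds[j].append((i, J))
    return bonds


def acceptance_table(beta):
    """Integer thresholds for the only possible energy increases (4, 8, 12)."""
    return {dE: int(math.exp(-beta * dE) * 2 ** 32) for dE in (4, 8, 12)}


def energy(spins, bonds):
    """Total energy -sum J s_i s_j; each bond is stored twice, hence the // 2."""
    total = sum(s * sum(J * spins[j] for j, J in bonds[i])
                for i, s in enumerate(spins))
    return -(total // 2)


def sweep(spins, bonds, table, rng):
    """One Metropolis sweep using integer arithmetic only."""
    for i in range(len(spins)):
        h = sum(J * spins[j] for j, J in bonds[i])  # integer local field
        dE = 2 * spins[i] * h                       # integer energy change
        if dE <= 0 or rng.getrandbits(32) < table[dE]:
            spins[i] = -spins[i]


if __name__ == "__main__":
    rng = random.Random(0)
    bonds = build_bonds(4, rng)
    spins = [rng.choice((-1, 1)) for _ in range(4 ** 3)]
    table = acceptance_table(beta=1.0)
    for _ in range(100):
        sweep(spins, bonds, table, rng)
    print(energy(spins, bonds))
```

Because each site has six neighbors and every term is +1 or -1, the energy change takes only a handful of integer values, which is what makes a small lookup table sufficient.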

Spin glasses are a longstanding model for the sluggish dynamics that appears at the glass transition. However, spin glasses differ from structural glasses in a crucial feature: they enjoy a time-reversal symmetry. This symmetry can be broken by applying an external magnetic field, but embarrassingly little is known about the critical behaviour of a spin glass in a field. Read More

The OPERA Collaboration reported evidence for muonic neutrinos traveling slightly faster than light in vacuum. While awaiting further checks from the experimental community, here we aim at exploring some theoretical consequences of the hypothesis that muonic neutrinos are superluminal, considering in particular the tachyonic and the Coleman-Glashow cases. We show that a tachyonic interpretation is not only hard to reconcile with OPERA data on energy dependence, but also clashes with neutrino production from pions and with neutrino oscillations. Read More

We study the sample-to-sample fluctuations of the overlap probability densities from large-scale equilibrium simulations of the three-dimensional Edwards-Anderson spin glass below the critical temperature. Ultrametricity, Stochastic Stability and Overlap Equivalence impose constraints on the moments of the overlap probability densities that can be tested against numerical data. We found small deviations from the Ghirlanda-Guerra predictions, which get smaller as system size increases. Read More

Reactive Rayleigh-Taylor systems are characterized by the competition between the growth of the instability and the rate of reaction between cold (heavy) and hot (light) phases. We present results from state-of-the-art numerical simulations performed at high resolution in 2d by means of a self-consistent lattice Boltzmann method which evolves the coupled momentum and thermal equations and includes a reactive term. We tune the parameters affecting flame properties, in order to address the competition between turbulent mixing and reaction, ranging from slow to fast-reaction rates. Read More

The parameterization of small-scale turbulent fluctuations in convective systems and in the presence of strong stratification is a key issue for many applied problems in oceanography, atmospheric science and planetology. In the presence of stratification, one needs to cope with bulk turbulent fluctuations and with inversion regions, where temperature, density, or both develop highly non-linear mean profiles due to the interactions between the turbulent boundary layer and the unmixed (stable) flow above/below it. We present a second-order closure able to cope simultaneously with both bulk and boundary layer regions, and we test it against high-resolution state-of-the-art 2D numerical simulations in a convective and stratified belt for values of the Rayleigh number up to Ra = 10^9. Read More

We present results of a high resolution numerical study of two dimensional (2d) Rayleigh-Taylor turbulence using a recently proposed thermal lattice Boltzmann method (LBT). The goal of our study is both methodological and physical. We assess merits and limitations concerning small- and large-scale resolution/accuracy of the adopted integration scheme. Read More

We numerically study the aging properties of the dynamical heterogeneities in the Ising spin glass. We find that a phase transition takes place during the aging process. Statics-dynamics correspondence implies that systems of finite size in equilibrium have static heterogeneities that obey Finite-Size Scaling, thus signaling an analogous phase transition in the thermodynamical limit. Read More

We study the 3D Disordered Potts Model with p=5 and p=6. Our numerical simulations (that severely slow down for increasing p) detect a very clear spin glass phase transition. We evaluate the critical exponents and the critical value of the temperature, and we use known results at lower p values to discuss how they evolve for increasing p. Read More

Affiliations: 1-3 INFN Frascati, 4-14 Univ. and INFN of Pisa, 15-23 Univ. of Chicago, 24-26 Univ. of Illinois at Urbana-Champaign, 27-28 Harvard Univ, 29-30 Waseda University, 31-32 Argonne National Lab, 33 Univ. and INFN Ferrara

We describe the architecture evolution of the highly-parallel dedicated processor FTK, which is driven by the simulation of LHC events at high luminosity (10^34 cm^-2 s^-1). FTK is able to provide precise on-line track reconstruction for future hadronic collider experiments. The processor, organized in a two-tiered pipelined architecture, executes very fast algorithms based on the use of a large bank of pre-stored patterns of trajectory points (first tier) in combination with full resolution track fitting to refine pattern recognition and to determine off-line quality track parameters. Read More

QPACE is a novel parallel computer which has been developed to be primarily used for lattice QCD simulations. The compute power is provided by the IBM PowerXCell 8i processor, an enhanced version of the Cell processor that is used in the Playstation 3. The QPACE nodes are interconnected by a custom, application optimized 3-dimensional torus network implemented on an FPGA. Read More

We perform numerical simulations, including parallel tempering, on the Potts glass model with binary random quenched couplings using the JANUS application-oriented computer. We find and characterize a glassy transition, estimating the location of the transition and the value of the critical exponents. We show that there is no ferromagnetic transition in a large temperature range around the glassy critical temperature. Read More

Using the dedicated computer Janus, we follow the nonequilibrium dynamics of the Ising spin glass in three dimensions for eleven orders of magnitude. The use of integral estimators for the coherence and correlation lengths allows us to study dynamic heterogeneities and the presence of a replicon mode and to obtain safe bounds on the Edwards-Anderson order parameter below the critical temperature. We obtain good agreement with experimental determinations of the temperature-dependent decay exponents for the thermoremanent magnetization. Read More

We give an overview of the QPACE project, which is pursuing the development of a massively parallel, scalable supercomputer for LQCD. The machine is a three-dimensional torus of identical processing nodes, based on the PowerXCell 8i processor. The nodes are connected by an FPGA-based, application-optimized network processor attached to the PowerXCell 8i processor. Read More

Affiliations: 1-21 the Janus collaboration

We study numerically the nonequilibrium dynamics of the Ising Spin Glass, for a time that spans eleven orders of magnitude, thus approaching the experimentally relevant scale (i.e. seconds). Read More

This paper describes JANUS, a modular massively parallel and reconfigurable FPGA-based computing system. Each JANUS module has a computational core and a host. The computational core is a 4x4 array of FPGA-based processing elements with nearest-neighbor data links. Read More

We evaluate IBM's Enhanced Cell Broadband Engine (BE) as a possible building block of a new generation of lattice QCD machines. The Enhanced Cell BE will provide full support of double-precision floating-point arithmetics, including IEEE-compliant rounding. We have developed a performance model and applied it to relevant lattice QCD kernels. Read More

We describe the hardwired implementation of algorithms for Monte Carlo simulations of a large class of spin models. We have implemented these algorithms as VHDL codes and we have mapped them onto a dedicated processor based on a large FPGA device. The measured performance on one such processor is comparable to O(100) carefully programmed high-end PCs: it turns out to be even better for some selected spin models. Read More

Dedicated machines designed for specific computational algorithms can outperform conventional computers by several orders of magnitude. In this note we describe Ianus, a new generation FPGA based machine and its basic features: hardware integration and wide reprogrammability. Our goal is to build a machine that can fully exploit the performance potential of new generation FPGA devices. Read More

The Rayleigh (Ra) and Prandtl (Pr) number scaling of the Nusselt number Nu, the Reynolds number Re, the temperature fluctuations, and the kinetic and thermal dissipation rates is studied for (numerical) homogeneous Rayleigh-Benard turbulence, i.e., Rayleigh-Benard turbulence with periodic boundary conditions in all directions and a volume forcing of the temperature field by a mean gradient. Read More

We present the APE (Array Processor Experiment) project for the development of dedicated parallel computers for numerical simulations in lattice gauge theories. While APEmille is a production machine in today's physics simulations at various sites in Europe, a new machine, apeNEXT, is currently being developed to provide multi-Tflops computing performance. Like previous APE machines, the new supercomputer is largely custom designed and specifically optimized for simulations of Lattice QCD. Read More

We present the current status of the apeNEXT project. The aim of this project is the development of the next generation of APE machines, which will provide multi-teraflop computing power. Like previous machines, apeNEXT is based on a custom designed processor, which is specifically optimized for simulating QCD. Read More

We present new results from a direct numerical simulation of a three dimensional homogeneous Rayleigh-Benard system (HRB), i.e. a convective cell with an imposed linear mean temperature profile along the vertical direction. Read More

We present the current status of the apeNEXT project. The aim of this project is the development of the next generation of APE machines, which will provide multi-teraflop computing power. Like previous machines, apeNEXT is based on a custom designed processor, which is specifically optimized for simulating QCD. Read More

We discuss some computational problems associated with matched filtering of experimental signals from gravitational wave interferometric detectors in a parallel-processing environment. We then specialize our discussion to the use of the APEmille and apeNEXT processors for this task. Finally, we accurately estimate the performance of an APEmille system on a computational load appropriate for the LIGO and VIRGO experiments, and extrapolate our results to apeNEXT. Read More

We present new results from high-resolution high-statistics direct numerical simulations of a three-dimensional convective cell. We test the fundamental physical picture of the presence of both a Bolgiano-like and a Kolmogorov-like regime. We find that the dimensional predictions for these two distinct regimes (characterized respectively by an active and passive role of the temperature field) are consistent with our measurements. Read More

APENEXT is a new generation APE processor, optimized for LGT simulations. The project follows the basic ideas of previous APE machines and develops simple and cheap parallel systems with multi T-Flops processing power. This paper describes the main features of this new development. Read More

This paper presents the status of the APEmille project, which is essentially completed as far as machine development and construction are concerned. Several large installations of APEmille are in use for physics production runs, leading to many new results presented at this conference. This paper briefly summarizes the APEmille architecture, reviews the status of the installations and presents some performance figures for physics codes. Read More

In this paper we discuss some theoretical aspects concerning the scaling laws of the Nusselt number versus the Rayleigh number in a Rayleigh-Benard cell. We present a new set of numerical simulations and compare our findings against the predictions of existing models. We then propose a new theory which relies on the hypothesis of Bolgiano scaling. Read More

We present a parallel FFT algorithm for SIMD systems following the 'Transpose Algorithm' approach. The method is based on the assignment of the data field onto a 1-dimensional ring of systolic cells. The systolic array can be universally mapped onto any parallel system. Read More
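
For readers unfamiliar with transpose-style FFTs, the following serial sketch shows the generic decomposition of a length N = n1 * n2 transform into short transforms separated by a twiddle multiplication and a transpose. It is not the paper's systolic-ring mapping: the function names are hypothetical, and a naive DFT stands in for the short transforms to keep the example self-contained. In a parallel setting the rows would live on different processing elements and the transpose would be the communication step.

```python
import cmath


def dft(v):
    """Naive O(n^2) discrete Fourier transform, used for the short transforms."""
    n = len(v)
    return [sum(v[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]


def transpose_fft(x, n1, n2):
    """DFT of x (length n1 * n2) via row transforms, twiddles, and a transpose."""
    n = n1 * n2
    if len(x) != n:
        raise ValueError("len(x) must equal n1 * n2")
    # Step 1: row a holds x[a], x[a + n1], ...; transform each row (length n2).
    rows = [dft(x[a::n1]) for a in range(n1)]
    # Step 2: multiply element (a, k2) by the twiddle factor exp(-2*pi*i*a*k2/n).
    rows = [[rows[a][k2] * cmath.exp(-2j * cmath.pi * a * k2 / n)
             for k2 in range(n2)] for a in range(n1)]
    # Step 3: transpose so that each former column becomes a contiguous row.
    cols = [list(c) for c in zip(*rows)]
    # Step 4: transform each of the n2 rows (length n1).
    cols = [dft(c) for c in cols]
    # Output index is k = n2 * k1 + k2.
    out = [0j] * n
    for k2 in range(n2):
        for k1 in range(n1):
            out[n2 * k1 + k2] = cols[k2][k1]
    return out


if __name__ == "__main__":
    x = [complex(i % 5, -i) for i in range(12)]
    err = max(abs(a - b) for a, b in zip(transpose_fft(x, 4, 3), dft(x)))
    print(err < 1e-9)
```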

We report on the progress and status of the APEmille project: a SIMD parallel computer with a peak performance in the TeraFlops range which is now in an advanced development phase. We discuss the hardware and software architecture, and present some performance estimates for Lattice Gauge Theory (LGT) applications. Read More

A class of dynamical models of turbulence living on a one-dimensional dyadic-tree structure is introduced and studied. The models are obtained as a natural generalization of the popular GOY shell model of turbulence. These models are found to be chaotic and intermittent. Read More

In this paper we report numerical and experimental results on the scaling properties of the velocity turbulent fields in several flows. The limits of a new form of scaling, named Extended Self Similarity (ESS), are discussed. We show that, when a mean shear is absent, the self scaling exponents are universal and they do not depend on the specific flow (3D homogeneous turbulence, thermal convection, MHD). Read More
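
For context, ESS is usually stated in terms of velocity structure functions. The notation below (S_p, \zeta_p, \delta v) is assumed here for illustration and does not appear in the abstract: ordinary scaling relates each structure function to the separation r, while ESS relates structure functions of different order to one another, typically using the third-order one as reference.

```latex
S_p(r) = \left\langle \left| \delta v(r) \right|^p \right\rangle ,
\qquad
S_p(r) \propto r^{\zeta_p} \quad \text{(inertial range)} ,
\qquad
S_p(r) \propto \left[ S_3(r) \right]^{\zeta_p / \zeta_3} \quad \text{(ESS)} .
```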

Using a code based on the Lattice Boltzmann Equation, we have performed numerical simulations of a turbulent shear flow. We investigate the scaling behaviour of the structure functions in presence of anisotropic homogeneous turbulence, and we show that although Extended Self Similarity does not hold when strong shear effects are present, a more generalized scaling law can still be defined. Read More

We discuss a possible theoretical interpretation of the self scaling property of turbulent flows (Extended Self Similarity). Our interpretation predicts that, even in cases when ESS is not observed, a generalized self scaling must be observed. This prediction is checked on a number of laboratory experiments and direct numerical simulations. Read More

In this letter we present numerical and experimental results on the scaling properties of velocity turbulent fields in the range of scales where viscous effects are acting. A generalized version of Extended Self Similarity capable of describing scaling laws of the velocity structure functions down to the smallest resolvable scales is introduced. Our findings suggest the absence of any sharp viscous cutoff in the intermittent transfer of energy. Read More