Computer Science - Numerical Analysis Publications (50)


Computer Science - Numerical Analysis Publications

In this paper we consider a distributed optimization scenario in which the aggregate objective function to minimize is partitioned, big-data and possibly non-convex. Specifically, we focus on a set-up in which the dimension of the decision variable depends on the network size as well as the number of local functions, but each local function handled by a node depends only on a (small) portion of the entire optimization variable. This problem set-up has been shown to appear in many interesting network application scenarios. Read More

Motivated by economic dispatch and linearly-constrained resource allocation problems, this paper proposes a novel Distributed Approx-Newton algorithm that approximates the standard Newton optimization method. A main property of this distributed algorithm is that it only requires agents to exchange constant-size communication messages. The convergence of this algorithm is discussed and rigorously analyzed. Read More

We study the effect of adaptive mesh refinement on a parallel domain decomposition solver of a linear system of algebraic equations. These concepts need to be combined within a parallel adaptive finite element software. A prototype implementation is presented for this purpose. Read More

Kernel quadratures and other kernel-based approximation methods typically suffer from prohibitive cubic time and quadratic space complexity in the number of function evaluations. The problem arises because a system of linear equations needs to be solved. In this article we show that the weights of a kernel quadrature rule can be computed efficiently and exactly for up to tens of millions of nodes if the kernel, integration domain, and measure are fully symmetric and the node set is a union of fully symmetric sets. Read More

Singular values of a data in a matrix form provide insights on the structure of the data, the effective dimensionality, and the choice of hyper-parameters on higher-level data analysis tools. However, in many practical applications such as collaborative filtering and network analysis, we only get a partial observation. Under such scenarios, we consider the fundamental problem of recovering spectral properties of the underlying matrix from a sampling of its entries. Read More

A novel and scalable geometric multi-level algorithm is presented for the numerical solution of elliptic partial differential equations, specially designed to run with high occupancy of streaming processors inside Graphics Processing Units(GPUs). The algorithm consists of iterative, superposed operations on a single grid, and it is composed of two simple full-grid routines: a restriction and a coarsened interpolation-relaxation. The restriction is used to collect sources using recursive coarsened averages, and the interpolation-relaxation simultaneously applies coarsened finite-difference operators and interpolations. Read More

Matrix and tensor completion aim to recover a low-rank matrix / tensor from limited observations and have been commonly used in applications such as recommender systems and multi-relational data mining. A state-of-the-art matrix completion algorithm is Soft-Impute, which exploits the special "sparse plus low-rank" structure of the matrix iterates to allow efficient SVD in each iteration. Though Soft-Impute is a proximal algorithm, it is generally believed that acceleration destroys the special structure and is thus not useful. Read More

Stochastic variance reduction algorithms have recently become popular for minimizing the average of a large, but finite number of loss functions. The present paper proposes a Riemannian stochastic quasi-Newton algorithm with variance reduction (R-SQN-VR). The key challenges of averaging, adding, and subtracting multiple gradients are addressed with notions of retraction and vector transport. Read More

This work proposes a space-time least-squares Petrov-Galerkin (ST-LSPG) projection method for model reduction of nonlinear dynamical systems. In contrast to typical nonlinear model-reduction methods that first apply (Petrov-)Galerkin projection in the spatial dimension and subsequently apply time integration to numerically resolve the resulting low-dimensional dynamical system, the proposed method applies projection in space and time simultaneously. To accomplish this, the method first introduces a low-dimensional space-time trial subspace, which can be obtained by computing tensor decompositions of state-snapshot data. Read More

In exploratory tensor mining, a common problem is how to analyze a set of variables across a set of subjects whose observations do not align naturally. For example, when modeling medical features across a set of patients, the number and duration of treatments may vary widely in time, meaning there is no meaningful way to align their clinical records across time points for analysis purposes. To handle such data, the state-of-the-art tensor model is the so-called PARAFAC2, which yields interpretable and robust output and can naturally handle sparse data. Read More

In this letter, we propose an algorithm for recovery of sparse and low rank components of matrices using an iterative method with adaptive thresholding. In each iteration, the low rank and sparse components are obtained using a thresholding operator. This algorithm is fast and can be implemented easily. Read More

Approximate computing has shown to provide new ways to improve performance and power consumption of error-resilient applications. While many of these applications can be found in image processing, data classification or machine learning, we demonstrate its suitability to a problem from scientific computing. Utilizing the self-correcting behavior of iterative algorithms, we show that approximate computing can be applied to the calculation of inverse matrix p-th roots which are required in many applications in scientific computing. Read More

This research investigates the implementation mechanism of block-wise ILU(k) preconditioner on GPU. The block-wise ILU(k) algorithm requires both the level k and the block size to be designed as variables. A decoupled ILU(k) algorithm consists of a symbolic phase and a factorization phase. Read More

In this paper, we present a parallel numerical algorithm for solving the phase field crystal equation. In the algorithm, a semi-implicit finite difference scheme is derived based on the discrete variational derivative method. Theoretical analysis is provided to show that the scheme is unconditionally energy stable and can achieve second-order accuracy in both space and time. Read More

Bayesian matrix factorization (BMF) is a powerful tool for producing low-rank representations of matrices, and giving principled predictions of missing values. However, scaling up MCMC samplers to large matrices has proven to be difficult with parallel algorithms that require communication between MCMC iterations. On the other hand, designing communication-free algorithms is challenging due to the inherent unidentifiability of BMF solutions. Read More

In this paper, we introduce and provide a short overview of nonnegative matrix factorization (NMF). Several aspects of NMF are discussed, namely, the application in hyperspectral imaging, geometry and uniqueness of NMF solutions, complexity, algorithms, and its link with extended formulations of polyhedra. In order to put NMF into perspective, the more general problem class of constrained low-rank matrix approximation problems is first briefly introduced. Read More

The FLAME methodology makes it possible to derive provably correct algorithms from a formal description of a linear algebra problem. So far, the methodology has been successfully used to automate the derivation of direct algorithms such as the Cholesky decomposition and the solution of Sylvester equations. In this thesis, we present an extension of the FLAME methodology to tackle iterative methods such as Conjugate Gradient. Read More

We solve tensor balancing, rescaling an Nth order nonnegative tensor by multiplying (N - 1)th order N tensors so that every fiber sums to one. This generalizes a fundamental process of matrix balancing used to compare matrices in a wide range of applications from biology to economics. We present an efficient balancing algorithm with quadratic convergence using Newton's method and show in numerical experiments that the proposed algorithm is several orders of magnitude faster than existing ones. Read More

Many machine learning models are reformulated as optimization problems. Thus, it is important to solve a large-scale optimization problem in big data applications. Recently, subsampled Newton methods have emerged to attract much attention for optimization due to their efficiency at each iteration, rectified a weakness in the ordinary Newton method of suffering a high cost in each iteration while commanding a high convergence rate. Read More

We develop and analyze efficient "coordinate-wise" methods for finding the leading eigenvector, where each step involves only a vector-vector product. We establish global convergence with overall runtime guarantees that are at least as good as Lanczos's method and dominate it for slowly decaying spectrum. Our methods are based on combining a shift-and-invert approach with coordinate-wise algorithms for linear regression. Read More

We describe stochastic Newton and stochastic quasi-Newton approaches to efficiently solve large linear least-squares problems where the very large data sets present a significant computational burden (e.g., the size may exceed computer memory or data are collected in real-time). Read More

A new simple convergence acceleration method is proposed for a certain wide range class of convergent alternating series. The method has some common features with Smith's and Ford's modification of Levin's and Weniger's sequence transformations, but it is computationally less expensive. The similarities and differences between all three methods are analyzed and some common theoretical results are given. Read More

In an effort to increase the versatility of finite element codes, we explore the possibility of automatically creating the Jacobian matrix necessary for the gradient-based solution of nonlinear systems of equations. Particularly, we aim to assess the feasibility of employing the automatic differentiation tool TAPENADE for this purpose on a large Fortran codebase that is the result of many years of continuous development. As a starting point we will describe the special structure of finite element codes and the implications that this code design carries for an efficient calculation of the Jacobian matrix. Read More

Boolean matrix factorisation aims to decompose a binary data matrix into an approximate Boolean product of two low rank, binary matrices: one containing meaningful patterns, the other quantifying how the observations can be expressed as a combination of these patterns. We introduce the OrMachine, a probabilistic generative model for Boolean matrix factorisation and derive a Metropolised Gibbs sampler that facilitates efficient parallel posterior inference. On real world and simulated data, our method outperforms all currently existing approaches for Boolean matrix factorisation and completion. Read More

Block Coordinate Update (BCU) methods enjoy low per-update computational complexity because every time only one or a few block variables would need to be updated among possibly a large number of blocks. They are also easily parallelized and thus have been particularly popular for solving problems involving large-scale dataset and/or variables. In this paper, we propose a primal-dual BCU method for solving linearly constrained convex program in multi-block variables. Read More

We address the statistical and optimization impacts of using classical or Hessian sketch to approximately solve the Matrix Ridge Regression (MRR) problem. Prior research has considered the effects of classical sketch on least squares regression (LSR), a strictly simpler problem. We establish that classical sketch has a similar effect upon the optimization properties of MRR as it does on those of LSR---namely, it recovers nearly optimal solutions. Read More

For a wide variety of problems, creating detailed continuous models of (continuous) physical systems is, at the very least, impractical. Hybrid models can abstract away short transient behaviour (thus introducing discontinuities) in order to simplify the study of such systems. For example, when modelling a bouncing ball, the bounce can be abstracted as a discontinuous change of the velocity, instead of resorting to the physics of the ball (de-)compression to keep the velocity signal continuous. Read More

With the huge influx of various data nowadays, extracting knowledge from them has become an interesting but tedious task among data scientists, particularly when the data come in heterogeneous form and have missing information. Many data completion techniques had been introduced, especially in the advent of kernel methods. However, among the many data completion techniques available in the literature, studies about mutually completing several incomplete kernel matrices have not been given much attention yet. Read More

The emergent field of probabilistic numerics has thus far lacked rigorous statistical principals. This paper establishes Bayesian probabilistic numerical methods as those which can be cast as solutions to certain Bayesian inverse problems, albeit problems that are non-standard. This allows us to establish general conditions under which Bayesian probabilistic numerical methods are well-defined, encompassing both non-linear and non-Gaussian models. Read More

This paper presents a randomized algorithm for computing the near-optimal low-rank dynamic mode decomposition (DMD). Randomized algorithms are emerging techniques to compute low-rank matrix approximations. They are able to ease the computational challenges arising in the area of big data. Read More

Three complementary methods have been implemented in the code Denovo that accelerate neutral particle transport calculations with methods that use leadership-class computers fully and effectively: a multigroup block (MG) Krylov solver, a Rayleigh quotient iteration (RQI) eigenvalue solver, and a multigrid in energy preconditioner. The multigroup Krylov solver converges more quickly than Gauss Seidel and enables energy decomposition such that Denovo can scale to hundreds of thousands of cores. The new multigrid in energy preconditioner reduces iteration count for many problem types and takes advantage of the new energy decomposition such that it can scale efficiently. Read More

Lower bounds on the smallest eigenvalue of a symmetric positive definite matrices $A\in\mathbb{R}^{m\times m}$ play an important role in condition number estimation and in iterative methods for singular value computation. In particular, the bounds based on ${\rm Tr}(A^{-1})$ and ${\rm Tr}(A^{-2})$ attract attention recently because they can be computed in $O(m)$ work when $A$ is tridiagonal. In this paper, we focus on these bounds and investigate their properties in detail. Read More

This paper proposes an efficient method for computing selected generalized eigenpairs of a sparse Hermitian definite matrix pencil $(A,B)$. Based on Zolotarev's best rational function approximations of the signum function and conformal maps, we construct the best rational function approximation of a rectangular function supported on an arbitrary interval. This new best rational function approximation is applied to construct spectrum filters of $(A,B)$. Read More

This document contains the notes of a lecture I gave at the "Journ\'ees Nationales du Calcul Formel" (JNCF) on January 2017. The aim of the lecture was to discuss low-level algorithmics for p-adic numbers. It is divided into two main parts: first, we present various implementations of p-adic numbers and compare them and second, we introduce a general framework for studying precision issues and apply it in several concrete situations. Read More

The CANDECOMP/PARAFAC (CP) decomposition is a leading method for the analysis of multiway data. The standard alternating least squares algorithm for the CP decomposition (CP-ALS) involves a series of highly overdetermined linear least squares problems. We extend randomized least squares methods to tensors and show the workload of CP-ALS can be drastically reduced without a sacrifice in quality. Read More

Affiliations: 1Virginia Tech, 2Virginia Tech, 3Virginia Tech, 4University of California, Merced

Exponential integrators are special time discretization methods where the traditional linear system solves used by implicit schemes are replaced with computing the action of matrix exponential-like functions on a vector. A very general formulation of exponential integrators is offered by the Exponential Propagation Iterative methods of Runge-Kutta type (EPIRK) family of schemes. The use of Jacobian approximations is an important strategy to drastically reduce the overall computational costs of implicit schemes while maintaining the quality of their solutions. Read More

In this paper a new hp-adaptive strategy for elliptic problems based on refinement history is proposed, which chooses h-, p- or hp-refinement on individual elements according to a posteriori error estimate, as well as smoothness estimate of the solution obtained by comparing the actual and expected error reduction rate. Numerical experiments show that exponential convergence can be achieved with this strategy. Read More

Implicit schemes have been extensively used in building physics to compute the solution of moisture diffusion problems in porous materials for improving stability conditions. Nevertheless, these schemes require important sub-iterations when treating non-linear problems. To overcome this disadvantage, this paper explores the use of improved explicit schemes, such as Dufort-Frankel, Crank-Nicolson and hyperbolisation approaches. Read More

High order cumulants carry information about statistics of non--normally distributed multivariate data. Such cumulants are utilised in extreme events analysis, small target detection or outliers detection. In this work we present a new algorithm,for updating high order cumulant tensors of random multivariate data, if new package of data is recorded. Read More

In this paper we introduce a novel algorithm of calculating arbitrary order cumulants of multidimensional data. Since the n th order cumulant can be presented in the form of an n-dimensional tensor, the algorithm is presented using the tensor network notation. The presented algorithm exploits the super--symmetry of cumulant and moment tensors. Read More

This paper develops meshless methods for probabilistically describing discretisation error in the numerical solution of partial differential equations. This construction enables the solution of Bayesian inverse problems while accounting for the impact of the discretisation of the forward problem. In particular, this drives statistical inferences to be more conservative in the presence of significant solver error. Read More

On modern large-scale parallel computers, the performance of Krylov subspace iterative methods is limited by global synchronization. This has inspired the development of $s$-step Krylov subspace method variants, in which iterations are computed in blocks of $s$, which can reduce the number of global synchronizations per iteration by a factor of $O(s)$. Although the $s$-step variants are mathematically equivalent to their classical counterparts, they can behave quite differently in finite precision depending on the parameter $s$. Read More

This paper introduces multivariate input-output models to predict the errors and bases dimensions of local parametric Proper Orthogonal Decomposition reduced-order models. We refer to these multivariate mappings as the MP-LROM models. We employ Gaussian Processes and Artificial Neural Networks to construct approximations of these multivariate mappings. Read More

A fully (pseudo-)spectral solver for direct numerical simulations of large-scale turbulent channel flows is described. The solver utilizes the Chebyshev base functions suggested by J. Shen [SIAM J. Read More

In this work, we propose two-level space-time domain decomposition preconditioners for parabolic problems discretized using finite elements. They are motivated as an extension to space-time of balancing domain decomposition by constraints preconditioners. The key ingredients to be defined are the sub-assembled space and operator, the coarse degrees of freedom (DOFs) in which we want to enforce continuity among subdomains at the preconditioner level, and the transfer operator from the sub-assembled to the original finite element space. Read More

This work proposes a machine-learning-based framework for estimating the error introduced by surrogate models of parameterized dynamical systems. The framework applies high-dimensional regression techniques (e.g. Read More

Kernel matrices appear in machine learning and non-parametric statistics. Given $N$ points in $d$ dimensions and a kernel function that requires $\mathcal{O}(d)$ work to evaluate, we present an $\mathcal{O}(dN\log N)$-work algorithm for the approximate factorization of a regularized kernel matrix, a common computational bottleneck in the training phase of a learning task. With this factorization, solving a linear system with a kernel matrix can be done with $\mathcal{O}(N\log N)$ work. Read More

Differential (Ore) type polynomials with "approximate" polynomial coefficients are introduced. These provide an effective notion of approximate differential operators, with a strong algebraic structure. We introduce the approximate Greatest Common Right Divisor Problem (GCRD) of differential polynomials, as a non-commutative generalization of the well-studied approximate GCD problem. Read More

Design of filters for graph signal processing benefits from knowledge of the spectral decomposition of matrices that encode graphs, such as the adjacency matrix and the Laplacian matrix, used to define the shift operator. For shift matrices with real eigenvalues, which arise for symmetric graphs, the empirical spectral distribution captures the eigenvalue locations. Under realistic circumstances, stochastic influences often affect the network structure and, consequently, the shift matrix empirical spectral distribution. Read More