Cost-oblivious storage reallocation

Databases need to allocate and free blocks of storage on disk. Freed blocks introduce holes where no data is stored. Allocation systems attempt to reuse such deallocated regions in order to minimize the footprint on disk. If previously allocated blocks cannot be moved, the problem is called the memory allocation problem, which is known to have a logarithmic overhead in the footprint. This paper defines the storage reallocation problem, where previously allocated blocks can be moved, or reallocated, but at some cost. The algorithms presented here are cost oblivious, in that they work for a broad and reasonable class of cost functions, even when they do not know what the cost function is. The objective is to minimize the storage footprint, that is, the largest memory address containing an allocated object, while simultaneously minimizing the reallocation costs. This paper gives asymptotically optimal algorithms for storage reallocation, in which the storage footprint is at most (1+epsilon) times optimal, and the reallocation cost is at most (1/epsilon) times the original allocation cost, which is also optimal. The algorithms are cost oblivious as long as the allocation/reallocation cost function is subadditive.
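To make the two quantities in the abstract concrete, here is a minimal sketch in Python. It is not the paper's algorithm: it performs a naive full compaction, only to show how the footprint and the total reallocation cost are measured, and how one might spot-check that a cost function is subadditive. The layout representation and the names `footprint`, `compact`, and `is_subadditive` are assumptions made for this illustration.

```python
from typing import Callable, Dict, Tuple

# Hypothetical representation: block name -> (start address, size).
Layout = Dict[str, Tuple[int, int]]


def footprint(layout: Layout) -> int:
    """End address of the last allocated block (0 for an empty layout)."""
    return max((start + size for start, size in layout.values()), default=0)


def compact(layout: Layout, cost: Callable[[int], float]) -> Tuple[Layout, float]:
    """Naive compaction (illustration only, not the paper's method).

    Slides every block toward address 0 in address order and charges
    cost(size) for each block that actually moves.
    """
    new_layout: Layout = {}
    total_cost = 0.0
    next_free = 0
    for name, (start, size) in sorted(layout.items(), key=lambda kv: kv[1][0]):
        if start != next_free:
            total_cost += cost(size)  # this block is reallocated
        new_layout[name] = (next_free, size)
        next_free += size
    return new_layout, total_cost


def is_subadditive(cost: Callable[[int], float], max_size: int) -> bool:
    """Spot-check cost(x + y) <= cost(x) + cost(y) for all x + y <= max_size."""
    return all(
        cost(x + y) <= cost(x) + cost(y) + 1e-9
        for x in range(1, max_size + 1)
        for y in range(1, max_size - x + 1)
    )


if __name__ == "__main__":
    layout = {"a": (0, 4), "c": (10, 2), "d": (20, 6)}
    packed, moved_cost = compact(layout, lambda size: float(size))
    print(footprint(layout), footprint(packed), moved_cost)
```

In this sketch a linear cost and a constant cost both pass the subadditivity check, while a quadratic cost such as `size * size` fails it. A cost-oblivious algorithm, in the abstract's sense, decides which blocks to move without consulting `cost` at all.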

Comments: 11 pages, 2 figures, to appear in PODS 2014. Added and updated references

Similar Publications

We study the stable matching problem in non-bipartite graphs with incomplete but strict preference lists, where the edges have weights and the goal is to compute a stable matching of minimum or maximum weight. This problem is known to be NP-hard in general. Our contribution is twofold: a polyhedral characterization and an approximation algorithm.


Gene tree/species tree reconciliation is a recent, decisive advance in phylogenetic methods that accounts for possible differences between gene histories and species histories. Reconciliation consists of explaining these differences by gene-scale events such as duplication, loss, and transfer, which translates mathematically into a mapping between gene tree nodes and species tree nodes or branches. Gene conversion is a very frequent biological event, which results in the replacement of a gene by a copy of another from the same species and in the same gene tree.


The edit distance between two rooted ordered trees with $n$ nodes labeled from an alphabet $\Sigma$ is the minimum cost of transforming one tree into the other by a sequence of elementary operations consisting of deleting and relabeling existing nodes, as well as inserting new nodes. Tree edit distance is a well-known generalization of string edit distance. The fastest known algorithm for tree edit distance runs in cubic $O(n^3)$ time and is based on a dynamic programming solution similar to that for string edit distance.


Identifying palindromes in sequences has been an interesting line of research in combinatorics on words and also in computational biology, after the discovery of the relation of palindromes in the DNA sequence with HIV. Efficient algorithms for the factorization of sequences into palindromes and maximal palindromes have been devised in recent years. We extend these studies by allowing gaps in decompositions and errors in palindromes, and also by imposing a lower bound on the length of acceptable palindromes.


Given a weighted graph $G=(V,E,w)$ with a set of $k$ terminals $T\subset V$, the Steiner Point Removal problem seeks a minor of the graph with vertex set $T$, such that the distance between every pair of terminals is preserved within a small multiplicative distortion. Kamma, Krauthgamer and Nguyen (SODA 2014, SICOMP 2015) used a ball-growing algorithm to show that the distortion is at most $\mathcal{O}(\log^5 k)$ for general graphs. In this paper, we improve the distortion bound to $\mathcal{O}(\log^2 k)$.


Iterative load balancing algorithms for indivisible tokens have been studied intensively in the past. Complementing previous worst-case analyses, we study an average-case scenario where the load inputs are drawn from a fixed probability distribution. For cycles, tori, hypercubes and expanders, we obtain almost matching upper and lower bounds on the discrepancy, the difference between the maximum and the minimum load.


This paper attacks the following problem. We are given a large number $N$ of rectangles in the plane, each with horizontal and vertical sides, and also a number $r$ ...


We develop polynomial-time heuristic methods to approximately solve unimodular quadratic programs (UQPs), which are known to be NP-hard. In the UQP framework, we maximize a quadratic function of a vector of complex variables with unit modulus. Several problems in active sensing and wireless communication applications boil down to UQP.


If f is a Boolean function given by a BDD then it is well known how to calculate the number of models (i.e. bitstrings x with f(x)=1).


Let $(\{1,2,\ldots,n\},d)$ be a metric space. We analyze the expected value and the variance of $\sum_{i=1}^{\lfloor n/2\rfloor}\,d({\boldsymbol{\pi}}(2i-1),{\boldsymbol{\pi}}(2i))$ for a uniformly random permutation ${\boldsymbol{\pi}}$ of $\{1,2,\ldots,n\}$, leading to the following results: (I) Consider the problem of finding a point in $\{1,2,\ldots,n\}$ with the minimum sum of distances to all points. We show that this problem has a randomized algorithm that (1) always outputs a $(2+\epsilon)$-approximate solution in expected $O(n/\epsilon^2)$ time and that (2) inherits Indyk's [Ind99, Ind00] algorithm to output a $(1+\epsilon)$-approximate solution in $O(n/\epsilon^2)$ time with probability $\Omega(1)$, where $\epsilon\in(0,1)$.