Cost-oblivious storage reallocation

Databases need to allocate and free blocks of storage on disk. Freed blocks introduce holes where no data is stored. Allocation systems attempt to reuse such deallocated regions in order to minimize the footprint on disk. If previously allocated blocks cannot be moved, the problem is called the memory allocation problem, which is known to have a logarithmic overhead in the footprint. This paper defines the storage reallocation problem, where previously allocated blocks can be moved, or reallocated, but at some cost. The algorithms presented here are cost oblivious, in that they work for a broad and reasonable class of cost functions, even when they do not know what the cost function is. The objective is to minimize the storage footprint, that is, the largest memory address containing an allocated object, while simultaneously minimizing the reallocation costs. This paper gives asymptotically optimal algorithms for storage reallocation, in which the storage footprint is at most $(1+\epsilon)$ times optimal, and the reallocation cost is at most $(1/\epsilon)$ times the original allocation cost, which is also optimal. The algorithms are cost oblivious as long as the allocation/reallocation cost function is subadditive.
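To make the model concrete, the following Python sketch (all class, method, and parameter names are illustrative, not taken from the paper) sets up the objects the problem manipulates: blocks placed on a line of addresses, a footprint equal to the highest occupied address, and a cost charged through a subadditive function such as cost(s) = 1 + s. It illustrates the problem setup only, not the paper's algorithm.

    class ReallocationInstance:
        """Toy model of the storage (re)allocation problem (illustrative only).

        Blocks live on a line of addresses starting at 0. The footprint is the
        largest address covered by an allocated block. Every allocation or move
        of a block of size s is charged cost(s); the cost function is assumed
        subadditive, i.e. cost(a + b) <= cost(a) + cost(b).
        """

        def __init__(self, cost=lambda size: 1 + size):   # 1 + size is subadditive
            self.blocks = {}        # block id -> (start address, size)
            self.total_cost = 0
            self._cost = cost

        def footprint(self):
            """Largest memory address covered by any allocated block."""
            return max((start + size for start, size in self.blocks.values()), default=0)

        def place(self, block_id, start, size):
            """Allocate a block, or move (reallocate) it, paying the cost."""
            self.blocks[block_id] = (start, size)
            self.total_cost += self._cost(size)

        def free(self, block_id):
            """Freeing leaves a hole and costs nothing in this model."""
            del self.blocks[block_id]


    # Example: a freed block leaves a hole; reallocating into it shrinks the footprint.
    inst = ReallocationInstance()
    inst.place("a", 0, 4)
    inst.place("b", 4, 4)          # footprint is 8
    inst.free("a")                 # hole at addresses 0..3, footprint still 8
    inst.place("b", 0, 4)          # move "b" into the hole, footprint drops to 4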

Comments: 11 pages, 2 figures, to appear in PODS 2014. Added and updated references

Similar Publications

We study the strong duality of non-convex matrix factorization: we show that, under certain dual conditions, non-convex matrix factorization and its dual have the same optimum. This has been well understood for convex optimization, but little was known for matrix factorization. We formalize the strong duality of matrix factorization through a novel analytical framework, and show that the duality gap is zero for a wide class of matrix factorization problems.


We show that the integrality gap of the bidirected cut relaxation for the Steiner tree problem is at most 6/5, via an algorithm based on the primal-dual schema.


The study of the Dense-$3$-Subhypergraph problem was initiated by Chlamt{\'{a}}c et al. [Approx'16]. The input is a universe $U$, a collection ${\cal S}$ of subsets of $U$, each of size $3$, and a number $k$.


Finding central nodes is a fundamental problem in network analysis. Betweenness centrality is a well-known measure which quantifies the importance of a node based on the fraction of shortest paths going through it. Due to the dynamic nature of many of today's networks, algorithms that quickly update centrality scores have become a necessity.
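For reference, the textbook definition of betweenness centrality (the notation $\sigma_{st}$, $\sigma_{st}(v)$ is standard and not taken from the snippet above) is
$$C_B(v) \;=\; \sum_{s \neq v \neq t} \frac{\sigma_{st}(v)}{\sigma_{st}},$$
where $\sigma_{st}$ is the number of shortest paths between $s$ and $t$, and $\sigma_{st}(v)$ is the number of those paths that pass through $v$.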


Re-Pair is an efficient grammar compressor that operates by recursively replacing high-frequency character pairs with new grammar symbols. The most space-efficient linear-time algorithm computing Re-Pair uses $(1+\epsilon)n+\sqrt n$ words on top of the re-writable text (of length $n$ and stored in $n$ words), for any constant $\epsilon>0$; in practice, however, this solution uses complex sub-procedures that prevent it from being practical. In this paper, we present an implementation of the above-mentioned result making use of more practical solutions; our tool further improves the working space to $(1.
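As a point of reference, the pair-replacement idea at the core of Re-Pair can be sketched as follows: a naive, quadratic-time Python illustration only, unrelated to the space-efficient algorithm or the tool discussed above, with all names made up.

    from collections import Counter

    def repair(text):
        """Naive Re-Pair: repeatedly replace the most frequent adjacent pair
        with a fresh non-terminal until no pair occurs more than once.
        Returns the compressed sequence and the grammar rules."""
        seq = list(text)            # sequence of terminals and non-terminals
        rules = {}                  # non-terminal id -> pair it replaces
        next_symbol = 256           # fresh symbol ids beyond byte values

        while True:
            # Count adjacent pairs (overlaps may be over-counted; fine for a sketch).
            pair_counts = Counter(zip(seq, seq[1:]))
            if not pair_counts:
                break
            pair, freq = pair_counts.most_common(1)[0]
            if freq < 2:
                break
            rules[next_symbol] = pair
            # Rewrite the sequence, replacing non-overlapping occurrences of `pair`.
            out, i = [], 0
            while i < len(seq):
                if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                    out.append(next_symbol)
                    i += 2
                else:
                    out.append(seq[i])
                    i += 1
            seq = out
            next_symbol += 1
        return seq, rules

    # Example: repeated substrings collapse into grammar symbols.
    compressed, grammar = repair("abracadabra abracadabra")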


The paper develops a new technique to extract a characteristic subset from a random source that repeatedly samples from a set of elements. Here a characteristic subset is a set that, whenever it contains an element, also contains all elements that have the same probability. With this technique at hand, the paper looks at the special case of the tournament isomorphism problem, which stands in the way of a polynomial-time algorithm for the graph isomorphism problem.


Through the development of efficient algorithms, data structures and preprocessing techniques, real-world shortest path problems in street networks are now very fast to solve. But in reality, the exact travel times along each arc in the network may not be known. This has led to the development of robust shortest path problems, where all possible arc travel times are contained in a so-called uncertainty set of possible outcomes.


For a positive parameter $\beta$, the $\beta$-bounded distance between a pair of vertices $u,v$ in a weighted undirected graph $G = (V,E,\omega)$ is the length of the shortest $u-v$ path in $G$ with at most $\beta$ edges, aka {\em hops}. For $\beta$ as above and $\epsilon>0$, a {\em $(\beta,\epsilon)$-hopset} of $G = (V,E,\omega)$ is a graph $G' =(V,H,\omega_H)$ on the same vertex set, such that all distances in $G$ are $(1+\epsilon)$-approximated by $\beta$-bounded distances in $G\cup G'$. Hopsets are a fundamental graph-theoretic and graph-algorithmic construct, and they are widely used for distance-related problems in a variety of computational settings.


We consider the communication complexity of finding an approximate maximum matching in a graph in a multi-party message-passing communication model. The maximum matching problem is one of the most fundamental graph combinatorial problems, with a variety of applications. The input to the problem is a graph $G$ with $n$ vertices whose edge set is partitioned over $k$ sites, together with an approximation ratio parameter $\alpha$.


A novel landmark-based oracle (CFLAT) is presented, which provides earliest-arrival-time route plans in time-dependent road networks. To our knowledge, this is the first oracle that preprocesses combinatorial structures (collections of time-stamped min-travel-time-path trees) rather than travel-time functions. The preprocessed data structure is exploited by a new query algorithm (CFCA) which computes (and pays for), apart from earliest-arrival-time estimations, the actual connecting path, while preserving the theoretical approximation guarantees.