How to Scale Exponential Backoff

Randomized exponential backoff is a widely deployed technique for coordinating access to a shared resource. A good backoff protocol should, arguably, satisfy three natural properties: (i) it should provide constant throughput, wasting as little time as possible; (ii) it should require few failed access attempts, minimizing the amount of wasted effort; and (iii) it should be robust, continuing to work efficiently even if some of the access attempts fail for spurious reasons. Unfortunately, exponential backoff has some well-known limitations in two of these areas: it provides poor (sub-constant) throughput (in the worst case), and is not robust (to resource acquisition failures). The goal of this paper is to "fix" exponential backoff by making it scalable, particularly focusing on the case where processes arrive in an on-line, worst-case fashion. We present a relatively simple backoff protocol~Re-Backoff~that has, at its heart, a version of exponential backoff. It guarantees expected constant throughput with dynamic process arrivals and requires only an expected polylogarithmic number of access attempts per process. Re-Backoff is also robust to periods where the shared resource is unavailable for a period of time. If it is unavailable for $D$ time slots, Re-Backoff provides the following guarantees. When the number of packets is a finite $n$, the average expected number of access attempts for successfully sending a packet is $O(\log^2( n + D))$. In the infinite case, the average expected number of access attempts for successfully sending a packet is $O( \log^2(\eta) + \log^2(D) )$ where $\eta$ is the maximum number of processes that are ever in the system concurrently.


Similar Publications

Widespread deployment of distributed machine learning algorithms has raised new privacy challenges. The focus of this paper is on improving privacy of each participant's local information (such as dataset or loss function) while collaboratively learning underlying model. We present two iterative algorithms for privacy preserving distributed learning. Read More


It has been shown that in-memory computing systems suffer from serious memory pressure. The memory pressure will affect all submitted jobs. Memory pressure comes from the running tasks as they produce massive long-living data objects in the limited memory space. Read More


Distributed Computation has been a recent trend in engineering research. Parallel Computation is widely used in different areas of Data Mining, Image Processing, Simulating Models, Aerodynamics and so forth. One of the major usage of Parallel Processing is widely implemented for clustering the satellite images of size more than dimension of 1000x1000 in a legacy system. Read More


We present WPaxos, a multileader wide area network (WAN) Paxos protocol, that achieves low-latency high-throughput consensus across WAN deployments. WPaxos dynamically partitions the global object-space across multiple concurrent leaders that are deployed strategically using flexible quorums. This partitioning and emphasis on local operations allow our protocol to significantly outperform leaderless approaches, such as EPaxos, while maintaining the same consistency guarantees. Read More


Considering a network with $n$ nodes, where each node initially votes for one (or more) choices out of $K$ possible choices, we present a Distributed Multi-choice Voting/Ranking (DMVR) algorithm to determine either the choice with maximum vote (the voting problem) or to rank all the choices in terms of their acquired votes (the ranking problem). The algorithm consolidates node votes across the network by updating the states of interacting nodes using two key operations, the union and the intersection. The proposed algorithm is simple, independent from network size, and easily scalable in terms of the number of choices $K$, using only $K\times 2^{K-1}$ nodal states for voting, and $K\times K!$ nodal states for ranking. Read More


In distributed function computation, each node has an initial value and the goal is to compute a function of these values in a distributed manner. In this paper, we propose a novel token-based approach to compute a wide class of target functions to which we refer as "Token-based function Computation with Memory" (TCM) algorithm. In this approach, node values are attached to tokens and travel across the network. Read More


Iterative load balancing algorithms for indivisible tokens have been studied intensively in the past. Complementing previous worst-case analyses, we study an average-case scenario where the load inputs are drawn from a fixed probability distribution. For cycles, tori, hypercubes and expanders, we obtain almost matching upper and lower bounds on the discrepancy, the difference between the maximum and the minimum load. Read More


The asynchronous computability theorem (ACT) uses concepts from combinatorial topology to characterize which tasks have wait-free solutions in read-write memory. A task can be expressed as a relation between two chromatic simplicial complexes. The theorem states that a task has a protocol (algorithm) if and only if there is a certain chromatic simplicial map compatible with that relation. Read More


This paper reports on the state of the art of virtualization technology for both general purpose domains as well as real-time domains. There exits no entirely instantaneous data transmission/transfer. There always exist a delay while transmitting data, either in the processing or in the medium itself. Read More


In this paper we consider a distributed optimization scenario in which a set of processors aims at minimizing the maximum of a collection of "separable convex functions" subject to local constraints. This set-up is motivated by peak-demand minimization problems in smart grids. Here, the goal is to minimize the peak value over a finite horizon with: (i) the demand at each time instant being the sum of contributions from different devices, and (ii) the local states at different time instants being coupled through local dynamics. Read More