Lower Bounds and Algorithm for Partially Replicated Causally Consistent Shared Memory

Distributed shared memory systems maintain multiple replicas of the shared memory locations. Maintaining causal consistency in such systems has received significant attention in the past. However, much of the previous literature focuses on full replication, wherein each replica stores a copy of all the locations in the shared memory. In this paper, we investigate causal consistency in partially replicated systems, wherein each replica may store only a subset of the shared data. To achieve causal consistency, it is necessary to ensure that, before an update is performed at any given replica, all causally preceding updates have also been performed. Achieving this goal requires some mechanism to track causal dependencies. In the context of full replication, this goal is often achieved using vector timestamps, with the number of vector elements being equal to the number of replicas. Building on past work, this paper makes three key contributions:

1. We develop lower bounds on the size of the timestamps that must be maintained in order to achieve causal consistency in partially replicated systems. The size of the timestamps is a function of the manner in which the replicas share data, and the set of replicas accessed by each client.
2. We present an algorithm to achieve causal consistency in partially replicated systems using simple vector timestamps.
3. We present some optimizations to reduce the overhead of the timestamps required with partial replication.
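For orientation, the full-replication baseline mentioned in the abstract can be sketched as follows. This is a minimal illustration of the textbook vector-timestamp delivery rule, not the paper's partial-replication algorithm. The names `Replica`, `Update`, `write`, and `receive` are hypothetical, every replica is assumed to store every key, and message transport is simulated by passing `Update` objects directly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Update:
    sender: int        # id of the replica that issued the write
    timestamp: tuple   # sender's vector clock right after the write
    key: str
    value: object


class Replica:
    """Hypothetical fully replicated store with one vector entry per replica."""

    def __init__(self, replica_id, num_replicas):
        self.id = replica_id
        self.clock = [0] * num_replicas  # clock[k] = writes of replica k applied here
        self.store = {}
        self.pending = []                # received updates that are not yet applicable

    def write(self, key, value):
        # A local write is applied immediately and stamped with the updated clock.
        self.clock[self.id] += 1
        self.store[key] = value
        return Update(self.id, tuple(self.clock), key, value)

    def _applicable(self, u):
        # u must be the next write from its sender, and every write it
        # causally depends on must already be applied at this replica.
        return all(
            t == self.clock[k] + 1 if k == u.sender else t <= self.clock[k]
            for k, t in enumerate(u.timestamp)
        )

    def receive(self, u):
        # Buffer the update, then apply buffered updates until none qualifies.
        self.pending.append(u)
        progress = True
        while progress:
            progress = False
            for p in list(self.pending):
                if self._applicable(p):
                    self.store[p.key] = p.value
                    self.clock[p.sender] = p.timestamp[p.sender]
                    self.pending.remove(p)
                    progress = True


# Assumed scenario: replica 0 writes x, replica 1 sees x and then writes y,
# and replica 2 receives the two updates in the "wrong" order.
r0, r1, r2 = (Replica(i, 3) for i in range(3))
u_x = r0.write("x", 1)
r1.receive(u_x)
u_y = r1.write("y", 2)   # causally depends on u_x

r2.receive(u_y)          # buffered: u_x has not been applied at r2 yet
r2.receive(u_x)          # applies u_x, then the buffered u_y
```

In this sketch the timestamp has one entry per replica, which is the full-replication cost that the abstract takes as its starting point; how the timestamp must change when replicas store only subsets of the data is the subject of the paper and is not modeled here.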

Similar Publications

Distributed actor languages are an effective means of constructing scalable reliable systems, and the Erlang programming language has a well-established and influential model. While the Erlang model conceptually provides reliable scalability, it has some inherent scalability limits, and these force developers to depart from the model at scale. This article establishes the scalability limits of Erlang systems and reports the work to improve the language's scalability.

We adapt a recent algorithm by Ghaffari [SODA'16] for computing a Maximal Independent Set in the LOCAL model, so that it works in the significantly weaker BEEP model. For networks with maximum degree $\Delta$, our algorithm terminates locally within time $O((\log \Delta + \log (1/\epsilon)) \cdot \log(1/\epsilon))$, with probability at least $1 - \epsilon$. The key idea of the modification is to replace explicit messages about transmission probabilities with estimates based on the number of received messages.

Session types offer a type-based discipline for enforcing communication protocols in distributed programming. We have previously formalized simple session types in the setting of multi-threaded $\lambda$-calculus with linear types. In this work, we build upon our earlier work by presenting a form of dependent session types (of DML-style).

ROOT provides a flexible format used throughout the HEP community. The number of use cases - from an archival data format to end-stage analysis - has required a number of tradeoffs to be exposed to the user. For example, a high "compression level" in the traditional DEFLATE algorithm will result in a smaller file (saving disk space) at the cost of slower decompression (costing CPU time when read).

In this article, we present a novel approach for block-structured adaptive mesh refinement (AMR) that is suitable for extreme-scale parallelism. All data structures are designed such that the size of the meta data in each distributed processor memory remains bounded independent of the processor number. In all stages of the AMR process, we use only distributed algorithms.

In this paper, the fundamental problem of distribution and proactive caching of computing tasks in fog networks is studied under latency and reliability constraints. In the proposed scenario, computing can be executed either locally at the user device or offloaded to an edge cloudlet. Moreover, cloudlets exploit both their computing and storage capabilities by proactively caching popular task computation results to minimize computing latency.

Many cluster management systems (CMSs) have been proposed to share a single cluster with multiple distributed computing systems. However, none of the existing approaches can handle distributed machine learning (ML) workloads given the following criteria: high resource utilization, fair resource allocation, and low sharing overhead. To solve this problem, we propose a new CMS named Dorm, incorporating a dynamically-partitioned cluster management mechanism and a utilization-fairness optimizer.

This paper presents the pessimistic time complexity analysis of the parallel algorithm for minimizing the fleet size in the pickup and delivery problem with time windows. We show how to estimate the pessimistic complexity step by step. This approach can be easily adapted to other parallel algorithms for solving complex transportation problems.

With the surge of multi- and manycores, much research has focused on algorithms for mapping and scheduling on these complex platforms. Large classes of these algorithms face scalability problems. This is why diverse methods are commonly used for reducing the search space.

In asynchronous distributed systems, it is very hard to assess whether one of the processes taking part in a computation is operating correctly or has failed. To overcome this problem, distributed algorithms are created using unreliable failure detectors that capture, in an abstract way, the timing assumptions necessary to assess the operating status of a process. One particular type of failure detector is leader election, which indicates a single process that has not failed.