# Quadratic and Near-Quadratic Lower Bounds for the CONGEST Model

We present the first super-linear lower bounds for natural graph problems in the CONGEST model, answering a long-standing open question. Specifically, we show that any exact computation of a minimum vertex cover or a maximum independent set requires $\Omega(n^2/\log^2{n})$ rounds in the worst case in the CONGEST model, as well as any algorithm for $\chi$-coloring a graph, where $\chi$ is the chromatic number of the graph. We further show that such strong lower bounds are not limited to NP-hard problems, by showing two simple graph problems in P which require a quadratic and near-quadratic number of rounds. Finally, we address the problem of computing an exact solution to weighted all-pairs-shortest-paths (APSP), which arguably may be considered as a candidate for having a super-linear lower bound. We show a simple $\Omega(n)$ lower bound for this problem, which implies a separation between the weighted and unweighted cases, since the latter is known to have a complexity of $\Theta(n/\log{n})$. We also formally prove that the standard Alice-Bob framework is incapable of providing a super-linear lower bound for exact weighted APSP, whose complexity remains an intriguing open question.

## Similar Publications

Consider a set of agents in a peer-to-peer communication network, where each agent has a personal dataset and a personal learning objective. The main question addressed in this paper is: how can agents collaborate to improve upon their locally learned model without leaking sensitive information about their data? Our first contribution is to reformulate this problem so that it can be solved by a block coordinate descent algorithm. We obtain an efficient and fully decentralized protocol working in an asynchronous fashion. Read More

We analyze the caching overhead incurred by a class of multithreaded algorithms when scheduled by an arbitrary scheduler. We obtain bounds that match or improve upon the well-known $O(Q+S \cdot (M/B))$ caching cost for the randomized work stealing (RWS) scheduler, where $S$ is the number of steals, $Q$ is the sequential caching cost, and $M$ and $B$ are the cache size and block (or cache line) size respectively. Read More

A common approach for designing scalable algorithms for massive data sets is to distribute the computation across, say $k$, machines and process the data using limited communication between them. A particularly appealing framework here is the simultaneous communication model whereby each machine constructs a small representative summary of its own data and one obtains an approximate/exact solution from the union of the representative summaries. If the representative summaries needed for a problem are small, then this results in a communication-efficient and round-optimal protocol. Read More

The massive quantities of genomic data being made available through gene sequencing techniques are enabling breakthroughs in genomic science in many areas such as medical advances in the diagnosis and treatment of diseases. Analyzing this data, however, is a computational challenge insofar as the computational costs of the relevant algorithms can grow with quadratic, cubic or higher complexity--leading to the need for leadership scale computing. In this paper we describe a new approach to calculations of the Custom Correlation Coefficient (CCC) between Single Nucleotide Polymorphisms (SNPs) across a population, suitable for parallel systems equipped with graphics processing units (GPUs) or Intel Xeon Phi processors. Read More

The surge in availability of genomic data holds promise for enabling determination of genetic causes of observed individual traits, with applications to problems such as discovery of the genetic roots of phenotypes, be they molecular phenotypes such as gene expression or metabolite concentrations, or complex phenotypes such as diseases. However, the growing sizes of these datasets and the quadratic, cubic or higher scaling characteristics of the relevant algorithms pose a serious computational challenge necessitating use of leadership scale computing. In this paper we describe a new approach to performing vector similarity metrics calculations, suitable for parallel systems equipped with graphics processing units (GPUs) or Intel Xeon Phi processors. Read More

We study the problem of testing conductance in the distributed computing model and give a two-sided tester that takes $\mathcal{O}(\log n)$ rounds to decide if a graph has conductance at least $\Phi$ or is $\epsilon$-far from having conductance at least $\Theta(\Phi^2)$ in the distributed CONGEST model. We also show that $\Omega(\log n)$ rounds are necessary for testing conductance even in the LOCAL model. In the case of a connected graph, we show that we can perform the test even when the number of vertices in the graph is not known a priori. Read More

New cloud programming and deployment models pose challenges to software application engineers who are looking, often in vain, for tools to automate any necessary code adaptation and transformation. Function-as-a-Service interfaces are particular non-trivial targets when considering that most cloud applications are implemented in non-functional languages. Among the most widely used of these languages is Python. Read More

A liquid system provides durable object storage based on spreading redundantly generated data across a network of hundreds to thousands of potentially unreliable storage nodes. A liquid system uses a combination of a large code, lazy repair, and a flow storage organization. We show that a liquid system can be operated to enable flexible and essentially optimal combinations of storage durability, storage overhead, repair bandwidth usage, and access performance. Read More

High network communication cost for synchronizing gradients and parameters is the well-known bottleneck of distributed training. In this work, we propose TernGrad that uses ternary gradients to accelerate distributed deep learning in data parallelism. Our approach requires only three numerical levels {-1,0,1} which can aggressively reduce the communication time. Read More

We study local symmetry breaking problems in the CONGEST model, focusing on ruling set problems, which generalize the fundamental Maximal Independent Set (MIS) problem. A $\beta$-ruling set is an independent set such that every node in the graph is at most $\beta$ hops from a node in the independent set. Our work is motivated by the following central question: can we break the $\Theta(\log n)$ time complexity barrier and the $\Theta(m)$ message complexity barrier in the CONGEST model for MIS or closely-related symmetry breaking problems? We present the following results: - Time Complexity: We show that we can break the $O(\log n)$ "barrier" for 2- and 3-ruling sets. Read More