Friend or Foe? Population Protocols can perform Community Detection

We present a simple distributed algorithm that, given a regular graph consisting of two communities (or clusters), each inducing a good expander and such that the cut between them has sparsity $1/\mbox{polylog}(n)$, recovers the two communities. More precisely, upon running the protocol, every node assigns itself a binary label of $m = \Theta(\log n)$ bits, so that with high probability, for all but a small number of outliers, nodes within the same community are assigned labels with Hamming distance $o(m)$, while nodes belonging to different communities receive labels with Hamming distance at least $m/2 - o(m)$. We refer to such an outcome as a "community sensitive labeling" of the graph. Our algorithm uses $\Theta(\log^2 n)$ local memory and computes the community sensitive labeling after each node performs $\Theta(\log^2 n)$ steps of local work. Our algorithm and its analysis work in the "(random) population protocol" model, in which anonymous nodes do not share any global clock (the model is asynchronous) and communication occurs over a single (random) edge per round. We believe this is the first provably effective protocol for community detection that works in this model.
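The notion of a community sensitive labeling can be made concrete with a small check. The following Python sketch is illustrative only and is not the protocol from the paper: it takes a labeling and a ground-truth partition, and reports the largest Hamming distance within a community and the smallest distance across communities. The function names, the toy labels, and the toy partition are assumptions made for this example. Since $o(m)$ is an asymptotic statement, the sketch returns the two raw quantities rather than testing them against a fixed threshold.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length bit strings differ."""
    if len(a) != len(b):
        raise ValueError("labels must have the same length")
    return sum(x != y for x, y in zip(a, b))


def labeling_gaps(labels, community):
    """Return (max distance within a community, min distance across communities).

    labels:    dict mapping node -> bit string of length m (hypothetical input)
    community: dict mapping node -> community id (hypothetical ground truth)
    """
    intra, inter = [], []
    for u, v in combinations(labels, 2):
        d = hamming(labels[u], labels[v])
        (intra if community[u] == community[v] else inter).append(d)
    return max(intra, default=0), min(inter, default=0)


# Toy instance with m = 8: nodes a, b in one community, c, d in the other.
labels = {"a": "00000000", "b": "00000001", "c": "11110000", "d": "11110001"}
community = {"a": 0, "b": 0, "c": 1, "d": 1}

max_intra, min_inter = labeling_gaps(labels, community)
print(max_intra, min_inter)
```

In this toy instance the labels within a community differ in one bit, while labels across communities differ in at least $m/2 = 4$ bits, which is the qualitative gap the abstract describes.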

Comments: 26 pages

Similar Publications

Dimension is a standard and well-studied measure of complexity of posets. Recent research has provided many new upper bounds on the dimension for various structurally restricted classes of posets. Bounded dimension gives a succinct representation of the poset, admitting constant response time for queries of the form "is $x$ [...]"


Quantum walks have received a great deal of attention recently because they can be used to develop new quantum algorithms and to simulate interesting quantum systems. In this work, we focus on a model called staggered quantum walk, which employs advanced ideas of graph theory and has the advantage of including the most important instances of other discrete-time models. The evolution operator of the staggered model is obtained from a tessellation cover, which is defined in terms of a set of partitions of the graph into cliques.


This paper formulates a novel problem on graphs: find the minimal subset of edges in a fully connected graph, such that the resulting graph contains all spanning trees for a set of specified sub-graphs. This formulation is motivated by an unsupervised grammar induction problem from computational linguistics. We present a reduction to some known problems and algorithms from graph theory, provide computational complexity results, and describe an approximation algorithm.


We investigate the expected distance to the power $b$ between two identical general random processes. As an application to sensor networks, we derive the optimal transportation cost to the power $b>0$ of the maximal random bicolored matching.


We show that Boolean matrix multiplication, computed as a sum of products of column vectors with row vectors, is essentially the same as Warshall's algorithm for computing the transitive closure matrix of a graph from its adjacency matrix. Warshall's algorithm can be generalized to Floyd's algorithm for computing the distance matrix of a graph with weighted edges. We will generalize Boolean matrices in the same way, keeping matrix multiplication essentially equivalent to the Floyd-Warshall algorithm.


We introduce a formal definition of a pattern poset which encompasses several previously studied posets in the literature. Using this definition we present some general results on the Möbius function and topology of such pattern posets. We prove our results using a poset fibration based on the embeddings of the poset, where embeddings are representations of occurrences.


In combinatorial group testing problems, Questioner needs to find a defective element $x\in [n]$ by testing subsets of $[n]$. In [18] the authors introduced a new model, where each element knows the answer for those queries that contain it and each element should be able to identify the defective one. In this article we continue to investigate models of this kind with more defective elements.


Let $G$ be an undirected graph. An edge of $G$ dominates itself and all edges adjacent to it. A subset $E'$ of edges of $G$ is an edge dominating set of $G$, if every edge of the graph is dominated by some edge of $E'$.


This paper is about counting the number of distinct (scattered) subwords occurring in a given word. More precisely, we consider the generalization of the Pascal triangle to binomial coefficients of words and the sequence $(S(n))_{n\ge 0}$ counting the number of positive entries on each row. By introducing a convenient tree structure, we provide a recurrence relation for $(S(n))_{n\ge 0}$.


Many digital functions studied in the literature, e.g., the summatory function of the base-$k$ sum-of-digits function, have a behavior showing some periodic fluctuation.