# Embeddings of Schatten Norms with Applications to Data Streams

Given an $n \times d$ matrix $A$, its Schatten-$p$ norm, $p \geq 1$, is defined as $\|A\|_p = \left (\sum_{i=1}^{\textrm{rank}(A)}\sigma_i(A)^p \right )^{1/p}$, where $\sigma_i(A)$ is the $i$-th largest singular value of $A$. These norms have been studied in functional analysis in the context of non-commutative $\ell_p$-spaces, and recently in data stream and linear sketching models of computation. Basic questions on the relations between these norms, such as their embeddability, are still open. Specifically, given a set of matrices $A^1, \ldots, A^{\operatorname{poly}(nd)} \in \mathbb{R}^{n \times d}$, suppose we want to construct a linear map $L$ such that $L(A^i) \in \mathbb{R}^{n' \times d'}$ for each $i$, where $n' \leq n$ and $d' \leq d$, and further, $\|A^i\|_p \leq \|L(A^i)\|_q \leq D_{p,q} \|A^i\|_p$ for a given approximation factor $D_{p,q}$ and real number $q \geq 1$. Then how large do $n'$ and $d'$ need to be as a function of $D_{p,q}$? We nearly resolve this question for every $p, q \geq 1$, for the case where $L(A^i)$ can be expressed as $R \cdot A^i \cdot S$, where $R$ and $S$ are arbitrary matrices that are allowed to depend on $A^1, \ldots, A^t$, that is, $L(A^i)$ can be implemented by left and right matrix multiplication. Namely, for every $p, q \geq 1$, we provide nearly matching upper and lower bounds on the size of $n'$ and $d'$ as a function of $D_{p,q}$. Importantly, our upper bounds are {\it oblivious}, meaning that $R$ and $S$ do not depend on the $A^i$, while our lower bounds hold even if $R$ and $S$ depend on the $A^i$. As an application of our upper bounds, we answer a recent open question of Blasiok et al. about space-approximation trade-offs for the Schatten $1$-norm, showing in a data stream it is possible to estimate the Schatten-$1$ norm up to a factor of $D \geq 1$ using $\tilde{O}(\min(n,d)^2/D^4)$ space.

## Similar Publications

Randomized binary exponential backoff (BEB) is a popular algorithm for coordinating access to a shared channel. With an operational history exceeding four decades, BEB is currently an important component of several wireless standards. Despite this track record, prior theoretical results indicate that under bursty traffic (1) BEB yields poor makespan and (2) superior algorithms are possible. Read More

An $(r, \ell)$-partition of a graph $G$ is a partition of its vertex set into $r$ independent sets and $\ell$ cliques. A graph is $(r, \ell)$ if it admits an $(r, \ell)$-partition. A graph is well-covered if every maximal independent set is also maximum. Read More

Triangle-free graphs play a central role in graph theory, and triangle detection (or triangle finding) as well as triangle enumeration (triangle listing) play central roles in the field of graph algorithms. In distributed computing, algorithms with sublinear round complexity for triangle finding and listing have recently been developed in the powerful CONGEST clique model, where communication is allowed between any two nodes of the network. In this paper we present the first algorithms with sublinear complexity for triangle finding and triangle listing in the standard CONGEST model, where the communication topology is the same as the topology of the network. Read More

This paper formulates a novel problem on graphs: find the minimal subset of edges in a fully connected graph, such that the resulting graph contains all spanning trees for a set of specifed sub-graphs. This formulation is motivated by an un-supervised grammar induction problem from computational linguistics. We present a reduction to some known problems and algorithms from graph theory, provide computational complexity results, and describe an approximation algorithm. Read More

In this work, we provide a general framework for adding a linearizable iterator to data structures with set operations. We propose a condition on these set operations, called locality, so that any data structure implemented from local atomic operations can be augmented with a linearizable iterator as described by our framework. We then apply the iterator framework to various data structures, prove locality of their operations, and demonstrate that the iterator framework does not significantly affect the performance of concurrent operations. Read More

In the k-partition problem (k-PP), one is given an edge-weighted undirected graph, and one must partition the node set into at most k subsets, in order to minimise (or maximise) the total weight of the edges that have their end-nodes in the same cluster. Various hierarchical variants of this problem have been studied in the context of data mining. We consider a 'two-level' variant that arises in mobile wireless communications. Read More

We introduce a technique to turn dynamic programming tasks on labeled directed acyclic graphs (labeled DAGs) closer to their sequence analogies. We demonstrate the technique on three problems: longest increasing subsequence, longest common subsequence, and co-linear chaining extended to labeled DAGs. For the former we obtain an algorithm with running time $O(|E| k \log |V|+ |V| k \log^2 |V|)$, where $V$ and $E$ are the set of vertices and edges, respectively, and $k$ is the minimum size of a path cover of $V$. Read More

We show that Boolean matrix multiplication, computed as a sum of products of column vectors with row vectors, is essentially the same as Warshall's algorithm for computing the transitive closure matrix of a graph from its adjacency matrix. Warshall's algorithm can be generalized to Floyd's algorithm for computing the distance matrix of a graph with weighted edges. We will generalize Boolean matrices in the same way, keeping matrix multiplication essentially equivalent to the Floyd-Warshall algorithm. Read More

Integer Linear Programming is a famous NP-complete problem. Lenstra showed that in the case of small dimension, it can be solved in polynomial time. This algorithm became a ubiquitous tool, especially in the design of parameterized algorithms for NP-complete problems, where we wish to isolate the hardness of an instance to some parameter. Read More

Given a string $T$, it is known that its suffix tree can be represented using the compact directed acyclic word graph (CDAWG) with $e_T$ arcs, taking overall $O(e_T+e_{{\overline{T}}})$ words of space, where ${\overline{T}}$ is the reverse of $T$, and supporting some key operations in time between $O(1)$ and $O(\log{\log{n}})$ in the worst case. This representation is especially appealing for highly repetitive strings, like collections of similar genomes or of version-controlled documents, in which $e_T$ grows sublinearly in the length of $T$ in practice. In this paper we augment such representation, supporting a number of additional queries in worst-case time between $O(1)$ and $O(\log{n})$ in the RAM model, without increasing space complexity asymptotically. Read More