# Amirali Abdullah

## Contact Details

NameAmirali Abdullah |
||

Affiliation |
||

Location |
||

## Pubs By Year |
||

## Pub CategoriesComputer Science - Data Structures and Algorithms (5) Computer Science - Computational Geometry (4) Computer Science - Computational Complexity (1) Computer Science - Information Theory (1) Mathematics - Information Theory (1) |

## Publications Authored By Amirali Abdullah

Streaming interactive proofs (SIPs) are a framework for outsourced computation. A computationally limited streaming client (the verifier) hands over a large data set to an untrusted server (the prover) in the cloud and the two parties run a protocol to confirm the correctness of result with high probability. SIPs are particularly interesting for problems that are hard to solve (or even approximate) well in a streaming setting. Read More

Big data is becoming ever more ubiquitous, ranging over massive video repositories, document corpuses, image sets and Internet routing history. Proximity search and clustering are two algorithmic primitives fundamental to data analysis, but suffer from the "curse of dimensionality" on these gigantic datasets. A popular attack for this problem is to convert object representations into short binary codewords, while approximately preserving near neighbor structure. Read More

Information distances like the Hellinger distance and the Jensen-Shannon divergence have deep roots in information theory and machine learning. They are used extensively in data analysis especially when the objects being compared are high dimensional empirical probability distributions built from data. However, we lack common tools needed to actually use information distances in applications efficiently and at scale with any kind of provable guarantees. Read More

We study spectral algorithms for the high-dimensional Nearest Neighbor Search problem (NNS). In particular, we consider a semi-random setting where a dataset $P$ in $\mathbb{R}^d$ is chosen arbitrarily from an unknown subspace of low dimension $k\ll d$, and then perturbed by fully $d$-dimensional Gaussian noise. We design spectral NNS algorithms whose query time depends polynomially on $d$ and $\log n$ (where $n=|P|$) for large ranges of $k$, $d$ and $n$. Read More

Bregman divergences $D_\phi$ are a class of divergences parametrized by a convex function $\phi$ and include well known distance functions like $\ell_2^2$ and the Kullback-Leibler divergence. There has been extensive research on algorithms for problems like clustering and near neighbor search with respect to Bregman divergences, in all cases, the algorithms depend not just on the data size $n$ and dimensionality $d$, but also on a structure constant $\mu \ge 1$ that depends solely on $\phi$ and can grow without bound independently. In this paper, we provide the first evidence that this dependence on $\mu$ might be intrinsic. Read More

We study coresets for various types of range counting queries on uncertain data. In our model each uncertain point has a probability density describing its location, sometimes defined as k distinct locations. Our goal is to construct a subset of the uncertain points, including their locational uncertainty, so that range counting queries can be answered by just examining this subset. Read More

In this paper we present the first provable approximate nearest-neighbor (ANN) algorithms for Bregman divergences. Our first algorithm processes queries in O(log^d n) time using O(n log^d n) space and only uses general properties of the underlying distance function (which includes Bregman divergences as a special case). The second algorithm processes queries in O(log n) time using O(n) space and exploits structural constants associated specifically with Bregman divergences. Read More