# Kenneth L. Clarkson

## Pub Categories

- Computer Science - Data Structures and Algorithms (9)
- Computer Science - Computational Geometry (5)
- Computer Science - Learning (3)
- Computer Science - Numerical Analysis (2)
- Mathematics - Numerical Analysis (2)
- Statistics - Machine Learning (1)

## Publications Authored By Kenneth L. Clarkson

Kernel Ridge Regression (KRR) is a simple yet powerful technique for non-parametric regression whose computation amounts to solving a linear system. This system is usually dense and highly ill-conditioned. In addition, the dimensions of the matrix are the same as the number of data points, so direct methods are unrealistic for large-scale datasets.
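For orientation, the linear system mentioned in this abstract can be written out in the standard textbook formulation of KRR. The notation is an assumption made here, not taken from the abstract: training points $x_1, \ldots, x_n$ with targets $y \in \mathbb{R}^n$, a kernel function $k$, and a regularization parameter $\lambda > 0$ (some references scale it as $n\lambda$).

$$
(K + \lambda I)\,\alpha = y, \qquad K_{ij} = k(x_i, x_j), \qquad f(x) = \sum_{i=1}^{n} \alpha_i\, k(x_i, x).
$$

The kernel matrix $K$ is $n \times n$, which is why a direct solve becomes costly as the number of data points grows.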

The technique of matrix sketching, such as the use of random projections, has been shown in recent years to be a powerful tool for accelerating many important statistical learning techniques. Research has so far focused largely on using sketching for the "vanilla" un-regularized versions of these techniques. Here we study sketching methods for regularized variants of linear regression, low rank approximations, and canonical correlation analysis.

In the subspace approximation problem, we seek a k-dimensional subspace F of R^d that minimizes the sum of p-th powers of Euclidean distances to a given set of n points a_1, ...

Finding the coordinate-wise maxima and the convex hull of a planar point set are probably the most classic problems in computational geometry. We consider these problems in the self-improving setting. Here, we have $n$ distributions $\mathcal{D}_1, \ldots, \mathcal{D}_n$ of planar points.
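To make the coordinate-wise maxima problem concrete, here is a minimal sketch of the classical sort-and-sweep baseline. It is not the self-improving algorithm from the paper; the function name `planar_maxima` and the tuple representation of points are assumptions made for illustration.

```python
def planar_maxima(points):
    """Return the points that no other point dominates in both coordinates."""
    # Sort by x descending (ties broken by y descending), dropping duplicates.
    ordered = sorted(set(points), key=lambda p: (-p[0], -p[1]))
    maxima = []
    best_y = float("-inf")
    for x, y in ordered:
        # Every point seen so far has x at least as large, so (x, y) is a
        # maximum exactly when its y strictly exceeds the running maximum.
        if y > best_y:
            maxima.append((x, y))
            best_y = y
    return maxima


if __name__ == "__main__":
    print(planar_maxima([(1, 5), (2, 4), (3, 1), (2, 2), (0, 0)]))
```

The sort dominates the cost, giving $O(n \log n)$ time for $n$ points; the self-improving setting asks whether an algorithm can beat this baseline in expectation after learning about the input distributions.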

We design a new distribution over $\mathrm{poly}(r \varepsilon^{-1}) \times n$ matrices $S$ so that for any fixed $n \times d$ matrix $A$ of rank $r$, with probability at least 9/10, $\|SAx\|_2 = (1 \pm \varepsilon)\|Ax\|_2$ simultaneously for all $x \in \mathbb{R}^d$. Such a matrix $S$ is called a *subspace embedding*. Furthermore, $SA$ can be computed in $\mathrm{nnz}(A) + \mathrm{poly}(d \varepsilon^{-1})$ time, where $\mathrm{nnz}(A)$ is the number of non-zero entries of $A$.
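The abstract does not spell out the distribution, so the following is only a sketch under an assumption: that $S$ is a CountSketch-style matrix in which each column has a single random $\pm 1$ entry. With that structure, $SA$ can be formed in one pass over the non-zero entries of $A$. The function name `countsketch`, the sparse row format, and the parameter `m` (the number of rows of $S$) are hypothetical, and choosing `m` large enough for the embedding guarantee is not addressed here.

```python
import random


def countsketch(rows, m, seed=0):
    """Compute S*A for a CountSketch-style S with m rows.

    rows: one dict {column_index: value} per row of A (sparse format).
    Returns m dicts in the same sparse format, one per row of S*A.
    """
    rng = random.Random(seed)
    sketch = [dict() for _ in range(m)]
    for row in rows:
        # The column of S matching this row of A has exactly one non-zero:
        # a random sign placed in a random bucket (row of S).
        bucket = rng.randrange(m)
        sign = rng.choice((-1.0, 1.0))
        for j, value in row.items():
            sketch[bucket][j] = sketch[bucket].get(j, 0.0) + sign * value
    return sketch


if __name__ == "__main__":
    A = [{0: 1.0}, {1: 2.0}, {0: 3.0, 2: 4.0}, {}, {1: -1.0}]
    for sketched_row in countsketch(A, m=3):
        print(sketched_row)
```

Each non-zero entry of $A$ is touched once, so the work is proportional to the number of non-zeros plus the number of rows.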

We provide fast algorithms for overconstrained $\ell_p$ regression and related problems: for an $n\times d$ input matrix $A$ and vector $b\in\mathbb{R}^n$, in $O(nd\log n)$ time we reduce the problem $\min_{x\in\mathbb{R}^d} \|Ax-b\|_p$ to the same problem with input matrix $\tilde A$ of dimension $s \times d$ and corresponding $\tilde b$ of dimension $s\times 1$. Here, $\tilde A$ and $\tilde b$ are a coreset for the problem, consisting of sampled and rescaled rows of $A$ and $b$; and $s$ is independent of $n$ and polynomial in $d$. Our results improve on the best previous algorithms when $n\gg d$, for all $p\in[1,\infty)$ except $p=2$.

Computing the coordinate-wise maxima of a planar point set is a classic and well-studied problem in computational geometry. We give an algorithm for this problem in the *self-improving setting*. We have $n$ (unknown) independent distributions $\mathcal{D}_1, \mathcal{D}_2,$ ...

We give sublinear-time approximation algorithms for some optimization problems arising in machine learning, such as training linear classifiers and finding minimum enclosing balls. Our algorithms can be extended to some kernelized versions of these problems, such as SVDD, hard margin SVM, and L2-SVM, for which sublinear-time algorithms were not known before. These new algorithms use a combination of novel sampling techniques and a new multiplicative update algorithm.
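As a point of reference for the minimum enclosing ball problem named above, here is a minimal sketch of a simple farthest-point iteration. It is not the sublinear-time method of the paper: it scans every point in every round. The function name `approx_enclosing_ball` and the `iterations` parameter are assumptions made for illustration.

```python
import math


def _dist(p, q):
    """Euclidean distance between two points given as coordinate sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def approx_enclosing_ball(points, iterations=100):
    """Approximate the minimum enclosing ball by stepping toward far points."""
    center = tuple(points[0])
    for i in range(1, iterations + 1):
        far = max(points, key=lambda p: _dist(p, center))
        # The step size 1/(i+1) shrinks over time, damping oscillation
        # between far points on opposite sides of the current center.
        center = tuple(c + (f - c) / (i + 1) for c, f in zip(center, far))
    radius = max(_dist(p, center) for p in points)
    return center, radius


if __name__ == "__main__":
    pts = [(1.0, 0.0), (-1.0, 0.0), (0.0, 0.5)]
    print(approx_enclosing_ball(pts))
```

Each round costs time linear in the number of points, which is the cost that a sublinear-time approach would need to avoid, for example by sampling instead of scanning.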

We consider the set multi-cover problem in geometric settings. Given a set of points P and a collection of geometric shapes (or sets) F, we wish to find a minimum cardinality subset of F such that each point p in P is covered by (contained in) at least d(p) sets. Here d(p) is an integer demand (requirement) for p.

We investigate ways in which an algorithm can improve its expected performance by fine-tuning itself automatically with respect to an unknown input distribution D. We assume here that D is of product type. More precisely, suppose that we need to process a sequence I_1, I_2, ...

Given a collection S of subsets of some set U, and M a subset of U, the set cover problem is to find the smallest subcollection C of S such that M is a subset of the union of the sets in C. While the general problem is NP-hard to solve, even approximately, here we consider some geometric special cases, where usually U = R^d. Extending prior results, we show that approximation algorithms with provable performance exist, under a certain general condition: that for a random subset R of S and function f(), there is a decomposition of the portion of U not covered by R into an expected f(|R|) regions, each region of a particular simple form.
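The set cover problem as defined above can be illustrated with the classical greedy heuristic for the general (non-geometric) case. This is not the paper's method, which exploits geometric structure; the function name `greedy_set_cover` and the use of Python `set` objects for the members of S are assumptions made for illustration.

```python
def greedy_set_cover(targets, sets):
    """Return indices of sets, chosen greedily, whose union contains targets."""
    uncovered = set(targets)
    chosen = []
    while uncovered:
        # Pick the set that covers the most still-uncovered elements.
        best = max(range(len(sets)), key=lambda i: len(uncovered & sets[i]))
        if not uncovered & sets[best]:
            raise ValueError("targets cannot be covered by the given sets")
        chosen.append(best)
        uncovered -= sets[best]
    return chosen


if __name__ == "__main__":
    M = {1, 2, 3, 4, 5}
    S = [{1, 2, 3}, {2, 4}, {3, 4}, {4, 5}]
    print(greedy_set_cover(M, S))
```

On general instances the greedy choice is known to give a cover within a logarithmic factor of the optimum; the geometric special cases in the abstract are of interest because additional structure can allow better guarantees.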