Sushant Sachdeva

Sushant Sachdeva
Are you Sushant Sachdeva?

Claim your profile, edit publications, add additional information:

Contact Details

Sushant Sachdeva

Pubs By Year

Pub Categories

Computer Science - Data Structures and Algorithms (15)
Computer Science - Discrete Mathematics (4)
Computer Science - Learning (4)
Mathematics - Combinatorics (3)
Mathematics - Numerical Analysis (2)
Mathematics - Classical Analysis and ODEs (2)
Computer Science - Computational Complexity (2)
Computer Science - Numerical Analysis (2)
Physics - Physics and Society (1)
Mathematics - Metric Geometry (1)
Physics - Data Analysis; Statistics and Probability (1)
Mathematics - Optimization and Control (1)
Mathematics - Statistics (1)
Statistics - Theory (1)

Publications Authored By Sushant Sachdeva

We study the efficacy of learning neural networks with neural networks by the (stochastic) gradient descent method. While gradient descent enjoys empirical success in a variety of applications, there is a lack of theoretical guarantees that explains the practical utility of deep learning. We focus on two-layer neural networks with a linear activation on the output node. Read More

We present an algorithm that, with high probability, generates a random spanning tree from an edge-weighted undirected graph in $\tilde{O}(n^{5/3 }m^{1/3})$ time (The $\tilde{O}(\cdot)$ notation hides $\operatorname{polylog}(n)$ factors). The tree is sampled from a distribution where the probability of each tree is proportional to the product of its edge weights. This improves upon the previous best algorithm due to Colbourn et al. Read More

A spectral sparsifier of a graph $G$ is a sparser graph $H$ that approximately preserves the quadratic form of $G$, i.e. for all vectors $x$, $x^T L_G x \approx x^T L_H x$, where $L_G$ and $L_H$ denote the respective graph Laplacians. Read More

We show how to perform sparse approximate Gaussian elimination for Laplacian matrices. We present a simple, nearly linear time algorithm that approximates a Laplacian by a matrix with a sparse Cholesky factorization, the version of Gaussian elimination for symmetric matrices. This is the first nearly linear time solver for Laplacian systems that is based purely on random sampling, and does not use any graph theoretic constructions such as low-stretch trees, sparsifiers, or expanders. Read More

We introduce the sparsified Cholesky and sparsified multigrid algorithms for solving systems of linear equations. These algorithms accelerate Gaussian elimination by sparsifying the nonzero matrix entries created by the elimination process. We use these new algorithms to derive the first nearly linear time algorithms for solving systems of equations in connection Laplacians, a generalization of Laplacian matrices that arise in many problems in image and signal processing. Read More

We study the mixing time of the Dikin walk in a polytope - a random walk based on the log-barrier from the interior point method literature. This walk, and a close variant, were studied by Narayanan (2016) and Kannan-Narayanan (2012). Bounds on its mixing time are important for algorithms for sampling and optimization over polytopes. Read More

Given a directed acyclic graph $G,$ and a set of values $y$ on the vertices, the Isotonic Regression of $y$ is a vector $x$ that respects the partial order described by $G,$ and minimizes $||x-y||,$ for a specified norm. This paper gives improved algorithms for computing the Isotonic Regression for all weighted $\ell_{p}$-norms with rigorous performance guarantees. Our algorithms are quite practical, and their variants can be implemented to run fast in practice. Read More

We develop fast algorithms for solving regression problems on graphs where one is given the value of a function at some vertices, and must find its smoothest possible extension to all vertices. The extension we compute is the absolutely minimal Lipschitz extension, and is the limit for large $p$ of $p$-Laplacian regularization. We present an algorithm that computes a minimal Lipschitz extension in expected linear time, and an algorithm that computes an absolutely minimal Lipschitz extension in expected time $\widetilde{O} (m n)$. Read More

Given $k$ collections of 2SAT clauses on the same set of variables $V$, can we find one assignment that satisfies a large fraction of clauses from each collection? We consider such simultaneous constraint satisfaction problems, and design the first nontrivial approximation algorithms in this context. Our main result is that for every CSP $F$, for $k < \tilde{O}(\log^{1/4} n)$, there is a polynomial time constant factor Pareto approximation algorithm for $k$ simultaneous Max-$F$-CSP instances. Our methods are quite general, and we also use them to give an improved approximation factor for simultaneous Max-w-SAT (for $k <\tilde{O}(\log^{1/3} n)$). Read More

We survey key techniques and results from approximation theory in the context of uniform approximations to real functions such as e^{-x}, 1/x, and x^k. We then present a selection of results demonstrating how such approximations can be used to speed up primitives crucial for the design of fast algorithms for problems such as simulating random walks, graph partitioning, solving linear system of equations, computing eigenvalues and combinatorial approaches to solve semi-definite programs. Read More

We prove that the inverse of a positive-definite matrix can be approximated by a weighted-sum of a small number of matrix exponentials. Combining this with a previous result [OSV12], we establish an equivalence between matrix inversion and exponentiation up to polylogarithmic factors. In particular, this connection justifies the use of Laplacian solvers for designing fast semi-definite programming based algorithms for certain graph problems. Read More

We give an arithmetic version of the recent proof of the triangle removal lemma by Fox [Fox11], for the group $\mathbb{F}_2^n$. A triangle in $\mathbb{F}_2^n$ is a triple $(x,y,z)$ such that $x+y+z = 0$. The triangle removal lemma for $\mathbb{F}_2^n$ states that for every $\epsilon > 0$ there is a $\delta > 0$, such that if a subset $A$ of $\mathbb{F}_2^n$ requires the removal of at least $\epsilon \cdot 2^n$ elements to make it triangle-free, then it must contain at least $\delta \cdot 2^{2n}$ triangles. Read More

Suppose we are given an oracle that claims to approximate the permanent for most matrices X, where X is chosen from the Gaussian ensemble (the matrix entries are i.i.d. Read More

We present a new algorithm for Independent Component Analysis (ICA) which has provable performance guarantees. In particular, suppose we are given samples of the form $y = Ax + \eta$ where $A$ is an unknown $n \times n$ matrix and $x$ is a random variable whose components are independent and have a fourth moment strictly less than that of a standard Gaussian random variable and $\eta$ is an $n$-dimensional Gaussian random variable with unknown covariance $\Sigma$: We give an algorithm that provable recovers $A$ and $\Sigma$ up to an additive $\epsilon$ and whose running time and sample complexity are polynomial in $n$ and $1 / \epsilon$. To accomplish this, we introduce a novel "quasi-whitening" step that may be useful in other contexts in which the covariance of Gaussian noise is not known in advance. Read More

A "community" in a social network is usually understood to be a group of nodes more densely connected with each other than with the rest of the network. This is an important concept in most domains where networks arise: social, technological, biological, etc. For many years algorithms for finding communities implicitly assumed communities are nonoverlapping (leading to use of clustering-based approaches) but there is increasing interest in finding overlapping communities. Read More

We give a novel spectral approximation algorithm for the balanced separator problem that, given a graph G, a constant balance b \in (0,1/2], and a parameter \gamma, either finds an \Omega(b)-balanced cut of conductance O(\sqrt(\gamma)) in G, or outputs a certificate that all b-balanced cuts in G have conductance at least \gamma, and runs in time \tilde{O}(m). This settles the question of designing asymptotically optimal spectral algorithms for balanced separator. Our algorithm relies on a variant of the heat kernel random walk and requires, as a subroutine, an algorithm to compute \exp(-L)v where L is the Laplacian of a graph related to G and v is a vector. Read More

We study the problem of computing the minimum vertex cover on k-uniform k-partite hypergraphs when the k-partition is given. On bipartite graphs (k = 2), the minimum vertex cover can be computed in polynomial time. For general k, the problem was studied by Lov\'asz, who gave a k/2 -approximation based on the standard LP relaxation. Read More

The k-fold Cartesian product of a graph G is defined as a graph on k-tuples of vertices, where two tuples are connected if they form an edge in one of the positions and are equal in the rest. Starting with G as a single edge gives G^k as a k-dimensional hypercube. We study the distributions of edges crossed by a cut in G^k across the copies of G in different positions. Read More

In a well-known paper[ARV], Arora, Rao and Vazirani obtained an O(sqrt(log n)) approximation to the Balanced Separator problem and Uniform Sparsest Cut. At the heart of their result is a geometric statement about sets of points that satisfy triangle inequalities, which also underlies subsequent work on approximation algorithms and geometric embeddings. In this note, we give an equivalent formulation of the Structure theorem in [ARV] in terms of the expansion of large sets in geometric graphs on sets of points satisfying triangle inequalities. Read More