Zhao Song

Pub Categories

Computer Science - Data Structures and Algorithms (5)
Mathematics - Information Theory (3)
Computer Science - Information Theory (3)
Computer Science - Learning (2)
Computer Science - Computational Complexity (2)
Statistics - Machine Learning (1)

Publications Authored By Zhao Song

We consider relative error low rank approximation of {\it tensors} with respect to the Frobenius norm: given an order-$q$ tensor $A \in \mathbb{R}^{\prod_{i=1}^q n_i}$, output a rank-$k$ tensor $B$ for which $\|A-B\|_F^2 \leq (1+\epsilon)$OPT, where OPT $= \inf_{\textrm{rank-}k~A'} \|A-A'\|_F^2$. Despite the success in obtaining relative error low rank approximations for matrices, no such results were known for tensors. One structural issue is that there may be no rank-$k$ tensor $A_k$ achieving the above infimum.
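
A standard illustration of why the infimum may fail to be attained (a well-known example from the tensor-rank literature, not taken from this abstract; the symbols $x$, $y$, $T$, $T_m$ are introduced here only for illustration): for linearly independent $x, y \in \mathbb{R}^n$, the order-3 tensor $$T = x \otimes x \otimes y + x \otimes y \otimes x + y \otimes x \otimes x$$ has rank 3, while the rank-2 tensors $$T_m = m\left(x + \tfrac{1}{m}\, y\right)^{\otimes 3} - m\, x^{\otimes 3}$$ satisfy $$T_m = T + \tfrac{1}{m}\left(x \otimes y \otimes y + y \otimes x \otimes y + y \otimes y \otimes x\right) + \tfrac{1}{m^2}\, y^{\otimes 3},$$ so $\|T - T_m\|_F \to 0$ as $m \to \infty$. For $A = T$ and $k = 2$ this gives OPT $= 0$, yet no rank-2 tensor attains it.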

We study the $\ell_1$-low rank approximation problem, where for a given $n \times d$ matrix $A$ and approximation factor $\alpha \geq 1$, the goal is to output a rank-$k$ matrix $\widehat{A}$ for which $$\|A-\widehat{A}\|_1 \leq \alpha \cdot \min_{\textrm{rank-}k\textrm{ matrices}~A'}\|A-A'\|_1,$$ where for an $n \times d$ matrix $C$, we let $\|C\|_1 = \sum_{i=1}^n \sum_{j=1}^d |C_{i,j}|$. This error measure is known to be more robust than the Frobenius norm in the presence of outliers and is indicated in models where Gaussian assumptions on the noise may not apply. The problem was shown to be NP-hard by Gillis and Vavasis and a number of heuristics have been proposed.
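
The entrywise $\ell_1$ error defined above can be compared with the squared Frobenius error on a toy case. The following is a minimal sketch, not the paper's algorithm: the function names and the small rank-1 example are assumptions made for illustration. It only shows how one outlier entry affects each error measure.

```python
def entrywise_l1(C):
    """Entrywise l1 norm: sum of |C[i][j]| over all entries."""
    return sum(abs(x) for row in C for x in row)


def frobenius_sq(C):
    """Squared Frobenius norm: sum of C[i][j]**2 over all entries."""
    return sum(x * x for row in C for x in row)


def residual(A, B):
    """Entrywise difference A - B of two equally sized matrices."""
    return [[a - b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]


# Hypothetical rank-1 matrix u v^T (3 x 4), chosen only for illustration.
u = [1.0, 2.0, 3.0]
v = [1.0, 1.0, 1.0, 1.0]
low_rank = [[ui * vj for vj in v] for ui in u]

# Corrupt a single entry with a large outlier.
noisy = [row[:] for row in low_rank]
noisy[0][0] += 100.0

R = residual(noisy, low_rank)
l1_err = entrywise_l1(R)    # grows linearly with the outlier size
fro_err = frobenius_sq(R)   # grows quadratically with the outlier size
print(l1_err, fro_err)
```

Because the squared Frobenius error penalizes the outlier quadratically, a Frobenius-optimal fit is pulled toward the corrupted entry more strongly than an $\ell_1$-optimal fit would be.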

We consider the problem of estimating a Fourier-sparse signal from noisy samples, where the sampling is done over some interval $[0, T]$ and the frequencies can be "off-grid". Previous methods for this problem required the gap between frequencies to be above 1/T, the threshold required to robustly identify individual frequencies. We show the frequency gap is not necessary to estimate the signal as a whole: for arbitrary $k$-Fourier-sparse signals under $\ell_2$ bounded noise, we show how to estimate the signal with a constant factor growth of the noise and sample complexity polynomial in $k$ and logarithmic in the bandwidth and signal-to-noise ratio.

In recent years, a number of works have studied methods for computing the Fourier transform in sublinear time if the output is sparse. Most of these have focused on the discrete setting, even though in many applications the input signal is continuous and naive discretization significantly worsens the sparsity level. We present an algorithm for robustly computing sparse Fourier transforms in the continuous setting.

We present two improved algorithms for the weighted discrete $p$-center problem for tree networks with $n$ vertices. One of our proposed algorithms runs in $O(n \log n + p \log^2 n \log(n/p))$ time. For all values of $p$, our algorithm thus runs as fast as or faster than the most efficient $O(n\log^2 n)$ time algorithm obtained by applying Cole's speed-up technique [cole1987] to the algorithm due to Megiddo and Tamir [megiddo1983], which has remained unchallenged for nearly 30 years.
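
One way to check the "as fast as or faster" comparison, assuming base-2 logarithms and $1 \leq p \leq n$ (the substitution $x = p/n$ is introduced here for illustration and is not from the abstract): since $x \log_2(1/x) \leq \log_2(e)/e < 1$ for $x \in (0,1]$, $$p \log(n/p) = n \cdot x \log_2(1/x) < n, \qquad \textrm{hence} \qquad n \log n + p \log^2 n \log(n/p) < n \log n + n \log^2 n = O(n \log^2 n).$$ At $p = n$ the second term vanishes and the bound is $O(n \log n)$; when $p = o(n)$, $x \log_2(1/x) \to 0$ and the bound is $o(n \log^2 n)$.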

Consider a large database of $n$ data items that need to be stored using $m$ servers. We study how to encode information so that a large number $k$ of read requests can be performed in parallel while the rate remains constant (and ideally approaches one). This problem is equivalent to the design of multiset Batch Codes introduced by Ishai, Kushilevitz, Ostrovsky and Sahai [17].

In this work we present a flexible, probabilistic and reference-free method of error correction for high throughput DNA sequencing data. The key is to exploit the high coverage of sequencing data and model short sequence outputs as independent realizations of a Hidden Markov Model (HMM). We pose the problem of error correction of reads as one of maximum likelihood sequence detection over this HMM.

We propose a Bayesian expectation-maximization (EM) algorithm for reconstructing Markov-tree sparse signals via belief propagation. The measurements follow an underdetermined linear model where the regression-coefficient vector is the sum of an unknown approximately sparse signal and a zero-mean white Gaussian noise with an unknown variance. The signal is composed of large- and small-magnitude components identified by binary state variables whose probabilistic dependence structure is described by a Markov tree.