Xiang Yu - National University of Singapore

Pub Categories

Statistics - Machine Learning (14)
Computer Science - Learning (10)
Quantum Physics (7)
Computer Science - Computer Vision and Pattern Recognition (5)
Mathematics - Optimization and Control (5)
Mathematics - Probability (4)
Physics - Superconductivity (3)
Mathematics - Mathematical Physics (3)
Mathematical Physics (3)
Physics - Strongly Correlated Electrons (3)
Computer Science - Cryptography and Security (3)
Statistics - Applications (3)
Computer Science - Artificial Intelligence (2)
Physics - Materials Science (2)
Statistics - Methodology (1)
Computer Science - Numerical Analysis (1)
Physics - Soft Condensed Matter (1)
Physics - Computational Physics (1)
Computer Science - Computers and Society (1)
Mathematics - Dynamical Systems (1)
Mathematics - Information Theory (1)
Physics - Other (1)
Statistics - Theory (1)
Mathematics - Statistics (1)
Physics - Optics (1)
Physics - Classical Physics (1)
Physics - Instrumentation and Detectors (1)
Computer Science - Information Theory (1)
Physics - Mesoscopic Systems and Quantum Hall Effect (1)
Nuclear Experiment (1)

Publications Authored By Xiang Yu

Despite recent advances in face recognition using deep learning, severe accuracy drops are observed for large pose variations in unconstrained environments. Learning pose-invariant features is one solution, but needs expensively labeled large scale data and carefully designed feature learning algorithms. In this work, we focus on frontalizing faces in the wild under various head poses, including extreme profile views. Read More

Plumbene, similar to silicene, has a buckled honeycomb structure with a large band gap ($\sim 400$ meV). All previous studies have shown that it is a normal insulator. Here, we perform first-principles calculations and employ a sixteen-band tight-binding model with nearest-neighbor and next-nearest-neighbor hopping terms to investigate electronic structures and topological properties of the plumbene monolayer. Read More

Deep neural networks (DNNs) trained on large-scale datasets have recently achieved impressive improvements in face recognition. But a persistent challenge remains to develop methods capable of handling large pose variations that are relatively under-represented in training data. This paper presents a method for learning a feature representation that is invariant to pose, without requiring extensive pose coverage in training data. Read More

Monocular 3D object parsing is highly desirable in various scenarios including occlusion reasoning and holistic scene interpretation. We present a deep convolutional neural network (CNN) architecture to localize semantic parts in 2D image and 3D space while inferring their visibility states, given a single RGB image. Our key insight is to exploit domain knowledge to regularize the network by deeply supervising its hidden layers, in order to sequentially infer intermediate concepts associated with the final task. Read More

We consider the problem of off-policy evaluation---estimating the value of a target policy using data collected by another policy---under the contextual bandit model. We establish a minimax lower bound on the mean squared error (MSE), and show that it is matched up to constant factors by the inverse propensity scoring (IPS) estimator. Since in the multi-armed bandit problem the IPS is suboptimal (Li et. Read More

In this paper we describe an algorithm for predicting the websites at risk in a long range hacking activity, while jointly inferring the provenance and evolution of vulnerabilities on websites over continuous time. Specifically, we use hazard regression with a time-varying additive hazard function parameterized in a generalized linear form. The activation coefficients on each feature are continuous-time functions constrained with total variation penalty inspired by hacking campaigns. Read More

We combine fine-grained spatially referenced census data with the vote outcomes from the 2016 US presidential election. Using this dataset, we perform ecological inference using distribution regression (Flaxman et al., KDD 2015) with a multinomial-logit regression so as to model the vote outcome (Trump, Clinton, Other / Didn't vote) as a function of demographic and socioeconomic features. Ecological inference allows us to estimate "exit poll" style results like what was Trump's support among white women, but for entirely novel categories. Read More

In this paper we describe an algorithm for estimating the provenance of hacks on websites. That is, given properties of sites and the temporal occurrence of attacks, we are able to attribute individual attacks to joint causes and vulnerabilities, as well as estimating the evolution of these vulnerabilities over time. Specifically, we use hazard regression with a time-varying additive hazard function parameterized in a generalized linear form. Read More

Subspace clustering is the problem of partitioning unlabeled data points into a number of clusters so that data points within one cluster lie approximately on a low-dimensional linear subspace. In many practical scenarios, the dimensionality of the data points to be clustered is reduced due to constraints of measurement, computation or privacy. In this paper, we study the theoretical properties of a popular subspace clustering algorithm named sparse subspace clustering (SSC) and establish formal success conditions of SSC on dimensionality-reduced data. Read More

We consider the problem of estimating a function defined over $n$ locations on a $d$-dimensional grid (having all side lengths equal to $n^{1/d}$). When the function is constrained to have discrete total variation bounded by $C_n$, we derive the minimax optimal (squared) $\ell_2$ estimation error rate, parametrized by $n$ and $C_n$. Total variation denoising, also known as the fused lasso, is seen to be rate optimal. Read More

We define On-Average KL-Privacy and present its properties and connections to differential privacy, generalization and information-theoretic quantities including max-information and mutual information. The new definition significantly weakens differential privacy, while preserving its minimalistic design features such as composition over small group and multiple queries as well as closeness to post-processing. Moreover, we show that On-Average KL-Privacy is **equivalent** to generalization for a large class of commonly-used tools in statistics and machine learning that sample from Gibbs distributions---a class of distributions that arises naturally from the maximum entropy principle. Read More

We propose a novel cascaded framework, namely deep deformation network (DDN), for localizing landmarks in non-rigid objects. The hallmarks of DDN are its incorporation of geometric constraints within a convolutional neural network (CNN) framework, ease and efficiency of training, as well as generality of application. A novel shape basis network (SBN) forms the first stage of the cascade, whereby landmarks are initialized by combining the benefits of CNN features and a learned shape basis to reduce the complexity of the highly nonlinear pose manifold. Read More

In adaptive data analysis, the user makes a sequence of queries on the data, where at each step the choice of query may depend on the results in previous steps. The releases are often randomized in order to reduce overfitting for such adaptively chosen queries. In this paper, we propose a minimax framework for adaptive data analysis. Read More

Direct state tomography (DST) using weak measurements has received wide attention. Based on the concept of coupling-deformed pointer observables presented by Zhang et al. [Phys. Read More

In the optomechanical cooling of a dispersively coupled oscillator, it is only possible to reach the oscillator ground state in the resolved sideband regime, where the cavity-mode line width is smaller than the resonant frequency of the mechanical oscillator being cooled. In this paper, we show that the dispersively coupled system can be cooled to the ground state in the unresolved sideband regime using an ancillary oscillator, which is coupled to the same optical mode via dissipative interaction. The ancillary oscillator has a resonant frequency close to that of the target oscillator; thus, the ancillary oscillator is also in the unresolved sideband regime. Read More

In this letter, acoustic interaction between cascade sub-chambers is investigated by modelling the sound field in a silencer with cascade-connected sub-chambers using a sub-structuring technique. The contribution of the acoustic coupling to the net energy flow through each individual sub-chamber is derived quantitatively and the mechanism by which evanescence contributes to the sound transmission loss of the silencer is revealed. Read More

Time-reversal (T) invariant topological insulators are widely recognized as one of the fundamental discoveries in condensed matter physics, for which the most fascinating hallmark is perhaps a spin-based topological protection, the total cancellation of scattering of conduction electrons with certain spins on the surface of matter. Recently, this has created a paradigm shift for topological insulators, from electronics to photonics, phononics as well as mechanics, bringing about not only involved new physics but also potential applications in robust wave transport. Despite the growing interest in realizing topologically protected acoustic wave transport, a T-invariant acoustic topological insulator has not yet been achieved. Read More

The influence of outside quantum noises on the amplification of weak measurements is investigated. Three typical quantum noises are discussed. The maximum values of the pointer's shifts decrease sharply with the strength of the depolarizing channel and phase damping. Read More

Differentially private collaborative filtering is a challenging task, both in terms of accuracy and speed. We present a simple algorithm that is provably differentially private, while offering good performance, using a novel connection of differential privacy to Bayesian posterior sampling via Stochastic Gradient Langevin Dynamics. Due to its simplicity the algorithm lends itself to efficient implementation. Read More

While the novel applications of weak values have recently attracted wide attention, weak measurement, the usual way to extract weak values, suffers from risky approximations and severe quantum noises. In this paper, we show that the weak-value information can be obtained exactly in strong measurement with postselections, via measuring the coupling-deformed pointer observables, i.e. Read More

In the readout electronics of the Water Cerenkov Detector Array (WCDA) in the Large High Altitude Air Shower Observatory (LHAASO) experiment, both high-resolution charge and time measurement are required over a dynamic range from 1 photoelectron (P.E.) to 4000 P. Read More

Subspace clustering is the problem of clustering data points into a union of low-dimensional linear/affine subspaces. It is the mathematical abstraction of many important problems in computer vision, image processing and machine learning. A line of recent work (4, 19, 24, 20) provided strong theoretical guarantees for sparse subspace clustering (4), the state-of-the-art algorithm for subspace clustering, on both noiseless and noisy data sets. Read More

This paper studies the utility maximization problem on the terminal wealth with both random endowments and proportional transaction costs. To deal with unbounded random payoffs from some illiquid claims, we propose to work with the acceptable portfolios defined via the consistent price system (CPS) such that the liquidation value processes stay above some stochastic thresholds. In the market consisting of one riskless bond and one risky asset, we obtain a type of the super-hedging result. Read More

Cities comprise various functional zones, including residential, educational, commercial zones, etc. It is important for urban planners to identify different functional zones and understand their spatial structure within the city in order to make better urban plans. In this research, we used 77976010 bus smart card records of Beijing City in one week in April 2008 and converted them into two-dimensional time series data of each bus platform. Then, through data mining in the big database system and previous studies on citizens' trip behavior, we established the DZoF (discovering zones of different functions) model based on SCD (smart card data) and POIs (points of interest), and pooled the results at the TAZ (traffic analysis zone) level. Read More

We consider the problem of Bayesian learning on sensitive datasets and present two simple but somewhat surprising results that connect Bayesian learning to "differential privacy", a cryptographic approach to protect individual-level privacy while permitting database-level utility. Specifically, we show that under standard assumptions, getting one single sample from a posterior distribution is differentially private "for free". We will see that estimator is statistically consistent, near optimal and computationally tractable whenever the Bayesian model of interest is consistent, optimal and tractable. Read More

While machine learning has proven to be a powerful data-driven solution to many real-life problems, its use in sensitive domains has been limited due to privacy concerns. A popular approach known as **differential privacy** offers provable privacy guarantees, but it is often observed in practice that it could substantially hamper learning accuracy. In this paper we study the learnability (whether a problem can be learned by any algorithm) under Vapnik's general learning setting with differential privacy constraint, and reveal some intricate relationships between privacy, stability and learnability. Read More

The uncertainty principle is often interpreted as the tradeoff between the error of a measurement and the consequent disturbance to subsequent ones, an interpretation that originated long ago with Heisenberg himself but is now being reexamined and even hotly debated. Here we show that the tradeoff is switched on or off by the quantum uncertainties of the two non-commuting observables involved: if one is more certain than the other, there is no tradeoff; otherwise, there is a tradeoff, and the Jensen-Shannon divergence characterizes it well. Read More

We introduce a family of adaptive estimators on graphs, based on penalizing the $\ell_1$ norm of discrete graph differences. This generalizes the idea of trend filtering [Kim et al. (2009), Tibshirani (2014)], used for univariate nonparametric regression, to graphs. Read More

We develop parallel and distributed Frank-Wolfe algorithms; the former on shared memory machines with mini-batching, and the latter in a delayed update framework. Whenever possible, we perform computations asynchronously, which helps attain speedups on multicore machines as well as in distributed environments. Moreover, instead of worst-case bounded delays, our methods only depend (mildly) on expected delays, allowing them to be robust to stragglers and faulty worker threads. Read More

This paper studies the optimal consumption under the addictive habit formation preference in markets with transaction costs and unbounded random endowments. To model the proportional transaction costs, we adopt Kabanov's multi-asset framework with a cash account. At the terminal time T, the investor can receive unbounded random endowments for which we propose a new definition of acceptable portfolios based on the strictly consistent price system (SCPS). Read More

We present the orbital resolved electronic properties of structurally distorted 1T-TaS2 monolayers. After optimizing the crystal structures, we obtain the lattice parameters and atomic positions in the star-of-David structure, and show the low-temperature band structures of distorted bulk are consistent with recent angle resolved photoemission spectroscopy (ARPES) data. We further clearly demonstrate that $5d$ electrons of Ta form ordered orbital-density-wave (ODW) state with dominant $5d_{3{z}^2-{r}^2}$ character in central Ta, driving the one-dimensional metallic state in paramagnetic bulk and half-filled insulator in monolayer. Read More

In this paper, we study the necessary conditions and sufficient conditions for the central configurations formed by two twisted regular polygons (one N-regular polygon and one L-regular polygon). We wish to extend the results on the symmetrical central configurations formed by two twisted N-regular polygons; however, it will be proved that the more general setting admits no central configurations beyond those already considered in more particular situations. Read More

We study a novel spline-like basis, which we name the "falling factorial basis", bearing many similarities to the classic truncated power basis. The advantage of the falling factorial basis is that it enables rapid, linear-time computations in basis matrix multiplication and basis matrix inversion. The falling factorial functions are not actually splines, but are close enough to splines that they provably retain some of the favorable properties of the latter functions. Read More

The electronic structure and magnetism of LiFeO$_{2}$Fe$_{2}$Se$_{2}$ are investigated using first-principles calculations. The ground state is a Néel antiferromagnetic (AFM) Mott insulating state for Fe1 with localized magnetism in the LiFeO$_{2}$ layer and a striped AFM metallic state for Fe2 with itinerant magnetism in the Fe$_{2}$Se$_{2}$ layer, accompanied by a weak interlayer AFM coupling between Fe1 and Fe2 ions, resulting in a coexistence of localized and itinerant magnetism. Moreover, the layered LiFeO$_{2}$ is found to be not merely an insulating block layer but also responsible for enhanced AFM correlation in the Fe$_{2}$Se$_{2}$ layer through the interlayer magnetic coupling. Read More

The electronic and magnetic properties of BaTi$_{2}$As$_{2}$O have been investigated using both first-principles and analytical methods. The full-potential linearized augmented plane-wave calculations show that the most stable state is a site-selective antiferromagnetic (AFM) metal with a $2\times 1\times 1$ magnetic unit cell containing two nonmagnetic Ti atoms and two other Ti atoms with antiparallel moments. Further analysis of the Fermi surface and spin susceptibility shows that the site-selective AFM ground state is driven by the Fermi surface nesting and the Coulomb correlation. Read More

When we use variational methods to study the Newtonian $N$-body problem, the main problem is how to avoid collisions. C. Marchal obtained a remarkable result: a path minimizing the Lagrangian action functional between two given configurations is always a true (collision-free) solution, so long as the dimension $d$ of physical space $\mathbb{R}^d$ satisfies $d\geq2$. Read More

This paper studies the market viability with proportional transaction costs. Instead of requiring the existence of strictly consistent price systems (SCPS) as in the literature, we show that strictly consistent local martingale systems (SCLMS) can successfully serve as the dual elements such that the market viability can be verified. We introduce two weaker notions of no arbitrage conditions on market models named no unbounded profit with bounded risk (NUPBR) and no local arbitrage with bounded portfolios (NLABP). Read More

As the method to completely characterize quantum dynamical processes, quantum process tomography (QPT) is vitally important for quantum information processing and quantum control, where the faithfulness of quantum devices plays an essential role. Here via weak measurements, we present a new QPT scheme characterized by its directness and parallelism. Compared with the existing schemes, our scheme needs a simpler state preparation and far fewer experimental setups. Read More

Low-rank matrix completion is a problem of immense practical importance. Recent works on the subject often use nuclear norm as a convex surrogate of the rank function. Despite its solid theoretical foundation, the convex version of the problem often fails to work satisfactorily in real-life applications. Read More

This paper considers the problem of subspace clustering under noise. Specifically, we study the behavior of Sparse Subspace Clustering (SSC) when either adversarial or random noise is added to the unlabelled input data points, which are assumed to be in a union of low-dimensional subspaces. We show that a modified version of SSC is provably effective in correctly identifying the underlying subspaces, even with noisy data. Read More

By using an arithmetic fact, we first prove Saari's conjecture in a particular case, which is called the Elliptical Type N-Body Problem, and then we apply it to prove that the variational minimal solution of the planar Newtonian N-body problem is precisely a relative equilibrium solution whose configuration minimizes the function $IU^2$. It is worth noting that we do not need the hypothesis of Finiteness of Central Configurations. In the Planetary Restricted Problem (which ignores all the mutual gravitational interactions between the planets), the corresponding Saari's conjecture is stated and proved. Read More

In this paper we analyze the ground-state and finite-temperature properties of a frustrated Heisenberg $J_1-J_2$ model on a honeycomb lattice by employing the Schwinger boson technique. The phase diagram and spin gap as functions of ${J}_{2}/{J}_{1}$ are presented, showing that the exotic spin liquid phase lies in $0.21<{J}_{2}/{J}_{1} <0. Read More

In this paper we investigate the electronic and magnetic properties of K$_{x}$Fe$_{2-y}$Se$_{2}$ materials at different band fillings utilizing the multi-orbital Kotliar-Ruckenstein's slave-boson mean field approach. We find that at three-quarter filling, corresponding to KFe$_{2}$Se$_{2}$, the ground state is a paramagnetic bad metal. Through band renormalization analysis and comparison with the angle-resolved photoemission spectra data, we identify that KFe$_{2}$Se$_{2}$ is also an intermediate correlated system, similar to iron-pnictide systems. Read More


We study the stability vis-à-vis adversarial noise of the matrix factorization algorithm for matrix completion. In particular, our results include: (I) we bound the gap between the solution matrix of the factorization method and the ground truth in terms of root mean square error; (II) we treat the matrix factorization as a subspace fitting problem and analyze the difference between the solution subspace and the ground truth; (III) we analyze the prediction error of individual users based on the subspace stability. We apply these results to the problem of collaborative filtering under manipulator attack, which leads to useful insights and guidelines for collaborative filtering system design. Read More

We study the precise phase estimation using squeezed states with photon losses present. Our exact quantum Fisher information calculation shows significant quantum enhancement and thus reveals the benchmark for practical quantum metrology in this noisy scenario. However, we find that the existing parity measurement scheme [P. Read More

We consider a model of optimal investment and consumption with both habit formation and partial observations in an incomplete Itô process market. The investor chooses his consumption under the addictive habits constraint while only observing the market stock prices but not the instantaneous rate of return. Applying the Kalman-Bucy filtering theorem and dynamic programming arguments, we solve the associated Hamilton-Jacobi-Bellman (HJB) equation explicitly for the path dependent stochastic control problem in the case of power utilities. Read More

This paper studies the continuous time utility maximization problem on consumption with addictive habit formation in incomplete semimartingale markets. Introducing the set of auxiliary state processes and the modified dual space, we embed our original problem into a time-separable utility maximization problem with a shadow random endowment on the product space $\mathbb{L}_+^0(\Omega\times [0,T],\mathcal{O},\overline{\mathbb{P}})$. Existence and uniqueness of the optimal solution are established using convex duality approach, where the primal value function is defined on two variables, that is, the initial wealth and the initial standard of living. Read More

Molecular dynamics (MD) simulation is a powerful computational tool to study the behavior of macromolecular systems. But many simulations in this field are limited in spatial or temporal scale by the available computational resources. In recent years, graphics processing units (GPUs) have provided unprecedented computational power for scientific applications. Read More