Hao Su - Stanford University

Hao Su

Contact Details

Name: Hao Su
Affiliation: Stanford University
City: Stanford
Country: United States

Pub Categories

 
Computer Science - Computer Vision and Pattern Recognition (14)
Statistics - Machine Learning (2)
Computer Science - Learning (2)
Computer Science - Robotics (2)
Computer Science - Artificial Intelligence (2)
Computer Science - Graphics (2)
Mathematics - Representation Theory (2)
Statistics - Computation (1)
Computer Science - Computational Geometry (1)

Publications Authored By Hao Su

We propose a method for converting geometric shapes into hierarchically segmented parts with part labels. Our key idea is to train category-specific models from the scene graphs and part names that accompany 3D shapes in public repositories. These freely available annotations represent an enormous, untapped source of information on geometry.

We consider the non-Lambertian object intrinsic problem of recovering diffuse albedo, shading, and specular highlights from a single image of an object. We build a large-scale object intrinsics database based on existing 3D models in the ShapeNet database. Rendered with realistic environment maps, millions of synthetic images of objects and their corresponding albedo, shading, and specular ground-truth images are used to train an encoder-decoder CNN.

Important high-level vision tasks such as human-object interaction, image captioning, and robotic manipulation require rich semantic descriptions of objects at the part level. Building on previous work on part localization, we address the problem of inferring the rich semantics imparted by an object part in still images. We propose to tokenize the semantic space as a discrete set of part states.

A point cloud is an important type of geometric data structure. Because of its irregular format, most researchers transform such data into regular 3D voxel grids or collections of images. This, however, renders data unnecessarily voluminous and causes issues.

Generation of 3D data by deep neural networks has been attracting increasing attention in the research community. The majority of extant works resort to regular representations such as volumetric grids or collections of images; however, these representations obscure the natural invariance of 3D shapes under geometric transformations and also suffer from a number of other issues. In this paper we address the problem of 3D reconstruction from a single image, generating a straightforward form of output: point cloud coordinates.

In this paper, we study the problem of semantic annotation on 3D models that are represented as shape graphs. A functional view is taken to represent localized information on graphs, so that annotations such as part segments or keypoints are nothing but 0-1 indicator vertex functions. Compared with images that are 2D grids, shape graphs are irregular and non-isomorphic data structures.

We present a learning framework for abstracting complex shapes by learning to assemble objects using 3D volumetric primitives. In addition to generating simple and geometrically interpretable explanations of 3D objects, our framework also allows us to automatically discover and exploit consistent structure in the data. We demonstrate that our method predicts shape representations that can be leveraged to obtain a consistent parsing across the instances of a shape collection and to construct an interpretable shape similarity measure.

Given a simple-minded collection in the derived category of a non-positive dg algebra with finite-dimensional total cohomology, we construct a silting object via Koszul duality.

Building discriminative representations for 3D data has been an important task in computer graphics and computer vision research. Convolutional Neural Networks (CNNs) have been shown to operate on 2D images with great success for a variety of tasks. Lifting convolution operators to 3D (3DCNNs) seems like a plausible and promising next step.

3D shape models are becoming widely available and easier to capture, making available 3D information crucial for progress in object classification. Current state-of-the-art methods rely on CNNs to address this problem. Recently, two types of CNNs have been developed: CNNs based upon volumetric representations and CNNs based upon multi-view representations.

Human 3D pose estimation from a single image is a challenging task with numerous applications. Convolutional Neural Networks (CNNs) have recently achieved superior performance on the task of 2D pose estimation from a single image, by training on images with 2D annotations collected by crowdsourcing. This suggests that similar success could be achieved for direct estimation of 3D poses.

Let A be a path A-infinity-algebra over a positively graded quiver Q. It is proved that the derived category of A is triangulated equivalent to the derived category of kQ, which is viewed as a dg algebra with trivial differential. The main technique used in the proof is Koszul duality.

We present ShapeNet: a richly-annotated, large-scale repository of shapes represented by 3D CAD models of objects. ShapeNet contains 3D models from a multitude of semantic categories and organizes them under the WordNet taxonomy. It is a collection of datasets providing many semantic annotations for each 3D model such as consistent rigid alignments, parts and bilateral symmetry planes, physical sizes, keywords, as well as other planned annotations.

Given i.i.d. samples from some unknown continuous density on the hyper-rectangle $[0, 1]^d$, we attempt to learn a piecewise constant function that approximates this underlying density non-parametrically.

In this paper, we proposed a pose estimation system based on a training set of rendered images, which predicts the pose of objects in real images, given knowledge of the object category and a tight bounding box. We developed a patch-based multi-class classification algorithm and an iterative approach to improve the accuracy. We achieved state-of-the-art performance on the pose estimation task.

Object viewpoint estimation from 2D images is an essential task in computer vision. However, two issues hinder its progress: scarcity of training data with viewpoint annotations, and a lack of powerful features. Inspired by the growing availability of 3D models, we propose a framework to address both issues by combining render-based image synthesis and CNNs.

Comparing two images in a view-invariant way has long been a challenging problem in computer vision, as visual features are not stable under large viewpoint changes. In this paper, given a single input image of an object, we synthesize new features for other views of the same object. To accomplish this, we introduce an aligned set of 3D models in the same class as the input object image.

Divergence is not only an important mathematical concept in information theory, but is also applied to machine learning problems such as low-dimensional embedding, manifold learning, clustering, classification, and anomaly detection. We proposed a Bayesian model, co-BPM, to characterize the discrepancy between two sample sets.

The ImageNet Large Scale Visual Recognition Challenge is a benchmark in object category classification and detection on hundreds of object categories and millions of images. The challenge has been run annually from 2010 to present, attracting participation from more than fifty institutions. This paper describes the creation of this benchmark dataset and the advances in object recognition that have been possible as a result.

Jun 2012
Affiliations: (1) The University of Hong Kong, (2) Stanford University, (3) Stanford University

Using sparsity-inducing norms to learn robust models has received increasing attention from many fields for its attractive properties. Projection-based methods have been widely applied to learning tasks constrained by such norms. As a key building block of these methods, an efficient operator for Euclidean projection onto the intersection of $\ell_1$ and $\ell_{1,q}$ norm balls $(q = 2 \text{ or } \infty)$ is proposed in this paper.