Learning Data-driven Reflectance Priors for Intrinsic Image Decomposition

We propose a data-driven approach for intrinsic image decomposition, which is the process of inferring the confounding factors of reflectance and shading in an image. We pose this as a two-stage learning problem. First, we train a model to predict relative reflectance ordering between image patches ('brighter', 'darker', 'same') from large-scale human annotations, producing a data-driven reflectance prior. Second, we show how to naturally integrate this learned prior into existing energy minimization frameworks for intrinsic image decomposition. We compare our method to the state-of-the-art approach of Bell et al. on both decomposition and image relighting tasks, demonstrating the benefits of the simple relative reflectance prior, especially for scenes under challenging lighting conditions.
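
The second stage can be illustrated with a minimal sketch of how a three-way ordering prediction might enter an energy as a pairwise term. Everything here is an assumption made for illustration and is not taken from the paper: the function names, the use of log-reflectance values, the hinge-style penalties, and the `margin` parameter are all hypothetical, and the actual formulation may differ.

```python
# Hypothetical pairwise reflectance prior; NOT the paper's formulation.
# Labels describe patch i relative to patch j.

def pairwise_prior_energy(r_i, r_j, probs, margin=0.1):
    """Penalty for log-reflectances r_i, r_j given classifier probabilities.

    probs maps 'same', 'darker', 'brighter' to probabilities (assumed to
    come from a learned classifier). margin is an illustrative constant.
    """
    same = abs(r_i - r_j)                     # 'same': values should match
    darker = max(0.0, r_i - r_j + margin)     # 'darker': want r_i < r_j
    brighter = max(0.0, r_j - r_i + margin)   # 'brighter': want r_i > r_j
    return (probs["same"] * same
            + probs["darker"] * darker
            + probs["brighter"] * brighter)


def total_prior_energy(log_reflectance, pairs, margin=0.1):
    """Sum the pairwise penalty over (i, j, probs) triples."""
    return sum(
        pairwise_prior_energy(log_reflectance[i], log_reflectance[j], p, margin)
        for i, j, p in pairs
    )


if __name__ == "__main__":
    r = [-1.0, 0.0]
    confident_darker = {"same": 0.0, "darker": 1.0, "brighter": 0.0}
    # Patch 0 is darker than patch 1, matching the prediction.
    print(total_prior_energy(r, [(0, 1, confident_darker)]))
```

In a sketch like this, the term would be added to the other terms of an existing decomposition energy, so that reflectance estimates contradicting confident ordering predictions cost more.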

Comments: International Conference on Computer Vision (ICCV) 2015

Similar Publications

Numerous deep learning applications benefit from multi-task learning with multiple regression and classification objectives. In this paper we make the observation that the performance of such systems is strongly dependent on the relative weighting between each task's loss. Tuning these weights by hand is a difficult and expensive process, making multi-task learning prohibitive in practice.

Computational photography encompasses a diversity of imaging techniques, but one of the core operations performed by many of them is to compute image differences. An intuitive approach to computing such differences is to capture several images sequentially and then process them jointly. Usually, this approach leads to artifacts when recording dynamic scenes.

Given a large unlabeled set of images, how to efficiently and effectively group them into clusters based on the extracted visual representations remains a challenging problem. To address this problem, we propose a convolutional neural network (CNN) to jointly solve clustering and representation learning in an iterative manner. In the proposed method, given an input image set, we first randomly pick k samples and extract their features as initial cluster centroids using the proposed CNN with an initial model pre-trained from the ImageNet dataset.

A cellular automaton (CA) is a discrete model in which each cell takes a state from a finite set of possible values, and the states evolve in time according to a predefined set of transition rules. CA have been applied to a number of image processing tasks, such as convex hull detection and image denoising, but mostly with the input restricted to binary images.

Recently, deep convolutional neural networks (DCNNs) have achieved increasingly remarkable success and developed rapidly in the field of natural image recognition. Compared with natural images, remote sensing images are larger in scale, and the scenes and objects they represent are more macroscopic. This study asks whether remote sensing scene recognition and natural scene recognition differ and raises the following questions: What are the key factors in remote sensing scene recognition? Is the DCNN recognition mechanism centered on object recognition still applicable to remote sensing scene understanding? We performed several experiments to explore the influence of the DCNN structure and the scale of remote sensing scene understanding from the perspective of scene complexity.

Automatic 3-D registration of Magnetic Resonance Imaging (MRI) and Positron Emission Tomography (PET) is implemented and validated for small animal image volumes, so that high-resolution anatomical MRI information can be fused with low-spatial-resolution functional PET information for lesion localization, which is currently in high demand in tumor studies (oncology) and the corresponding preparation of pharmaceutical drugs. Although many registration algorithms have been developed and applied to human brain volumes, these methods may not be as efficient on small animal datasets because of the lack of intensity information and the often high anisotropy in voxel dimensions. Therefore, a fully automatic registration algorithm is developed that can register not only small animal volumes assumed to be rigid, such as the brain, but also deformable organs such as the kidney, heart, and chest, using a combination of global affine and local B-spline transformation models with mutual information as the similarity criterion.
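
Mutual information as a similarity criterion can be illustrated with a small histogram-based estimate. This is a generic sketch, not the authors' implementation: it assumes the two volumes have already been resampled onto the same grid and flattened into equal-length lists of intensities normalized to [0, 1], and the function name, bin count, and intensity range are hypothetical choices.

```python
import math
from collections import Counter


def mutual_information(xs, ys, bins=16, lo=0.0, hi=1.0):
    """Histogram estimate of mutual information (in nats).

    xs, ys: equal-length sequences of paired voxel intensities,
    assumed to lie in [lo, hi].
    """
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    width = (hi - lo) / bins

    def bin_of(v):
        # Clamp so that v == hi falls into the last bin.
        return min(bins - 1, max(0, int((v - lo) / width)))

    # Joint histogram over paired intensity bins.
    joint = Counter((bin_of(x), bin_of(y)) for x, y in zip(xs, ys))

    # Marginal histograms obtained by summing the joint counts.
    px, py = Counter(), Counter()
    for (i, j), c in joint.items():
        px[i] += c
        py[j] += c

    # Sum p(x,y) * log(p(x,y) / (p(x) p(y))) over non-empty bins.
    return sum(
        (c / n) * math.log(c * n / (px[i] * py[j]))
        for (i, j), c in joint.items()
    )


if __name__ == "__main__":
    ramp = [k / 100 for k in range(100)]
    print(mutual_information(ramp, ramp, bins=10))         # high: identical
    print(mutual_information(ramp, [0.5] * 100, bins=10))  # zero: constant
```

A registration loop would evaluate such a score for each candidate transformation and keep the transformation that maximizes it.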

In this work, we explain in detail how receptive fields, effective receptive fields, and projective fields of neurons in different layers, convolution or pooling, of a Convolutional Neural Network (CNN) are calculated. While our focus here is on CNNs, the same operations, but in the reverse order, can be used to calculate these quantities for deconvolutional neural networks. These are important concepts, not only for better understanding and analyzing convolutional and deconvolutional networks, but also for optimizing their performance in real-world applications.
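
The receptive field size of a stack of convolution and pooling layers can be sketched with the commonly used layer-by-layer recurrence. This is a generic illustration and may not match the notation or scope of the work above: it assumes square kernels, ignores padding and dilation, and the `(kernel_size, stride)` layer description and function name are hypothetical.

```python
def receptive_fields(layers):
    """Receptive field size and jump after each layer.

    layers: list of (kernel_size, stride) tuples, ordered from the
    input toward the output. Convolution and pooling layers are
    treated the same way. Padding and dilation are ignored.

    Returns a list of (receptive_field, jump) tuples, where jump is
    the distance in input pixels between adjacent output units.
    """
    rf, jump = 1, 1  # a single input pixel sees itself
    result = []
    for kernel_size, stride in layers:
        # Each extra kernel tap widens the field by the current jump.
        rf = rf + (kernel_size - 1) * jump
        # Striding spreads adjacent outputs further apart in the input.
        jump = jump * stride
        result.append((rf, jump))
    return result


if __name__ == "__main__":
    # 3x3 conv (stride 1), 2x2 pool (stride 2), 3x3 conv (stride 1)
    for rf, jump in receptive_fields([(3, 1), (2, 2), (3, 1)]):
        print(rf, jump)
```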

Previous studies by our group have shown that three-dimensional high-frequency quantitative ultrasound methods have the potential to differentiate metastatic lymph nodes from cancer-free lymph nodes dissected from human cancer patients. To successfully perform these methods inside the lymph node parenchyma, an automatic segmentation method is highly desired to exclude the surrounding thin layer of fat from quantitative ultrasound processing and accurately correct for ultrasound attenuation. In high-frequency ultrasound images of lymph nodes, the intensity distribution of lymph node parenchyma and fat varies spatially because of acoustic attenuation and focusing effects.

We describe the DeepMind Kinetics human action video dataset. The dataset contains 400 human action classes, with at least 400 video clips for each action. Each clip lasts around 10s and is taken from a different YouTube video.

In order to make hyperspectral image classification computationally tractable, it is often necessary to select the most informative bands instead of processing the whole data, without losing the geometrical representation of the original data. To cope with this issue, an improved unsupervised non-linear deep autoencoder (UDAE) based band selection method is proposed. The proposed UDAE is able to select the most informative bands in a way that preserves the key information in lower dimensions, where the hidden representation is a non-linear transformation that maps the original space to a space of lower dimensions.