3D Scene Understanding from a Single Image

3D Scene Understanding from a Single Image

Author : Wei Zeng
Publisher :
ISBN 13 : 9789493197602
Total Pages : 101 pages
Book Rating : 4.1/5 (976 download)

Released : 2021

Seeing the World Behind the Image

Author : Derek Hoiem
Publisher :
ISBN 13 :
Total Pages : 147 pages

Released : 2007

Abstract: "When humans look at an image, they see not just a pattern of color and texture, but the world behind the image. In the same way, computer vision algorithms must go beyond the pixels and reason about the underlying scene. In this dissertation, we propose methods to recover the basic spatial layout from a single image and begin to investigate its use as a foundation for scene understanding. Our spatial layout is a description of the 3D scene in terms of surfaces, occlusions, camera viewpoint, and objects. We propose a geometric class representation, a coarse categorization of surfaces according to their 3D orientations, and learn appearance-based models of geometry to identify surfaces in an image. These surface estimates serve as a basis for recovering the boundaries and occlusion relationships of prominent objects. We further show that simple reasoning about camera viewpoint and object size in the image allows accurate inference of the viewpoint and greatly improves object detection. Finally, we demonstrate the potential usefulness of our methods in applications to 3D reconstruction, scene synthesis, and robot navigation. Scene understanding from a single image requires strong assumptions about the world. We show that the necessary assumptions can be modeled statistically and learned from training data. Our work demonstrates the importance of robustness through a wide variety of image cues, multiple segmentations, and a general strategy of soft decisions and gradual inference of image structure. Above all, our work manifests the tremendous amount of 3D information that can be gleaned from a single image. Our hope is that this dissertation will inspire others to further explore how computer vision can go beyond pattern recognition and produce an understanding of the environment."
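The link between camera viewpoint and object size mentioned in the abstract can be illustrated with a standard ground-plane approximation. The sketch below is not taken from the dissertation. It assumes a roughly level camera, an object standing on the ground plane, and pixel rows that increase downward. The function name and all parameters are hypothetical.

```python
def estimate_object_height(camera_height_m, horizon_row, top_row, bottom_row):
    """Approximate the real-world height of an object standing on the ground.

    Assumes a roughly level camera and image rows that grow downward, so the
    object's base lies below the horizon (bottom_row > horizon_row).
    """
    rows_below_horizon = bottom_row - horizon_row
    if rows_below_horizon <= 0:
        raise ValueError("object base must lie below the horizon row")
    pixel_height = bottom_row - top_row
    # Similar triangles: the camera height spans the rows between the horizon
    # and the object's base, so the object's pixel height scales accordingly.
    return camera_height_m * pixel_height / rows_below_horizon


# An object whose top just reaches the horizon is about as tall as the camera is high.
print(estimate_object_height(1.6, horizon_row=200, top_row=200, bottom_row=400))
```

Read the other way, a detection whose implied height is implausible for its class (for example, a pedestrian that would be several meters tall) can be down-weighted, which is one simple way viewpoint reasoning can help object detection.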

Representations and Techniques for 3D Object Recognition and Scene Interpretation

Author : Derek Hoiem
Publisher : Morgan & Claypool Publishers
ISBN 13 : 1608457281
Total Pages : 172 pages
Book Rating : 4.6/5 (84 download)

Released : 2011

One of the grand challenges of artificial intelligence is to enable computers to interpret 3D scenes and objects from imagery. This book organizes and introduces major concepts in 3D scene and object representation and inference from still images, with a focus on recent efforts to fuse models of geometry and perspective with statistical machine learning. The book is organized into three sections: (1) Interpretation of Physical Space; (2) Recognition of 3D Objects; and (3) Integrated 3D Scene Interpretation. The first discusses representations of spatial layout and techniques to interpret physical scenes from images. The second section introduces representations for 3D object categories that account for the intrinsically 3D nature of objects and provide robustness to change in viewpoints. The third section discusses strategies to unite inference of scene geometry and object pose and identity into a coherent scene interpretation. Each section broadly surveys important ideas from cognitive science and artificial intelligence research, organizes and discusses key concepts and techniques from recent work in computer vision, and describes a few sample approaches in detail. Newcomers to computer vision will benefit from introductions to basic concepts, such as single-view geometry and image classification, while experts and novices alike may find inspiration from the book's organization and discussion of the most recent ideas in 3D scene understanding and 3D object recognition.
Specific topics include: mathematics of perspective geometry; visual elements of the physical scene, structural 3D scene representations; techniques and features for image and region categorization; historical perspective, computational models, and datasets and machine learning techniques for 3D object recognition; inferences of geometrical attributes of objects, such as size and pose; and probabilistic and feature-passing approaches for contextual reasoning about 3D objects and scenes.
Table of Contents: Background on 3D Scene Models / Single-view Geometry / Modeling the Physical Scene / Categorizing Images and Regions / Examples of 3D Scene Interpretation / Background on 3D Recognition / Modeling 3D Objects / Recognizing and Understanding 3D Objects / Examples of 2D 1/2 Layout Models / Reasoning about Objects and Scenes / Cascades of Classifiers / Conclusion and Future Directions

3D Scene Understanding

Author : Zhaoyin Jia
Publisher :
ISBN 13 :
Total Pages : 153 pages

Released : 2014

Segmentation is one of the fundamental computer vision problems and has been investigated for years. In this thesis, we present algorithms for RGB-D image segmentation and, more importantly, for the additional information that can be inferred from segmentations: depth ordering, 3D surfaces, occlusion boundaries and volumes of objects. All these clues lead to a more comprehensive 3D understanding of the scene as well as a higher-level RGB-D interpretation. In return, some of these clues can provide important feedback and improve the final scene segmentation performance. We start by performing 3D depth interpretation from 2D color images only. We discover that the segment shapes enable us to learn the depth orderings of the objects. Specifically, from the initial segmentation we develop features to encode the information captured in boundaries and junctions. After a supervised learning procedure, our algorithm is able to produce a 3D depth ordering map from a single 2D color image. Secondly, we proceed to 3D scene understanding using RGB-D images. The recent development of depth sensors improves the performance of traditional computer vision algorithms by a margin. Therefore, besides using one single image, we incorporate depth information along with it, and parse the scene based on 3D interpretation. We target applications such as 3D point interpolation, boundary detection and scene segmentation. In detail, we propose an algorithm for 3D surface segmentation, and show that combining this 3D surface information with the 2D color image achieves better performance for 3D interpolation. After that, we use both 2D color and 3D depth channels to find the occlusion and connected boundaries given an RGB-D scene. This serves as an extended 3D scene interpretation with a better understanding of occlusions between objects. Finally, we perform 3D volumetric reasoning on the RGB-D image with support and stability. Objects occupy physical space and obey physical laws. To truly understand a scene, we must reason about the space that the objects in it occupy, and how each object is stably supported by the others. In other words, we seek to understand which objects would, if moved, cause other objects to fall. This 3D volumetric reasoning is important for many scene understanding tasks, ranging from segmentation of objects to perception of rich, physically well-founded 3D interpretations of the scene. In this thesis, we propose a new algorithm to parse RGB-D images with 3D block units while jointly reasoning about the segments, volumes, supporting relationships and object stability. Our algorithm is based on the intuition that a good 3D representation of the scene is one that fits the depth data well, and is a stable, self-supporting arrangement of objects (i.e., one that does not topple). We design an energy function for representing the quality of the block representation based on these properties. Our algorithm fits 3D blocks to the depth values corresponding to image segments, and iteratively optimizes the energy function. Our proposed algorithm is the first to consider stability of objects in complex arrangements for reasoning about the underlying structure of the scene. Experimental results show that our stability-reasoning framework improves RGB-D segmentation and scene volumetric representation.
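The intuition stated in the abstract, that a good block representation fits the depth data well and does not topple, can be illustrated with a toy energy. This is not the thesis's energy function, which is not given in the excerpt. It is a one-dimensional sketch that assumes uniform density (so the center of mass is the geometric center) and uses hypothetical names and an arbitrary penalty weight.

```python
def is_supported(block_center_x, support_min_x, support_max_x):
    """A block is treated as stable if its center of mass lies over its support."""
    return support_min_x <= block_center_x <= support_max_x


def block_energy(depth_residuals, block_center_x, support_min_x, support_max_x,
                 instability_penalty=10.0):
    """Toy energy: mean squared depth misfit plus a penalty for unstable blocks.

    depth_residuals holds (observed depth - block depth) per pixel of the segment.
    instability_penalty is an assumed weight, not a value from the thesis.
    """
    if not depth_residuals:
        raise ValueError("need at least one depth residual")
    data_term = sum(r * r for r in depth_residuals) / len(depth_residuals)
    stable = is_supported(block_center_x, support_min_x, support_max_x)
    stability_term = 0.0 if stable else instability_penalty
    return data_term + stability_term


# The same depth fit scores worse when the block overhangs its support.
print(block_energy([0.1, -0.1], 0.5, 0.0, 1.0))
print(block_energy([0.1, -0.1], 1.5, 0.0, 1.0))
```

An iterative optimizer in this spirit would propose changes to the block parameters and keep those that lower the energy.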

Multimodal Scene Understanding

Author : Michael Yang
Publisher : Academic Press
ISBN 13 : 0128173599
Total Pages : 422 pages
Book Rating : 4.1/5 (281 download)

Released : 2019-07-16

Multimodal Scene Understanding: Algorithms, Applications and Deep Learning presents recent advances in multi-modal computing, with a focus on computer vision and photogrammetry. It provides the latest algorithms and applications that involve combining multiple sources of information and describes the role and approaches of multi-sensory data and multi-modal deep learning. The book is ideal for researchers from the fields of computer vision, remote sensing, robotics, and photogrammetry, thus helping foster interdisciplinary interaction and collaboration between these realms. Researchers collecting and analyzing multi-sensory data collections, for example the KITTI benchmark (stereo+laser), from different platforms, such as autonomous vehicles, surveillance cameras, UAVs, planes and satellites, will find this book to be very useful. Key features: contains state-of-the-art developments on multi-modal computing; focuses on algorithms and applications; presents novel deep learning topics on multi-sensor fusion and multi-modal deep learning.

Computer Vision -- ECCV 2014

Author : David Fleet
Publisher : Springer
ISBN 13 : 331910599X
Total Pages : 855 pages
Book Rating : 4.3/5 (191 download)

Released : 2014-08-14

The seven-volume set comprising LNCS volumes 8689-8695 constitutes the refereed proceedings of the 13th European Conference on Computer Vision, ECCV 2014, held in Zurich, Switzerland, in September 2014. The 363 revised papers presented were carefully reviewed and selected from 1444 submissions. The papers are organized in topical sections on tracking and activity recognition; recognition; learning and inference; structure from motion and feature matching; computational photography and low-level vision; vision; segmentation and saliency; context and 3D scenes; motion and 3D scene analysis; and poster sessions.

3D Scene and Object Parsing from a Single Image

Author : Chuhang Zou
Publisher :
ISBN 13 :
Total Pages : pages

Released : 2019

Reconstruction and Analysis of 3D Scenes

Author : Martin Weinmann
Publisher : Springer
ISBN 13 : 3319292463
Total Pages : 250 pages
Book Rating : 4.3/5 (192 download)

Released : 2016-03-17

This unique work presents a detailed review of the processing and analysis of 3D point clouds. A fully automated framework is introduced, incorporating each aspect of a typical end-to-end processing workflow, from raw 3D point cloud data to semantic objects in the scene. For each of these components, the book describes the theoretical background, and compares the performance of the proposed approaches to that of current state-of-the-art techniques. Topics and features: reviews techniques for the acquisition of 3D point cloud data and for point quality assessment; explains the fundamental concepts for extracting features from 2D imagery and 3D point cloud data; proposes an original approach to keypoint-based point cloud registration; discusses the enrichment of 3D point clouds by additional information acquired with a thermal camera, and describes a new method for thermal 3D mapping; presents a novel framework for 3D scene analysis.

3D Scene Reconstruction from a Single Image with Duplicate Objects Using Template Models

Author : 張傑程
Publisher :
ISBN 13 :
Total Pages : 62 pages

Released : 2017

3D Scene Modeling and Understanding from Image Sequences

Author : Hao Tang
Publisher :
ISBN 13 : 9781267925749
Total Pages : 188 pages
Book Rating : 4.9/5 (257 download)

Released : 2013

A new method for 3D modeling is proposed, which generates a content-based 3D mosaic (CB3M) representation for long video sequences of 3D, dynamic urban scenes captured by a camera on a mobile platform. In the first phase, a set of parallel-perspective (pushbroom) mosaics with varying viewing directions is generated to capture both the 3D and dynamic aspects of the scene under the camera coverage. In the second phase, a unified patch-based stereo matching algorithm is applied to extract parametric representations of the color, structure and motion of the dynamic and/or 3D objects in urban scenes, where many planar surfaces exist. Multiple pairs of stereo mosaics are used to facilitate reliable stereo matching, occlusion handling, accurate 3D reconstruction and robust moving target detection. The outcome of this phase is a CB3M representation, which is a highly compressed visual representation of a dynamic 3D scene and whose object contents carry both 3D and motion information. In the third phase, a multi-layer based scene understanding algorithm is proposed, resulting in a planar surface model for higher-level object representations. Experimental results are given for both simulated and several different real video sequences of large-scale 3D scenes to show the accuracy and effectiveness of the representation. We also show that the patch-based stereo matching algorithm and the CB3M representation can be generalized to 3D modeling with perspective views using either a single camera or a stereovision head on a ground mobile platform or a pedestrian. Applications of the proposed method include airborne or ground video surveillance, 3D urban scene modeling, traffic surveys, transportation planning and visual aids for perception and navigation for blind people.

3D Scene Reconstruction and Understanding from Single Shot Pictures

Author : Alfredo García González
Publisher :
ISBN 13 :
Total Pages : pages

Released : 2012

Augmented reality mixes computer-generated graphics with real imaging using computer vision techniques. However, augmented reality is still a very young field of research, and its applications usually involve predefined tags. This thesis uses computer vision and artificial intelligence techniques to explore the viability of using natural landmarks as key points for computer graphics reference. There are many techniques that infer 3D scenes from images, such as stereo vision, structure from motion, depth images, shape from shading, etc. The aim of this work is to find a way of doing this from a single-shot image. Finally, new virtual elements are integrated into the final scene using contextual colors. The methodology followed is to automatically segment an image into small planar surfaces using different granularities of small regions. Each region is assumed to lie on only one planar surface, and thus the 3D face that it came from can be inferred. The normal vectors of the planes corresponding to the 3D faces are approximated using a discrete set of orientations. In addition, some regions do not have a regular orientation and are therefore assumed to be textured or porous regions. Inferring the final 3D orientation and location from the set of labelled regions is a non-trivial task. This work proposes a method based on the coherent topology of the neighborhood. The 3D position of each point of a region is found and a 3D scenario can be obtained. After that, the regions of the original images are textured onto the 3D reconstructed faces. Finally, a color transfer approach is used to integrate new 3D objects into the final scene.

Single View 3D Reconstruction and Parsing Using Geometric Commonsense for Scene Understanding

Author : Chengcheng Yu
Publisher :
ISBN 13 :
Total Pages : 105 pages

Released : 2017

My thesis studies this topic from three perspectives: (1) 3D scene reconstruction to understand the 3D structure of a scene; (2) geometry and physics reasoning to understand the relationships of objects in a scene; (3) the interaction between human action and objects in a scene. Specifically, the 3D reconstruction builds a unified grammatical framework capable of reconstructing a variety of scene types (e.g., urban, campus, county, etc.) from a single input image. The key idea of our approach is to study a novel commonsense reasoning framework that mainly exploits two types of prior knowledge: (i) prior distributions over a single dimension of objects, e.g., that the length of a sedan is about 4.5 meters; (ii) pair-wise relationships between the dimensions of scene entities, e.g., that a sedan is shorter than a bus. This unary or relative geometric knowledge, once extracted, is fairly stable across different types of natural scenes, and is informative for enhancing the understanding of various scenes in both 2D images and the 3D world. Methodologically, we propose to construct a hierarchical graph representation as a unified representation of the input image and related geometric knowledge. We formulate these objectives with a unified probabilistic formula and develop a data-driven Monte Carlo method to infer the optimal solution with both bottom-up and top-down computations. Results with comparisons on public datasets showed that our method clearly outperforms the alternative methods. For geometry and physics reasoning, we present an approach for scene understanding by reasoning about the physical stability of objects from point clouds. We utilize a simple observation that, by human design, objects in static scenes should be stable with respect to gravity. This assumption is applicable to all scene categories and poses useful constraints for the plausible interpretations (parses) in scene understanding. Our method consists of two major steps: 1) geometric reasoning: recovering solid 3D volumetric primitives from a defective point cloud; and 2) physical reasoning: grouping the unstable primitives into physically stable objects by optimizing the stability and the scene prior. We propose to use a novel disconnectivity graph (DG) to represent the energy landscape and use a Swendsen-Wang Cut (MCMC) method for optimization. In experiments, we demonstrate that the algorithm achieves substantially better performance for i) object segmentation, ii) 3D volumetric recovery of the scene, and iii) scene parsing, in comparison to state-of-the-art methods on both a public dataset and our own new dataset. Detecting potential dangers in the environment is a fundamental ability of living beings. In order to endow a robot with such an ability, my thesis presents an algorithm for detecting potential falling objects, i.e., physically unsafe objects, given an input of 3D point clouds captured by range sensors. We formulate the falling risk as a probability or a potential that an object may fall given human action or certain natural disturbances, such as earthquakes and wind. Our approach differs from the traditional object detection paradigm: it first infers hidden and situated "causes" (disturbances) of the scene, and then introduces intuitive physical mechanics to predict possible "effects" (falls) as consequences of the causes. In particular, we infer a disturbance field by making use of motion capture data as a rich source of common human pose movement. We show that, by applying various disturbance fields, our model achieves a human-level recognition rate of potential falling objects on a dataset of challenging and realistic indoor scenes.
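The two kinds of geometric prior knowledge named in the excerpt, a unary prior over one dimension and a pair-wise relation between two entities, can be combined into a simple log-score. This sketch is only an illustration and is not the thesis's probabilistic formulation. The 4.5 m sedan length comes from the excerpt, while the Gaussian form, the standard deviation, the penalty value, and all function names are assumptions.

```python
import math


def unary_log_prior(length_m, mean_m=4.5, std_m=0.4):
    """Gaussian log-density for a sedan's length (std_m is an assumed spread)."""
    z = (length_m - mean_m) / std_m
    return -0.5 * z * z - math.log(std_m * math.sqrt(2.0 * math.pi))


def pairwise_log_prior(sedan_length_m, bus_length_m, violation_penalty=5.0):
    """Soft constraint: a sedan should be shorter than a bus."""
    return 0.0 if sedan_length_m < bus_length_m else -violation_penalty


def score_hypothesis(sedan_length_m, bus_length_m):
    """Higher is better: sum of the unary and pair-wise log-priors."""
    return (unary_log_prior(sedan_length_m)
            + pairwise_log_prior(sedan_length_m, bus_length_m))


# A reconstruction with a 4.5 m sedan next to a 12 m bus scores higher than
# one where the sedan is 6 m long or longer than the bus.
print(score_hypothesis(4.5, 12.0) > score_hypothesis(6.0, 12.0))
print(score_hypothesis(4.5, 12.0) > score_hypothesis(4.5, 4.0))
```

A sampling-based inference procedure could use such a score to accept or reject candidate scene reconstructions.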

3D Scene Reconstruction from a Single Image with Symmetry Information

Author : 王心平
Publisher :
ISBN 13 :
Total Pages : 51 pages

Released : 2019

2D+3D Indoor Scene Understanding from a Single Monocular Image

Author : Wei Zhuo
Publisher :
ISBN 13 :
Total Pages : 0 pages

Released : 2018

Scene understanding, as a broad field encompassing many subtopics, has gained great interest in recent years. Among these subtopics, indoor scene understanding, which has its own specific attributes and challenges compared to outdoor scene understanding, has drawn a lot of attention. It has potential applications in a wide variety of domains, such as robotic navigation, object grasping for personal robotics, augmented reality, etc. To our knowledge, existing research on indoor scenes typically makes use of depth sensors, such as Kinect, which, however, are not always available. In this thesis, we focused on addressing indoor scene understanding tasks in the general case, where only a monocular color image of the scene is available. Specifically, we first studied the problem of estimating a detailed depth map from a monocular image. Then, benefiting from deep-learning-based depth estimation, we tackled the higher-level tasks of 3D box proposal generation, and scene parsing with instance segmentation, semantic labeling and support relationship inference from a monocular image. Our research on indoor scene understanding provides a comprehensive scene interpretation from various perspectives and at various scales. For monocular image depth estimation, previous approaches are limited in that they only reason about depth locally on a single scale, and do not utilize the important information of geometric scene structures. Here, we developed a novel graphical model, which reasons about detailed depth while leveraging geometric scene structures at multiple scales. For 3D box proposals, to the best of our knowledge, our approach constitutes the first attempt to reason about class-independent 3D box proposals from a single monocular image. To this end, we developed a novel integrated, differentiable framework that estimates depth, extracts a volumetric scene representation and generates 3D proposals. At the core of this framework lies a novel residual, differentiable truncated signed distance function module, which is able to handle the relatively low accuracy of the predicted depth map. For scene parsing, we tackled its three subtasks of instance segmentation, semantic labeling, and support relationship inference on instances. Existing work typically reasons about these individual subtasks independently. Here, we leverage the fact that they bear strong connections, which can facilitate addressing these subtasks if modeled properly. To this end, we developed an integrated graphical model that reasons about the mutual relationships of the above subtasks. In summary, in this thesis, we introduced novel and effective methodologies for each of three indoor scene understanding tasks, i.e., depth estimation, 3D box proposal generation, and scene parsing, and exploited the dependencies of the latter two tasks on depth estimates. Evaluation on several benchmark datasets demonstrated the effectiveness of our algorithms and the benefits of utilizing depth estimates for higher-level tasks.
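For readers unfamiliar with the truncated signed distance function (TSDF) mentioned in the excerpt, the sketch below shows the plain, standard form along a single camera ray. It is not the thesis's residual, differentiable module, whose details are not given here. The function name, the sign convention (positive in front of the surface), and the 0.1 m truncation distance are assumptions.

```python
def truncated_signed_distance(surface_depth_m, sample_depth_m, truncation_m=0.1):
    """Standard TSDF value for one sample point along a camera ray.

    Positive in free space in front of the observed surface, negative behind
    it, clipped to the truncation band and normalized to [-1, 1].
    """
    if truncation_m <= 0:
        raise ValueError("truncation distance must be positive")
    signed_distance = surface_depth_m - sample_depth_m
    clipped = max(-truncation_m, min(truncation_m, signed_distance))
    return clipped / truncation_m


# Samples along a ray that hits a surface 2.0 m from the camera.
for depth in (1.0, 1.95, 2.0, 3.0):
    print(depth, truncated_signed_distance(2.0, depth))
```

Evaluating this for every voxel of a grid, using the predicted depth of the pixel each voxel projects to, gives a volumetric scene representation. Errors in the predicted depth shift the zero crossing, which is why robustness to depth inaccuracy matters.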

Probabilistic Models for 3D Urban Scene Understanding from Movable Platforms

Author : Andreas Geiger
Publisher : KIT Scientific Publishing
ISBN 13 : 3731500817
Total Pages : 196 pages
Book Rating : 4.7/5 (315 download)

Released : 2014-07-29

This work is a contribution to understanding multi-object traffic scenes from video sequences. All data is provided by a camera system which is mounted on top of the autonomous driving platform AnnieWAY. The proposed probabilistic generative model reasons jointly about the 3D scene layout as well as the 3D location and orientation of objects in the scene. In particular, the scene topology, geometry as well as traffic activities are inferred from short video sequences.

Computing 3D Scene from a Single Image by Bottom-up/top-down Bayesian Inference

Author : Han Feng
Publisher :
ISBN 13 :
Total Pages : 264 pages

Released : 2005

Two-dimensional Plus Three-dimensional Rich Data Approach to Scene Understanding

Author : Jianxiong Xiao
Publisher :
ISBN 13 :
Total Pages : 227 pages

Released : 2013

On your one-minute walk from the coffee machine to your desk each morning, you pass by dozens of scenes (a kitchen, an elevator, your office) and you effortlessly recognize them and perceive their 3D structure. But this one-minute scene-understanding problem has been an open challenge in computer vision since the field was first established 50 years ago. In this dissertation, we aim to rethink the path researchers took over these years, challenge the standard practices and implicit assumptions in current research, and redefine several basic principles in computational scene understanding. The key idea of this dissertation is that learning from rich data under natural settings is crucial for finding the right representation for scene understanding. First of all, to overcome the limitations of object-centric datasets, we built the Scene Understanding (SUN) Database, a large collection of real-world images that exhaustively spans all scene categories. This scene-centric dataset provides a more natural sample of the human visual world, and establishes a realistic benchmark for standard 2D recognition tasks. However, while an image is a 2D array, the world is 3D and our eyes see it from a viewpoint, but this is not traditionally modeled. To obtain a 3D understanding at a high level, we reintroduce geometric figures using modern machinery. To model scene viewpoint, we propose a panoramic place representation to go beyond aperture computer vision and use data that is close to the natural input of the human visual system. This paradigm shift toward rich representation also opens up new challenges that require a new kind of big data: data with extra descriptions, namely rich data. Specifically, we focus on a highly valuable kind of rich data, multiple viewpoints in 3D, and we build the SUN3D database to obtain an integrated place-centric representation of scenes. We argue for the great importance of modeling the computer's role as an agent in a 3D scene, and demonstrate the power of place-centric scene representation.