Multi-scale Methods for Omnidirectional Stereo with Application to Real-time Virtual Walkthroughs

Title: Multi-scale Methods for Omnidirectional Stereo with Application to Real-time Virtual Walkthroughs
Authors: Brunton, Alan P
Date: 2012
Abstract: This thesis addresses a number of problems in computer vision, image processing, and geometry processing, and presents novel solutions to these problems. The overarching theme of the techniques presented here is a multi-scale approach, leveraging mathematical tools to represent images and surfaces at different scales, and methods that can be adapted from one type of domain (eg., the plane) to another (eg., the sphere). The main problem addressed in this thesis is known as stereo reconstruction: reconstructing the geometry of a scene or object from two or more images of that scene. We develop novel algorithms to do this, which work for both planar and spherical images. By developing a novel way to formulate the notion of disparity for spherical images, we are able effectively adapt our algorithms from planar to spherical images. Our stereo reconstruction algorithm is based on a novel application of distance transforms to multi-scale matching. We use matching information aggregated over multiple scales, and enforce consistency between these scales using distance transforms. We then show how multiple spherical disparity maps can be efficiently and robustly fused using visibility and other geometric constraints. We then show how the reconstructed point clouds can be used to synthesize a realistic sequence of novel views, images from points of view not captured in the input images, in real-time. Along the way to this result, we address some related problems. For example, multi-scale features can be detected in spherical images by convolving those images with a filterbank, generating an overcomplete spherical wavelet representation of the image from which the multiscale features can be extracted. Convolution of spherical images is much more efficient in the spherical harmonic domain than in the spatial domain. Thus, we develop a GPU implementation for fast spherical harmonic transforms and frequency domain convolutions of spherical images. This tool can also be used to detect multi-scale features on geometric surfaces. When we have a point cloud of a surface of a particular class of object, whether generated by stereo reconstruction or by some other modality, we can use statistics and machine learning to more robustly estimate the surface. If we have at our disposal a database of surfaces of a particular type of object, such as the human face, we can compute statistics over this database to constrain the possible shape a new surface of this type can take. We show how a statistical spherical wavelet shape prior can be used to efficiently and robustly reconstruct a face shape from noisy point cloud data, including stereo data.
CollectionThèses, 2011 - // Theses, 2011 -
Brunton_Alan_2012_thesis.pdfthesis document37.45 MBAdobe PDFOpen
Brunton_Alan_2012_thesis_data1.avivideo accompanying Chapter 510.68 MBmpeg4 video created with MEncoderOpen
Brunton_Alan_2012_thesis_data2.avivideo accompanying Chapter 616.3 MBmpeg4 video created with MEncoderOpen