Resource type
Thesis type
(Thesis) Ph.D.
Date created
2011-09-22
Authors/Contributors
Author: Xiu, Xiaoyu
Abstract
Emerging video applications are being developed where multiple views of a scene are captured. Two central issues in the deployment of future multiview video (MVV) systems are compression efficiency and the interactive video experience, which makes it necessary to develop advanced technologies for multiview video coding (MVC) and interactive multiview video streaming (IMVS). The former aims at efficient compression of all MVV data in a rate-distortion (RD) optimal manner by exploiting both temporal and inter-view redundancy, while the latter offers a viewer the ability to freely interact with MVV data, such that she can periodically request her desired viewpoint as the video is played back. Based on the observation that MVC and IMVS are fundamentally different MVV problems, this thesis focuses on developing different algorithms for practical MVC and IMVS designs.
The first part of the thesis focuses on our research on MVC. We first develop projective rectification-based view interpolation and extrapolation methods and apply them to MVC. Experimental results show that these schemes achieve better RD performance than the current joint multiview video coding (JMVC) standard as well as view interpolation and extrapolation-based MVC schemes that do not use rectification. To explain the experimental results, we also develop mathematical models for the rectification-based view interpolation and extrapolation, from which we derive an improved theoretical model to compare the performance of various MVC schemes. The simulation results agree well with the experimental results.
In the second part of the thesis, we propose three major technological improvements to existing IMVS work to enhance its interactivity and to implement it under realistic network conditions.
First, in addition to camera-captured views, we make additional virtual views between each pair of captured views available for viewers’ selection, by transmitting both texture and depth maps of neighboring captured views and synthesizing intermediate views at the decoder using depth-image-based rendering (DIBR). Second, we construct a Markovian view-switching model that more accurately captures viewers’ behaviors. Third, we optimize frame structures and schedule the transmission of frames in a network-delay-cognizant manner, so that viewers can enjoy zero-delay view-switching even over a transmission network with non-negligible delay.
Document
Identifier
etd6873
Copyright statement
Copyright is held by the author.
Scholarly level
Supervisor or Senior Supervisor
Thesis advisor: Liang, Jie
Member of collection
Download file | Size |
---|---|
etd6873_XXiu.pdf | 11.12 MB |