Designs of application-specific multiview / 3D video coding

Resource type
Thesis type
(Thesis) Ph.D.
Date created
Author: Gao, Yu
Many applications of multiview or three-dimensional (3D) video have been developed, and they pose significant challenges for video coding. We study multiview video coding (MVC), perceptual multiview video coding, 3D geometry compression, interactive multiview video streaming (IMVS), and free viewpoint video (FVV). The applications studied in this thesis fall into two categories.
In the first category we focus on rate-distortion (RD) performance, where the distortion can be measured by mean squared error (MSE), human-visual-system-based MSE, or metro distance. First, we consider the application of FVV and propose a novel inpainting-assisted approach to efficiently compress multiview videos. The decoder can independently recover missing data via inpainting, resulting in a lower rate. Second, we study the application of just noticeable distortion (JND)-based MVC and propose to exploit the inter-view or temporal redundancy of JND maps to synthesize or predict target JND maps, which are then used to adjust prediction residuals. Third, we study 3D geometry compression and propose a new 3D geometry representation. We project the 3D geometry onto a collection of surrounding tiles and subsequently encode these tile images using a modified MVC. The crux of the scheme is the optimal placement of the image tiles.
In the second category, we study applications where real-time computation and the associated complexity must be considered in addition to RD performance. These applications include IMVS and FVV. We first consider view switching in IMVS, an application where a network client requests a single view at a time from the server but can periodically switch to other views while the video plays back uninterrupted. We propose the optimal frame structure such that frequently chosen view switches are pre-computed while infrequent ones are computed in real time upon request. We then examine the decoder-side computational complexity of view synthesis in FVV and propose to optimally trade off transmission rate for decoder-side complexity. For regular view synthesis, we find the optimal subset of intermediate views to code. For a novel inpainting-assisted paradigm, we argue that some expensive operations can be avoided by directly sending intra-coded blocks.
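The split between pre-computed and on-demand view switches described for IMVS can be illustrated with a minimal sketch. This is not the optimization developed in the thesis: it assumes a hypothetical storage budget, hypothetical per-switch probabilities and storage costs, and a simple greedy ranking standing in for the optimal frame structure. The function name and all values are made up for illustration.

```python
def choose_precomputed_switches(switch_probs, storage_costs, budget):
    """Greedy illustration only (not the thesis's optimal frame structure).

    switch_probs:  dict mapping (from_view, to_view) -> probability of that switch
    storage_costs: dict mapping (from_view, to_view) -> storage needed to pre-compute it
    budget:        total storage available for pre-computed switches (hypothetical)
    """
    # Rank switches by probability per unit of storage, most valuable first.
    ranked = sorted(
        switch_probs,
        key=lambda s: switch_probs[s] / storage_costs[s],
        reverse=True,
    )
    precomputed, used = [], 0.0
    for switch in ranked:
        # Pre-compute a switch only if it still fits in the budget.
        if used + storage_costs[switch] <= budget:
            precomputed.append(switch)
            used += storage_costs[switch]
    # Everything else is left to be computed in real time upon request.
    on_demand = [s for s in ranked if s not in precomputed]
    return precomputed, on_demand


# Hypothetical example: three views, four possible switches, equal storage costs.
switch_probs = {(0, 1): 0.5, (1, 0): 0.3, (0, 2): 0.15, (2, 0): 0.05}
storage_costs = {s: 1.0 for s in switch_probs}
precomputed, on_demand = choose_precomputed_switches(switch_probs, storage_costs, budget=2.0)
print("pre-computed:", precomputed)
print("on demand:", on_demand)
```

With equal costs, the sketch simply keeps the most frequently chosen switches within the budget and defers the rest to real-time computation, which mirrors the qualitative behaviour stated in the abstract.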
Copyright statement
Copyright is held by the author.
The author granted permission for the file to be printed and for the text to be copied and pasted.
Scholarly level
Supervisor or Senior Supervisor
Thesis advisor: Liang, Jie
Member of collection
Attachment: etd8688_YGao.pdf (1.47 MB)