
Constructions of high-performance face recognition pipeline and embedded deep learning framework

Resource type
Thesis type
(Thesis) M.A.Sc.
Date created
2018-06-28
Authors/Contributors
Author (aut): Ng, Him Wai
Abstract
Face recognition is widely studied in both research and commercial settings. Because human faces are unique, a robust face recognition system can serve as an alternative to biometrics such as fingerprint or iris recognition in security systems. Recent developments in deep learning have contributed to much of the success in solving difficult computer vision tasks, including face recognition. This thesis presents a thorough study that walks through the construction of a robust face recognition pipeline and evaluates the components in each stage of the pipeline. The pipeline consists of four components: a face detection module, a face alignment module, a metric-space face feature extraction module, and a feature identification module. Different implementations of each module are presented and compared. The performance of each implementation of the system is evaluated on multiple datasets. The combination of coarse-to-fine convolutional neural network (CNN) based face detection, geometry-based face alignment, and discriminative feature learning with the additive angular margin method is found to achieve the highest accuracy on all datasets. One drawback of this face recognition pipeline is that it consumes substantial computational resources, making it hard to deploy on embedded hardware. It would be beneficial to develop a method that allows advanced deep learning algorithms to run on resource-limited hardware, so that many existing devices can become intelligent at low cost. To this end, the thesis develops a novel lapped CNN (LCNN) architecture suited to resource-limited embedded systems. The LCNN uses a divide-and-conquer approach to apply convolution to a high-resolution image on embedded hardware. The LCNN first applies convolution to sub-patches of the image, then merges the resulting outputs to form the actual convolution. The resulting output is identical to that of applying a larger-scale convolution to the entire high-resolution image, except that the convolution operations on the sub-patches can be processed sequentially or in parallel by resource-limited hardware.
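The divide-and-conquer idea in the abstract can be illustrated with a minimal sketch. It is not the thesis's LCNN implementation: it assumes a single convolution layer with stride 1 and no padding, and it assumes that the sub-patches overlap by the kernel size minus one so that the stitched result matches the full convolution. The function names `conv2d_valid` and `lapped_conv2d` and the `tile` parameter are hypothetical.

```python
def conv2d_valid(image, kernel):
    """Plain 'valid' 2-D cross-correlation (the CNN convention) on nested lists."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def lapped_conv2d(image, kernel, tile=4):
    """Convolve overlapping sub-patches one at a time and stitch the outputs.

    Assumption (not from the thesis): each output tile of size th x tw is
    computed from an input patch extended by kh-1 rows and kw-1 columns.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = [[0] * out_w for _ in range(out_h)]
    for r0 in range(0, out_h, tile):
        for c0 in range(0, out_w, tile):
            th = min(tile, out_h - r0)  # tile height (smaller at the border)
            tw = min(tile, out_w - c0)  # tile width (smaller at the border)
            # Only this small patch needs to be in memory at once.
            patch = [row[c0:c0 + tw + kw - 1]
                     for row in image[r0:r0 + th + kh - 1]]
            result = conv2d_valid(patch, kernel)
            # Merge: copy the patch result into its place in the full output.
            for i in range(th):
                out[r0 + i][c0:c0 + tw] = result[i]
    return out


# Small integer example so the comparison is exact.
image = [[(3 * r + 5 * c) % 11 for c in range(9)] for r in range(10)]
kernel = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]
same = lapped_conv2d(image, kernel, tile=4) == conv2d_valid(image, kernel)
```

In this sketch each patch is processed sequentially, but the patches are independent, so they could equally be dispatched in parallel. The overlap between neighbouring patches is the price paid for keeping the peak memory per convolution small.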
Identifier
etd10772
Copyright statement
Copyright is held by the author.
Permissions
This thesis may be printed or downloaded for non-commercial research and scholarly purposes.
Scholarly level
Supervisor or Senior Supervisor
Thesis advisor (ths): Liang, Jie
Member of collection
Model
English
