Visually-guided beamforming for a circular microphone array

Author: 
Date created: 
2020-04-06
Identifier: 
etd20773
Keywords: 
Fisheye image
Face detection
Deep learning
Beamforming
Audio-visual speech processing
Visually-guided beamforming
Abstract: 

Beamforming is a technique that can adaptively steer the pattern of a microphone array towards or away from a target direction. Three conventional beamforming techniques are reviewed and compared with a beamformer proposed here, called MVDR-2C. Most acoustic beamformers selectively locate a single desired sound source, such as a speaker, and their performance drops significantly when two or more speakers are active. To deploy beamforming in a room, a circular microphone array is supplemented by a 360° camera comprising two fisheye lenses. The camera enables face detection, which provides the speaker directions to the beamformer. To develop face/object detectors that operate directly on fisheye images, three annotated fisheye datasets are generated and used to re-train an existing face detector. Finally, several beamformers are evaluated and compared, demonstrating a clear performance advantage for the proposed MVDR-2C beamformer.

Document type: 
Thesis
Rights: 
This thesis may be printed or downloaded for non-commercial research and scholarly purposes. Copyright remains with the author.
Supervisor(s): 
Ivan Bajic
Department: 
Applied Sciences: School of Engineering Science
Thesis type: 
(Thesis) M.A.Sc.