Fan, Jianyu

Resource type

Thesis

Thesis type

(Thesis) Ph.D.

Date created

2020-07-02

Authors/Contributors

Author: Fan, Jianyu

Abstract

A soundscape is an acoustic environment perceived in context by human beings. A soundscape recording is a recording of the sound present at a given location at a given time, obtained with one or more fixed or moving microphones. Soundscape recordings play essential roles in the experience of video games, virtual reality, and film. Artificial soundscapes created by professional sound designers can evoke a specific emotion in target audiences to better immerse them in multimedia content. Research in soundscape emotion recognition (SER) investigates computational systems that recognize the perceived emotion of soundscape recordings. Similarly, research in music emotion recognition builds computational systems that recognize the perceived emotion of music recordings. We concentrate on using novel artificial intelligence algorithms to analyze soundscape recordings and music recordings from the perspective of affective computing. The contributions of this thesis are as follows. First, we conduct empirical studies to demonstrate that listeners agree with each other regarding the perceived emotion of soundscapes and music, and that it is possible to build a human-competitive model to predict the perceived emotion. Second, we curate and collect a soundscape dataset and multiple music datasets annotated with perceived emotion using crowdsourcing techniques. Third, we experiment with SER algorithms based on deep learning techniques. An evaluation of our SER models demonstrates that they outperform each individual listener as well as state-of-the-art models. Fourth, we investigate quantifiable trends in the effect of mixing on the perceived emotion of soundscape recordings. Fifth, we build a music emotion recognition model for experimental music to investigate the ranking-based emotion recognition task. Finally, we utilize models built for SER and sound event detection to analyze and compare Chinese and Western classical music. Certain similarities between Chinese classical music and soundscape recordings permit transferability between deep learning models. These contributions present methods for automating the soundscape and music emotion recognition tasks.

Description

In reference to IEEE copyrighted material which is used with permission in this thesis, the IEEE does not endorse any of Simon Fraser University's products or services. Internal or personal use of this material is permitted. If interested in reprinting/republishing IEEE copyrighted material for advertising or promotional purposes or for creating new collective works for resale or redistribution, please go to http://www.ieee.org/publications_standards/publications/rights/rights_link.html to learn how to obtain a License from RightsLink.

Keywords

Affective Computing, Soundscape Recording, Music, Sound Design, Perceived Emotion, Machine Learning

Identifier

etd20932

Copyright statement

Copyright is held by the author(s).

Permissions

This thesis may be printed or downloaded for non-commercial research and scholarly purposes.

Supervisor or Senior Supervisor

Thesis advisor: Pasquier, Philippe

Language

English

Member of collection

Interactive Arts and Technology Theses

Download file	Size
etd20932.pdf	16.74 MB

Advances in soundscape and music emotion recognition
