
Real-time feature extraction of the voice for musical control

Resource type
Thesis type
(Thesis) M.Sc.
Date created
Voiced vowel production in human speech depends on both the oscillation of the vocal folds and the shape of the vocal tract, the latter giving rise to formants in the spectrum of the speech signal. Many speech synthesis models use feed-forward source-filter models, in which the magnitude frequency response of the vocal tract is approximated by the spectral envelope of the speech signal. This thesis introduces a new analysis-by-synthesis method for identifying the vocal tract area function: the user’s formants are extracted and then matched to a piecewise cylindrical waveguide model shape that produces a similar spectrum. When a match is found, the corresponding model shape is presented to the user. To improve feedback, formant movement is tracked over time to account for unintended actions such as dropped formants or the wavering of an untrained voice.
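The matching step described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation. It assumes a lossless tube with equal-length sections, an ideal open end at the lips (zero pressure), a speed of sound of 350 m/s, and a brute-force search over a small hand-written set of candidate shapes. The names `tube_formants`, `best_match`, and the candidate shapes are hypothetical and chosen only for illustration.

```python
import numpy as np

SPEED_OF_SOUND = 350.0  # m/s, assumed value


def tube_formants(areas, section_len, n_formants=3, f_max=5000.0, df=1.0):
    """Resonances (Hz) of a lossless piecewise cylindrical tube.

    areas: cross-sections ordered glottis -> lips (any consistent unit).
    section_len: length of each section in metres.
    """
    freqs = np.arange(df, f_max, df)
    kl = 2.0 * np.pi * freqs / SPEED_OF_SOUND * section_len
    cos_kl, sin_kl = np.cos(kl), np.sin(kl)

    # One 2x2 chain (ABCD) matrix per frequency, with rho*c normalised to 1.
    chain = np.broadcast_to(np.eye(2, dtype=complex), (freqs.size, 2, 2)).copy()
    for area in areas:
        section = np.empty((freqs.size, 2, 2), dtype=complex)
        section[:, 0, 0] = cos_kl
        section[:, 0, 1] = 1j * sin_kl / area
        section[:, 1, 0] = 1j * area * sin_kl
        section[:, 1, 1] = cos_kl
        chain = chain @ section

    # With zero pressure at the lips, U_glottis = D * U_lips, so the
    # volume-velocity transfer function 1/D peaks where D crosses zero.
    d = chain[:, 1, 1].real
    idx = np.flatnonzero(np.signbit(d[:-1]) != np.signbit(d[1:]))
    # Linear interpolation of each zero crossing between grid points.
    crossings = freqs[idx] + df * d[idx] / (d[idx] - d[idx + 1])
    return crossings[:n_formants]


def best_match(target_formants, candidates, section_len):
    """Name of the candidate shape whose formants are closest to the target."""
    target = np.asarray(target_formants, dtype=float)

    def cost(areas):
        model = tube_formants(areas, section_len, n_formants=target.size)
        if model.size < target.size:  # too few resonances below f_max
            return np.inf
        return float(np.sum((model - target) ** 2))

    return min(candidates, key=lambda name: cost(candidates[name]))


# Hypothetical candidate shapes: 8 sections spanning a 17.5 cm tract.
section_len = 0.175 / 8
shapes = {
    "uniform": [1.0] * 8,
    "narrow_back": [0.3] * 4 + [3.0] * 4,
    "narrow_front": [3.0] * 4 + [0.3] * 4,
}
print(best_match([500.0, 1500.0, 2500.0], shapes, section_len))
```

In this sketch the measured formants are simply given as numbers. In a real-time setting they would come from a formant estimator running on the voice signal, and the cost could be smoothed over successive frames to tolerate dropped formants.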
Copyright statement
Copyright is held by the author.
The author granted permission for the file to be printed and for the text to be copied and pasted.
Scholarly level
Supervisor or Senior Supervisor
Thesis advisor: Smyth, Tamara
Member of collection
Download file Size
etd6357_AKestian.pdf 2.58 MB
