Voiced vowel production in human speech depends on both the oscillation of the vocal folds and the shape of the vocal tract, the latter giving rise to formants in the spectrum of the speech signal. Many speech synthesis systems use feed-forward source-filter models, in which the magnitude frequency response of the vocal tract is approximated by the spectral envelope of the speech signal. This thesis introduces a new analysis-by-synthesis method for identifying the vocal tract area function: the user’s formants are extracted and then matched to a piecewise cylindrical waveguide model shape that produces a similar spectrum. When a match is found, the corresponding model shape is provided to the user. To improve this feedback, formant movement is tracked over time to account for unintended actions such as dropped formants or the wavering of an untrained voice.
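The matching step described above can be illustrated with a deliberately simplified sketch. It is not the thesis's implementation. As a stand-in for the piecewise cylindrical waveguide model, each candidate shape here is a uniform tube closed at the glottis and open at the lips, whose resonances follow the quarter-wavelength formula F_n = (2n - 1)c / (4L). The function names, the candidate lengths, the measured formant values, the speed of sound, and the squared-error distance are all assumptions chosen for illustration.

```python
# Toy analysis-by-synthesis loop: synthesize formants for each candidate
# shape, compare them with the measured formants, and keep the closest one.
# Assumption: candidates are uniform tubes, a stand-in for the piecewise
# cylindrical waveguide model named in the abstract.

SPEED_OF_SOUND = 350.0  # m/s, assumed value for warm, humid air


def uniform_tube_formants(length_m, count=3):
    """Resonances (Hz) of a uniform tube closed at one end, open at the other."""
    return [(2 * n - 1) * SPEED_OF_SOUND / (4.0 * length_m)
            for n in range(1, count + 1)]


def formant_distance(measured, candidate):
    """Sum of squared differences in Hz (an assumed, simple error measure)."""
    return sum((m - c) ** 2 for m, c in zip(measured, candidate))


def best_match(measured, candidate_lengths):
    """Return the candidate tube length whose formants best fit the measurement."""
    return min(
        candidate_lengths,
        key=lambda length: formant_distance(
            measured, uniform_tube_formants(length, len(measured))
        ),
    )


# Hypothetical measured formants (Hz) and candidate tube lengths (m).
measured_formants = [510.0, 1480.0, 2530.0]
candidate_lengths = [0.14, 0.155, 0.175, 0.19]

matched_length = best_match(measured_formants, candidate_lengths)
print(matched_length)
```

In a fuller version, the synthesis step would compute the spectrum of a multi-section area function rather than a single tube length, and the comparison could be made over successive frames so that formant movement is tracked over time.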
Copyright is held by the author.
The author granted permission for the file to be printed and for the text to be copied and pasted.
Thesis advisor: Smyth, Tamara