Voice transformation research
Friday, June 6th, 2008
MAGIC: Media and Graphics Interdisciplinary Centre, UBC
MISTIC: Music Intelligence and Sound Technology, UVic
karl(insert at sign)karlnordstrom.ca
Project
My PhD research started in collaboration with IVL Technologies and TC-Helicon, two voice processing companies in Victoria. TC-Helicon produces vocal effects products for the music industry, with a focus on pitch correction and automatic harmony creation. IVL produces handheld karaoke products that plug into a TV and provide a variety of vocal effects. In recent years, vocal effects have become more common, including chorus-based effects and even distortion.
The goal of the research was to enhance a digital effect that adds noise to a voice to simulate breathiness. If a voice already sounds breathy, it is easy to add noise to increase the perception of breathiness. However, it is difficult to add noise to voices that exhibit high effort (i.e., voices that sound strained): the added noise does not blend into the voice and instead sounds like a separate stream of noise. In my research, I developed a technique to adaptively manipulate the perception of vocal effort. This enabled the added noise to blend more effectively, thereby improving the breath effect.
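As a rough illustration of the baseline effect described above (plain noise added to a voice signal), here is a minimal Python sketch. It is not the effect used in the research: the function name `add_noise`, the use of white Gaussian noise, and the `snr_db` parameter are all assumptions made for illustration. It shows only the naive, non-adaptive mixing step, which is the approach that fails to blend on high-effort voices.

```python
import math
import random


def add_noise(voice, snr_db, seed=0):
    """Mix white Gaussian noise into `voice` at a target signal-to-noise ratio.

    Hypothetical helper: the noise type and the SNR control are assumptions,
    not details taken from the research described on this page.
    """
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in voice]

    # Mean power of the voice and of the raw noise.
    voice_power = sum(v * v for v in voice) / len(voice)
    noise_power = sum(n * n for n in noise) / len(noise)

    # Scale the noise so that voice_power / scaled_noise_power matches snr_db.
    target_noise_power = voice_power / (10.0 ** (snr_db / 10.0))
    gain = math.sqrt(target_noise_power / noise_power)

    return [v + gain * n for v, n in zip(voice, noise)]
```

Lowering `snr_db` adds more noise. Nothing in this sketch adapts to the voice itself, which is the limitation the research set out to address.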
An important outcome of this research is to highlight that Linear Prediction (LP), a common voice modeling technique, does not appropriately model the voice. I presented Adaptive Pre-emphasis Linear Prediction (APLP) as a technique that compensates for variations in vocal effort.
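For readers unfamiliar with LP, the following Python sketch shows conventional LP analysis (autocorrelation method with the Levinson-Durbin recursion) preceded by a first-order pre-emphasis filter. It is not APLP: this page does not describe how APLP adapts its pre-emphasis, so `alpha` is simply a caller-supplied constant, and all function names are illustrative assumptions.

```python
def pre_emphasize(signal, alpha):
    """Apply the first-order filter y[n] = x[n] - alpha * x[n-1]."""
    out = [signal[0]]
    for n in range(1, len(signal)):
        out.append(signal[n] - alpha * signal[n - 1])
    return out


def autocorrelation(signal, max_lag):
    """Return the unnormalized autocorrelation r[0..max_lag]."""
    n = len(signal)
    return [
        sum(signal[i] * signal[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def levinson_durbin(r, order):
    """Solve the LP normal equations.

    Returns (a, error), where a[0] == 1 and the prediction-error filter is
    A(z) = 1 + a[1] z^-1 + ... + a[order] z^-order.
    """
    a = [1.0] + [0.0] * order
    error = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / error  # reflection coefficient for this stage
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        error *= 1.0 - k * k
    return a, error


def lp_with_pre_emphasis(frame, order, alpha):
    """LP analysis of one frame after fixed (non-adaptive) pre-emphasis."""
    emphasized = pre_emphasize(frame, alpha)
    r = autocorrelation(emphasized, order)
    return levinson_durbin(r, order)
```

For a decaying exponential frame such as `[0.9 ** n for n in range(200)]`, a first-order analysis with `alpha=0.0` gives a coefficient close to -0.9, while `alpha=0.9` flattens the frame first and leaves a coefficient close to 0. This shows how strongly the choice of pre-emphasis shapes what LP ends up modeling.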
Academic and Industry Collaborators
- Peter Driessen, MISTIC, Electrical Engineering, UVic
- George Tzanetakis, MISTIC, Computer Science, UVic
- Glen Rutledge, 3dB Research (previously at TC-Helicon and IVL Audio)
- Kevin Alexander, TC-Helicon
- Brian Gibson, IVL Audio
- John Esling, Linguistics
- Mathieu Lagrange, MISTIC, Computer Science
