Posts Tagged ‘linkedin’

Synthesizing speech and restructuring code

Monday, May 4th, 2009

I am currently working at the Media and Graphics Interdisciplinary Centre (MAGIC) at UBC. We are developing software that uses hand gestures to dynamically control the synthesis of speech and singing. My role is to make the speech synthesis system sound intelligible and to restructure the code to make it maintainable. You can learn more about the project at the Visual Voice website.

Academic Collaborators:

Choir effect

Friday, January 30th, 2009


While working for TC-Helicon, I developed an audio effect to convert singing voices into the sound of a choir. For example, starting from four harmony voices (mp3 wav), the effect multiplies the apparent number of voices, resulting in a choir effect (mp3 wav).

This effect is included in one of TC-Helicon’s latest products, the VoiceLive 2. You can find another sample of the choir effect in action in the sidebar of the VoiceLive 2 web page.

APLP for phrases

Monday, June 2nd, 2008

I have now implemented Adaptive Pre-emphasis Linear Prediction (APLP) for phrases. The goal of this implementation of APLP is to transform high-effort voices into breathy voices.

Original voice:
I started with this sound sample: wav. With this kind of voice, it is difficult to add breathiness in a way that sounds natural.

Breath effect with constant pre-emphasis linear prediction (LP):
LP analysis with a constant pre-emphasis filter was applied, noise was added to the LP residual, and the voice was resynthesized: wav. This is the technique that current voice processors use to apply a breath effect.
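
To make the three steps above concrete (LP analysis after a fixed pre-emphasis, noise added to the residual, resynthesis), here is a minimal pure-Python sketch. It is not the code behind these samples: the function names, the pre-emphasis coefficient `mu`, the LP order, and `noise_gain` are all assumptions chosen for illustration, and it handles a single short frame rather than a streaming signal.

```python
import math
import random

def pre_emphasis(x, mu):
    """Fixed first-order pre-emphasis: y[n] = x[n] - mu * x[n-1]."""
    return [x[n] - mu * (x[n - 1] if n > 0 else 0.0) for n in range(len(x))]

def de_emphasis(y, mu):
    """Inverse of pre_emphasis: x[n] = y[n] + mu * x[n-1]."""
    out, prev = [], 0.0
    for v in y:
        prev = v + mu * prev
        out.append(prev)
    return out

def lpc(frame, order):
    """Coefficients of A(z) = 1 + a[1] z^-1 + ... (autocorrelation method,
    Levinson-Durbin recursion)."""
    n = len(frame)
    r = [sum(frame[i] * frame[i - k] for i in range(k, n)) for k in range(order + 1)]
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:  # silent or degenerate frame: stop early
            break
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a

def inverse_filter(x, a):
    """LP residual: e[n] = sum_k a[k] * x[n-k], with a[0] = 1."""
    return [sum(a[k] * x[n - k] for k in range(len(a)) if n - k >= 0)
            for n in range(len(x))]

def synthesis_filter(e, a):
    """All-pole filter 1/A(z): y[n] = e[n] - sum_{k>=1} a[k] * y[n-k]."""
    y = []
    for n in range(len(e)):
        acc = e[n]
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * y[n - k]
        y.append(acc)
    return y

def breath_effect(x, mu=0.95, order=10, noise_gain=0.5, seed=0):
    """Add noise to the LP residual of one frame and resynthesize.
    mu, order and noise_gain are illustrative values, not the author's."""
    rng = random.Random(seed)
    emphasized = pre_emphasis(x, mu)
    a = lpc(emphasized, order)
    residual = inverse_filter(emphasized, a)
    rms = math.sqrt(sum(e * e for e in residual) / len(residual))
    # Noise is scaled relative to the residual level before being mixed in.
    noisy = [e + noise_gain * rms * rng.gauss(0.0, 1.0) for e in residual]
    return de_emphasis(synthesis_filter(noisy, a), mu)
```

With `noise_gain=0.0` the analysis and synthesis filters cancel and the input frame comes back unchanged (up to rounding), which is a quick sanity check that only the added noise alters the voice.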

Breath effect with APLP:
APLP can be used to transform the spectral envelope of the voice to match that of a typical breathy voice. This reduces the perceived vocal effort and improves the blending of noise into the voice: wav.
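
The post does not say how APLP chooses its pre-emphasis or its target envelope, so the following is only a sketch of one simple way to make pre-emphasis adaptive. It estimates a per-frame first-order coefficient from the autocorrelation (a rough measure of spectral tilt), removes that tilt, and imposes a fixed target tilt instead. The names `first_order_coefficient`, `retilt`, and `target_mu` are hypothetical, and `target_mu` is a stand-in for whatever tilt a typical breathy voice has.

```python
def first_order_coefficient(frame):
    """Optimal first-order predictor r[1] / r[0]; a rough tilt estimate."""
    r0 = sum(v * v for v in frame)
    if r0 == 0.0:
        return 0.0  # silent frame: nothing to adapt to
    r1 = sum(frame[n] * frame[n - 1] for n in range(1, len(frame)))
    return r1 / r0

def retilt(frame, target_mu):
    """Apply H(z) = (1 - mu z^-1) / (1 - target_mu z^-1), where mu is
    re-estimated from this frame. Requires |target_mu| < 1 for stability."""
    mu = first_order_coefficient(frame)
    out, prev_in, prev_out = [], 0.0, 0.0
    for v in frame:
        flat = v - mu * prev_in                 # adaptive pre-emphasis
        prev_in = v
        prev_out = flat + target_mu * prev_out  # impose the target tilt
        out.append(prev_out)
    return out
```

Because the coefficient is recomputed for every frame, a high-effort frame and a relaxed frame both end up with roughly the same first-order tilt, unlike a constant pre-emphasis filter that treats them identically.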

Breath effect with APLP and breath modulation:
The amount of breathiness in a voice varies with vocal effort, so it makes sense to add less breathiness during high-effort passages. The APLP algorithm was further improved by modulating the amount of added noise according to the amount of vocal effort: wav.
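
The modulation idea can be sketched as a mapping from a per-frame effort estimate to a noise gain. How vocal effort is measured is not described in the post, so this sketch simply assumes effort values already normalized to the range 0 to 1. The function name, the gain limits, and the smoothing factor are illustrative assumptions.

```python
def breath_gains(efforts, max_gain=0.5, min_gain=0.05, smooth=0.8):
    """Map per-frame effort estimates in [0, 1] to noise gains.
    Higher effort gives less added noise; a one-pole smoother keeps the
    gain from jumping abruptly between frames."""
    gains, state = [], None
    for effort in efforts:
        effort = min(1.0, max(0.0, effort))  # clamp to the assumed range
        target = max_gain - (max_gain - min_gain) * effort
        if state is None:
            state = target                   # start at the first target
        else:
            state = smooth * state + (1.0 - smooth) * target
        gains.append(state)
    return gains
```

For example, `breath_gains([0.0, 0.0, 1.0, 1.0, 1.0])` starts at the maximum gain and then decays toward the minimum gain once the effort rises.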

Now listen to the original voice again.

With the breath effect, some voices work better than others. Here is how the breath effect sounds on another voice:

Original voice: wav.
Breath effect with constant pre-emphasis LP: wav.
Breath effect with APLP: wav.
Breath effect with APLP and breath modulation: wav.

Technical Note:
Previous iterations of the algorithm used glottal closure detection to improve the blending of noise into the voice. Glottal closure detection was removed from this iteration so that the sound samples are representative of what can be achieved on real-time voice signals in a musical context.