Singing Voice Synthesis

Project Goal

To create a synthesized singing voice so beautiful that it is indistinguishable from natural human singing. The inputs to the system are arbitrary lyrics and melody; the output is a natural-sounding singing voice.

Singing Voice Synthesis differs from traditional speech synthesis research in two main respects. First, the focus is on a natural-sounding, beautiful voice rather than on merely intelligible speech. Second, precise control of vowel duration and pitch is essential for producing a singing voice that closely matches the melody. Our approach uses a concatenative synthesizer combined with novel signal processing and speech analysis.
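To make the duration and pitch requirement concrete, the sketch below turns a melody into per-syllable targets: an onset time, a duration in seconds, and a target fundamental frequency. It is only an illustration and not the NTT system's method. The Note class, the function names, the use of MIDI note numbers, and the equal-temperament tuning with A4 = 440 Hz are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Note:
    """One melody note paired with the lyric syllable sung on it (hypothetical type)."""
    syllable: str    # lyric syllable for this note
    midi_pitch: int  # MIDI note number, where 69 is A4
    beats: float     # note length in beats


def midi_to_hz(midi_pitch: int) -> float:
    """Convert a MIDI note number to Hz, assuming equal temperament and A4 = 440 Hz."""
    return 440.0 * 2.0 ** ((midi_pitch - 69) / 12.0)


def note_targets(notes, tempo_bpm):
    """Return (syllable, onset_s, duration_s, f0_hz) for each note, in order."""
    seconds_per_beat = 60.0 / tempo_bpm
    targets = []
    onset = 0.0
    for note in notes:
        duration = note.beats * seconds_per_beat
        targets.append((note.syllable, onset, duration, midi_to_hz(note.midi_pitch)))
        onset += duration  # the next note starts where this one ends
    return targets


# Example: two syllables at 120 beats per minute.
melody = [Note("ra", 69, 1.0), Note("ra", 81, 2.0)]
for syllable, onset, duration, f0 in note_targets(melody, tempo_bpm=120):
    print(f"{syllable}: onset={onset:.2f}s duration={duration:.2f}s f0={f0:.1f}Hz")
```

A synthesizer would then need to stretch or shorten each vowel to fill its duration and shift its pitch toward the target frequency; how that is done is outside the scope of this sketch.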


You can try out the NTT Singing Voice Synthesizer online (currently available in Japanese only). On this website, you can select a melody and enter lyrics in Japanese kana. The system then returns a wave file of the synthesized voice.

Below are some sample wave files generated from the NTT system:

Other Resources

The OGI Festival Singer system webpage, which contains some software and papers, is a good resource for learning about Singing Voice Synthesis.

Last updated: Aug 21, 2004