A voice font is a computer-generated voice that can be made to pronounce text input and can be controlled through parameters such as speed and pitch. The concept is akin to that of a text font or a MIDI instrument, in the sense that the same input can easily be rendered in several different ways depending on the design of each font. Despite current shortcomings in the underlying technology, screen readers and other devices used to make text more accessible to persons with disabilities can benefit from offering more than one voice font, in the same way that users of a traditional word processor benefit from having more than one text font.
The synthesized voice produced with a voice font tends to have a slightly unnatural tone. Human voices change readily with the speaker's mood and several other factors that are not programmed into computerized voices. Voice font software on the Macintosh system tries to get around this by providing tags that change some components of the voice, such as pitch. The Natural Voices software listed in the sources section allows the user to define acronym pronunciation and speech rate, among other settings. Even though speech synthesis has existed since around 1930, according to both that source and the Speech synthesis article, it remains difficult to fool experienced listeners into believing that the voice is indeed human.
This may be similar to the difficulty of achieving true artificial intelligence that can actually pass a Turing test by presenting observers with something indistinguishable from what it is trying to simulate.
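As an illustration of the tag-based control mentioned above, the following is a minimal sketch that prefixes text with inline commands before it is handed to a synthesizer. It assumes the double-bracket embedded command syntax of Apple's classic speech synthesizer, with `pbas` for baseline pitch and `rate` for speaking rate; whether a particular voice honors these commands is an assumption, and the helper name `tagged_text` is hypothetical.

```python
from typing import Optional


def tagged_text(text: str, pitch: Optional[int] = None, rate: Optional[int] = None) -> str:
    """Prefix text with embedded speech commands for pitch and rate.

    Assumes the [[pbas N]] / [[rate N]] embedded-command syntax; a voice
    that does not support these commands may ignore them or read them aloud.
    """
    tags = []
    if pitch is not None:
        tags.append(f"[[pbas {pitch}]]")  # baseline pitch
    if rate is not None:
        tags.append(f"[[rate {rate}]]")   # speaking rate, words per minute
    return " ".join(tags + [text])


if __name__ == "__main__":
    print(tagged_text("Hello there.", pitch=40, rate=180))
```

The function only builds the string; passing it to a synthesizer is left to the caller.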
Like its text counterpart, each voice font can supply a different experience, and a selection of them can serve different purposes. The simplest use is to select, from a group of voice fonts, the one that is clearest, or the one whose speed suits a particular setting.
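Selecting a voice font by speed, as described above, can be sketched in a few lines. The `VoiceFont` record, the font names, and the words-per-minute figures are all hypothetical and chosen only for illustration; a real system would read these properties from the installed voices.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class VoiceFont:
    """Hypothetical description of a voice font by name and default speed."""
    name: str
    words_per_minute: int


def closest_by_speed(fonts: Sequence[VoiceFont], target_wpm: int) -> VoiceFont:
    """Return the font whose default speed is nearest to the requested one."""
    return min(fonts, key=lambda f: abs(f.words_per_minute - target_wpm))


# Illustrative catalogue; the names and speeds are made up.
FONTS = [
    VoiceFont("slow-clear", 120),
    VoiceFont("default", 175),
    VoiceFont("fast", 260),
]

if __name__ == "__main__":
    print(closest_by_speed(FONTS, 150).name)  # prints "default"
```

A clarity score or any other property could be added to the record and used as the selection key in the same way.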
Another use for voice fonts is in electronic music. A commonly available set of synthetic voices on Macintosh computers can be used to enhance the mood of pieces that need a voice but for which the author does not want to use a human one. Here, male voices can be combined in a choir to provide the tenor and bass for a particular piece, and female voices can be added to fill in other parts of the melody. The result is a choir that consists of speech synthesis rather than different singers. A synthetic female voice can also stand in when no female singer is available to the arranger of the music.
Certain Macintosh clients of instant messaging services such as AOL Instant Messenger have had the option of reading incoming messages using the system's voice fonts. When the message receiver has stepped away from the computer, or has temporarily hidden the part of the screen showing the incoming text, the computer reads the message out loud. This allows the user to continue with other tasks without needing to view the incoming text.
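The behavior described above can be approximated with a short sketch. It is not how any particular messaging client was implemented: it assumes the macOS `say` command-line tool with its `-v` voice option, uses "Alex" as an example voice that may not be installed on a given system, and introduces the hypothetical helpers `say_command` and `command_for_incoming`.

```python
from typing import List, Optional


def say_command(message: str, voice: str) -> List[str]:
    """Build the argument list for the macOS `say` tool.

    This sketch does not handle messages that begin with a hyphen,
    which the tool could mistake for an option.
    """
    return ["say", "-v", voice, message]


def command_for_incoming(
    message: str,
    user_away: bool,
    window_hidden: bool,
    voice: str = "Alex",  # example voice name; an assumption
) -> Optional[List[str]]:
    """Return a speech command only when the user cannot see the message."""
    if user_away or window_hidden:
        return say_command(message, voice)
    return None


if __name__ == "__main__":
    print(command_for_incoming("Lunch at noon?", user_away=True, window_hidden=False))
```

On a Macintosh, the returned list could be passed to `subprocess.run` to speak the message; the sketch stops short of that so it can be run anywhere.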
- dot-font: Voice Fonts Speak Volumes
- Project: AT&T Natural Voices Text-to-Speech
- Tone changes using dictionaries