MBROLA

Original author(s): Thierry Dutoit
Developer(s): Vincent Pagel
Initial release: 1995
Repository: github.com/numediart/MBROLA
Written in: C
Operating system: Linux, Windows, FreeBSD
Type: Speech synthesizer
License: GNU Affero General Public License
Website: tcts.fpms.ac.be/synthesis/mbrola/

MBROLA is speech synthesis software developed as a worldwide collaborative project. The MBROLA project web page provides diphone databases for a large number of spoken languages.[1]

The MBROLA software is not a complete text-to-speech system for those languages: the text must first be converted into phonemic and prosodic information in MBROLA's input format, which requires separate software (e.g. eSpeakNG).
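As a rough illustration of that intermediate format, the following Python sketch builds the text of a MBROLA input (.pho) file. Each line carries a phoneme symbol, a duration in milliseconds, and optional pairs of pitch targets (position as a percentage of the duration, pitch in Hz). The helper name to_pho, the tuple layout, and the sample phoneme symbols are assumptions made for this example; the valid symbols depend on the voice database in use, and a real front end such as eSpeakNG would compute durations and pitch itself.

```python
from typing import List, Sequence, Tuple

# (position in percent of the phoneme duration, pitch in Hz)
PitchPoint = Tuple[float, float]
# (phoneme symbol, duration in milliseconds, pitch points)
Phoneme = Tuple[str, int, Sequence[PitchPoint]]


def to_pho(phonemes: Sequence[Phoneme]) -> str:
    """Format phonemes as .pho text: 'symbol duration [position pitch]...'."""
    lines: List[str] = []
    for symbol, duration_ms, pitch_points in phonemes:
        if duration_ms <= 0:
            raise ValueError(f"duration must be positive for {symbol!r}")
        fields = [symbol, str(duration_ms)]
        for position, pitch_hz in pitch_points:
            if not 0 <= position <= 100:
                raise ValueError(f"pitch position out of range for {symbol!r}")
            fields.append(f"{position:g}")
            fields.append(f"{pitch_hz:g}")
        lines.append(" ".join(fields))
    return "\n".join(lines) + "\n"


# Illustrative input only: "_" marks silence; the other symbols are
# SAMPA-style placeholders and may not match a particular voice database.
HELLO: List[Phoneme] = [
    ("_", 100, []),
    ("h", 70, []),
    ("@", 60, [(50, 120)]),
    ("l", 70, []),
    ("@U", 200, [(0, 125), (100, 95)]),
    ("_", 100, []),
]

PHO_TEXT = to_pho(HELLO)
```

The resulting text would be saved to a file and passed to the synthesizer together with a voice database. Keeping the formatter separate from the linguistic analysis mirrors the division of labour described above.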

History

The MBROLA project started in 1995 at the TCTS Lab of the Faculté polytechnique de Mons (Belgium) as a scientific project to obtain a set of speech synthesizers for as many languages as possible. The first release of the MBROLA software was in 1996, provided as freeware for non-commercial, non-military applications.[2] Licenses for the voice databases differ, but most also restrict use to non-commercial and non-military applications.

Because it was free only for non-commercial use, MBROLA served private and home users as an alternative to eSpeakNG, the de facto speech synthesis engine on Linux workstations, but it was rarely used in commercial solutions (e.g. speaking clocks or boarding announcements at ports and terminals). After the initial development of the voice databases, updates and support for the MBROLA software ceased, and the closed-source binaries gradually fell behind recent hardware and operating systems.[3] To address this, the MBROLA development team decided to release MBROLA as open-source software, and on October 24, 2018, the source code was released on GitHub under the GNU Affero General Public License.

Used technology

The MBROLA software uses the MBROLA (Multi-Band Resynthesis OverLap Add)[4] algorithm for speech generation. Although it is diphone-based, the quality of MBROLA's synthesis is considered higher than that of most diphone synthesisers, because the diphones are preprocessed to impose constant pitch and harmonic phases, which improves their concatenation while only slightly degrading their segmental quality.

Audio: MBROLA voice sample of a Leonhard Euler quote

MBROLA is a time-domain algorithm similar to PSOLA, which implies a very low computational load at synthesis time. Unlike PSOLA, however, MBROLA does not require a preliminary marking of pitch periods. This feature made it possible to build the MBROLA project around the MBROLA algorithm: many speech research labs, companies, and individuals around the world have contributed diphone databases for many languages and voices, although there are some notable omissions, such as Chinese.
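To make the overlap-add idea concrete, here is a generic Python sketch of time-domain overlap-add on a signal whose pitch period is constant and known in advance. It is not MBROLA's actual implementation, and the function names and parameters are assumptions made for this illustration. Because the period is constant, two-period frames can be cut at fixed multiples of the period, with no pitch marking, and re-spaced at a new period to change the pitch.

```python
import math
from typing import List, Sequence


def hann(n: int) -> List[float]:
    """Periodic Hann window; two copies shifted by n/2 sum to exactly 1."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def overlap_add_repitch(signal: Sequence[float], period: int,
                        new_period: int, n_frames: int) -> List[float]:
    """Overlap-add two-period frames of a constant-pitch signal at a new spacing.

    Frames start at fixed multiples of `period` (all values in samples).
    """
    frame_len = 2 * period
    n_source = (len(signal) - frame_len) // period + 1
    if n_source < 1:
        raise ValueError("signal is shorter than one two-period frame")
    window = hann(frame_len)
    out = [0.0] * (new_period * (n_frames - 1) + frame_len)
    for k in range(n_frames):
        # Pick the source frame closest in time to the output position.
        src_index = min(round(k * new_period / period), n_source - 1)
        src = src_index * period
        dst = k * new_period
        for i in range(frame_len):
            out[dst + i] += signal[src + i] * window[i]
    return out


# A 100-sample-period sine stands in for a constant-pitch voiced segment.
PERIOD = 100
tone = [math.sin(2 * math.pi * n / PERIOD) for n in range(1000)]
same_pitch = overlap_add_repitch(tone, PERIOD, PERIOD, n_frames=8)
higher_pitch = overlap_add_repitch(tone, PERIOD, 80, n_frames=8)
```

With an unchanged period, the interior of the output reproduces the input, since the overlapping windows sum to one. The per-sample cost is a multiplication and an addition, which is consistent with the low computational load noted above.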

References

  1. List of MBROLA voices
  2. MBROLA license
  3. Mbrola-64 crashes immediately with a SEGFAULT
  4. Dutoit, T; Leich, H (Dec 1993). "MBR-PSOLA: Text-To-Speech synthesis based on an MBE re-synthesis of the segments database". Speech Communication. 13 (3–4): 435–440. doi:10.1016/0167-6393(93)90042-J.