TIMIT is a corpus of phonemically and lexically transcribed speech of American English speakers of different sexes and dialects. Each transcribed element has been delineated in time.
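As an illustration of what a time-delineated transcription looks like in practice, the following minimal Python sketch parses alignment lines of the form `<start_sample> <end_sample> <label>` and converts the sample offsets to seconds. The line layout and the 16 kHz sampling rate are assumptions about how the corpus files are distributed, not details stated above. The names `Segment` and `parse_alignment` are hypothetical, and the example values are made up for illustration rather than taken from the corpus.

```python
from typing import NamedTuple

# Assumed sampling rate of the recordings (an assumption, not stated in the text).
SAMPLE_RATE_HZ = 16_000


class Segment(NamedTuple):
    """One transcribed element with its time boundaries in seconds."""
    start_s: float
    end_s: float
    label: str


def parse_alignment(text: str, sample_rate: int = SAMPLE_RATE_HZ) -> list[Segment]:
    """Parse lines of '<start_sample> <end_sample> <label>' into Segments."""
    segments = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        start, end, label = line.split(maxsplit=2)
        segments.append(
            Segment(int(start) / sample_rate, int(end) / sample_rate, label)
        )
    return segments


# Illustrative, made-up alignment data (not copied from the corpus).
EXAMPLE = """\
0 3200 h#
3200 4800 sh
4800 6400 iy
"""

for seg in parse_alignment(EXAMPLE):
    print(f"{seg.start_s:.3f}-{seg.end_s:.3f} s  {seg.label}")
```

The same function would apply to word-level and phone-level alignments alike, provided both use this three-column layout.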
TIMIT was designed to further acoustic-phonetic knowledge and the development of automatic speech recognition systems. It was commissioned by DARPA and developed across many sites, including Texas Instruments (TI) and the Massachusetts Institute of Technology (MIT), hence the corpus's name. There is also a telephone-bandwidth version called NTIMIT (Network TIMIT).
TIMIT and NTIMIT are not freely available: access to the dataset requires either membership in the Linguistic Data Consortium or a monetary payment.
- Fisher, William M.; Doddington, George R.; Goudie-Marshall, Kathleen M. (1986). The DARPA Speech Recognition Research Database: Specifications and Status. pp. 93–99.