There are 9 vowels and 36 diphthongs, 28 of which are native to Estonian. All nine vowels can appear as the first component of a diphthong, but only /ɑ e i o u/ occur as the second component. A vowel characteristic of Estonian is the unrounded back vowel /ɤ/, which may be mid back, close back, or mid central.
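The constraint on diphthong components puts an upper bound on the inventory. The following is a rough sketch only: it assumes the nine vowel phonemes are /ɑ e i o u ɤ æ ø y/ (the full inventory is not listed in this section) and that a diphthong combines two different vowels. The names `VOWELS`, `SECOND_COMPONENTS` and `candidate_diphthongs` are illustrative.

```python
# Sketch: enumerate candidate diphthongs from the stated constraint.
# Assumption: the nine vowel phonemes are the ones listed here.
VOWELS = ["ɑ", "e", "i", "o", "u", "ɤ", "æ", "ø", "y"]
# Only these five vowels may be the second component (from the text).
SECOND_COMPONENTS = ["ɑ", "e", "i", "o", "u"]


def candidate_diphthongs():
    """Return every first+second pairing of two different vowels."""
    return [
        first + second
        for first in VOWELS
        for second in SECOND_COMPONENTS
        if first != second  # identical pairs would be long vowels
    ]


candidates = candidate_diphthongs()
print(len(candidates))  # 40
```

Under these assumptions the constraint allows at most 40 combinations, so the 36 attested diphthongs are a subset and a few combinations do not occur.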
Simple vowels can be inherently short or long, written with single and double vowel letters respectively. Diphthongs are always inherently long. Furthermore, long vowels and diphthongs have two suprasegmental lengths. This is described under "prosody" further below.
- /n/ is realized as [ŋ] before a velar consonant (e.g. panga /pɑnɡ̊ɑ/ [pɑŋɡ̊ɑ] 'bank [gen.sg.]').
- /f/ and /ʃ/ are considered foreign sounds and they only appear in loanwords. /ʃ/ may be pronounced as [s] by some speakers.
- /b̥ d̥ d̥ʲ ɡ̊/ may be articulated as lax voiceless plosives, or as fully voiced [b d dʲ g] (e.g. kabi 'hoof' [kɑb̥i] ~ [kɑbi]).
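The nasal assimilation rule in the first item above can be sketched as a simple rewrite over a list of segments. This is an illustration, not a full phonological model: the set `VELARS` and the function name `assimilate_nasal` are assumptions, and segments are passed as a list so that symbols built from a base letter plus a combining diacritic stay intact.

```python
# Sketch: /n/ is realized as [ŋ] before a velar consonant.
G = "\u0261"              # IPA ɡ
G_RING = "\u0261\u030a"   # ɡ with a combining ring above (lax voiceless)
VELARS = {"k", G, G_RING}  # assumed set of velar consonant symbols


def assimilate_nasal(segments):
    """Return a copy of segments with n replaced by ŋ before a velar."""
    result = []
    for i, seg in enumerate(segments):
        nxt = segments[i + 1] if i + 1 < len(segments) else None
        if seg == "n" and nxt in VELARS:
            result.append("ŋ")
        else:
            result.append(seg)
    return result


# panga 'bank [gen.sg.]'
print("".join(assimilate_nasal(["p", "ɑ", "n", G_RING, "ɑ"])))  # pɑŋɡ̊ɑ
```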
Like the vowels, most consonants can be inherently short or long. For the plosives, this distinction is reflected as a distinction in tenseness/voicing, with short plosives being voiced and long plosives being voiceless. This distinction only applies fully for single consonants after stressed syllables. In other environments, the length or tenseness/voicing distinctions may be neutralised:
- After unstressed syllables or in consonant clusters, only obstruents can be long; other consonants are always short.
- In consonant clusters, voiced plosives are devoiced when next to another obstruent. That is, voiced plosives only occur next to a vowel or a sonorant.
- Word-initially, obstruents are always voiceless, while the remaining consonants are always short. Recent loanwords may have voiced initial plosives, however.
Long consonants and clusters also have two suprasegmental lengths, like the vowels. This is described under "prosody" further below.
Non-phonemic palatalization generally occurs before front vowels. In addition, about 0.15% of the vocabulary features fully phonemic palatalization, in which a consonant is palatalized without a following front vowel. A front vowel did historically occur there but was lost, leaving the palatalization as its only trace (a form of cheshirization). Phonemic palatalization mostly occurs word-finally, but in some cases it also occurs word-medially. Thus, palatalization does not necessarily require a front vowel, and both palatalized and plain continuants can be articulated. Palatalization is not indicated in the standard orthography.
The stress in Estonian is usually on the first syllable, as was the case in Proto-Finnic. There are a few exceptions with the stress on the second syllable: aitäh "thanks", sõbranna "female friend". In loanwords, the original stress can be borrowed as well: ideaal "ideal", professor "professor". The stress is weak, and as length levels already control an aspect of "articulation intensity", most words appear evenly stressed.
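The default-plus-exceptions pattern described above could be modelled as a lookup with a fallback. This is a minimal sketch: `STRESS_EXCEPTIONS` and `stressed_syllable` are hypothetical names, and only the two second-syllable exceptions named in the text are listed. Loanwords with borrowed stress would need their own entries, which are left out here because the text does not state which syllable carries the stress.

```python
# Sketch: initial stress by default, with a table of listed exceptions.
# Values are 0-based syllable indices; only words named in the text appear.
STRESS_EXCEPTIONS = {"aitäh": 1, "sõbranna": 1}


def stressed_syllable(word):
    """Return the 0-based index of the stressed syllable of word."""
    return STRESS_EXCEPTIONS.get(word.lower(), 0)


print(stressed_syllable("vere"))   # 0
print(stressed_syllable("aitäh"))  # 1
```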
Syllables can be divided into short and long. Syllables ending in a short vowel are short, while syllables ending in a long vowel, diphthong or consonant are long. The length of vowels, consonants and thus syllables is "inherent" in the sense that it is tied to a particular word and is not subject to morphological alternations.
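The short/long classification can be written as a small predicate. The sketch below assumes the word has already been split into syllables and that each syllable is given in the orthography as a vowel nucleus plus an optional closing consonant string; since long vowels are written with double letters and diphthongs with two vowel letters, a nucleus of two or more letters counts as long. The function name `is_long_syllable` is illustrative.

```python
def is_long_syllable(nucleus, coda=""):
    """Return True for a long syllable, False for a short one.

    nucleus: the vowel letters of the syllable, e.g. "e", "ee", "ai"
    coda:    the consonants closing the syllable ("" if it ends in a vowel)
    """
    if coda:
        # A syllable ending in a consonant is long.
        return True
    # A syllable ending in a long vowel or diphthong is long;
    # one ending in a single short vowel is short.
    return len(nucleus) >= 2


print(is_long_syllable("e"))        # False: first syllable of vere
print(is_long_syllable("ee"))       # True: first syllable of veere
print(is_long_syllable("i", "n"))   # True: first syllable of linna
```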
All stressed long syllables can possess a suprasegmental length feature. When a syllable has this feature, any long vowel or diphthong in the syllable is lengthened further, as is any long consonant or consonant cluster at the end of that syllable. A long syllable without suprasegmental length is termed "long", "half-long", "light" or "length II" and is denoted in IPA as /ˑ/ or /ː/. A long syllable with suprasegmental length is termed "overlong", "long", "heavy" or "length III", denoted in IPA as /ː/ or /ːː/. For consistency, this article employs the terms "half-long" and "overlong" and uses /ː/ and /ːː/ respectively to denote them.
Both the regular short-long distinction and the suprasegmental length are distinctive, so that Estonian effectively has three distinctive vowel and consonant lengths, the distinction between the second and third length levels being at the syllable level rather than the phoneme level. The suprasegmental length is not indicated in the standard orthography except for the plosives, where a single voiceless letter represents a half-long consonant, while a double voiceless letter represents an overlong consonant. There are many minimal pairs and also some minimal triplets which differ only by length, such as:
- vere /vere/ 'blood [gen.sg.]' (short); veere /veːre/ 'edge [gen.sg.]' (half-long); veere /veːːre/ 'roll [imp. 2nd sg.]' (overlong)
- lina /linɑ/ 'sheet' (short); linna /linːɑ/ 'town [gen. sg.]' (half-long); linna /linːːɑ/ 'town [ill. sg.]' (overlong)
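The notation used in the triplets above can be reproduced mechanically: the length mark is placed directly after the segment that carries the length. The following sketch assumes the caller already knows where that segment ends; `LENGTH_MARKS` and `transcribe` are illustrative names, not part of any standard tool.

```python
LONG = "\u02d0"  # IPA length mark ː

# Length marks for the three degrees, following this article's convention.
LENGTH_MARKS = {"short": "", "half-long": LONG, "overlong": LONG * 2}


def transcribe(before, after, degree):
    """Build a phonemic transcription with the length mark inserted.

    before: segments up to and including the one that carries length
    after:  the remaining segments of the word
    degree: "short", "half-long" or "overlong"
    """
    return "/" + before + LENGTH_MARKS[degree] + after + "/"


for degree in LENGTH_MARKS:
    print(degree, transcribe("ve", "re", degree))
# short /vere/
# half-long /veːre/
# overlong /veːːre/
```

Calling `transcribe("lin", "ɑ", "overlong")` in the same way yields the overlong form of linna.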
Like palatalization, the extra length is traceable to the loss of vowels, usually at the end of a word. Once the vowels were lost, the preceding syllable received compensatory lengthening. Word-final vowels were always lost after long syllables, but were retained after short ones. This is why only long syllables can have the suprasegmental length. Furthermore, single-syllable words with a long syllable always have suprasegmental lengthening.
In Estonian, sounds often alternate. This is generally between various grades of sound length and sound quality in different grammatical forms of a word; see also vowel gradation, consonant gradation, lenition.
The following alternations can be found:
- Presence vs. absence of suprasegmental length
- Long vs. short consonant
- Nasal + plosive vs. long nasal
- Sonorant + /t/ vs. long sonorant
- Plosive vs. fricative or approximant
- Presence vs. absence of a short plosive, possibly with lowering of the vowels next to it
- /s/ vs. /t/