Proto-Indo-European phonology

From Wikipedia, the free encyclopedia

The phonology of the Proto-Indo-European language (PIE) has been reconstructed by linguists, based on the similarities and differences among current and extinct Indo-European languages. Because PIE was not written, linguists must rely on the evidence of its earliest attested descendants, such as Hittite, Sanskrit, Ancient Greek, and Latin in order to reconstruct its phonology.

The reconstruction of abstract units of PIE phonological systems (i.e. segments, or phonemes in traditional phonology) is much less controversial than their phonetic interpretation. This especially pertains to the phonetic interpretation of PIE vowels, laryngeals and voiced stops.

Phonemic inventory

Proto-Indo-European is traditionally reconstructed as having used the following phonemes. See the article on Indo-European sound laws for a summary of how these phonemes are reflected in the various Indo-European languages.


Proto-Indo-European consonant segments

                      Labial   Coronal   Dorsal                        Laryngeal
                                         palatal   plain   labial
Nasal                 *m       *n
Stop     voiceless    *p       *t        *ḱ        *k      *kʷ
         voiced       (*b)     *d        *ǵ        *g      *gʷ
         aspirated    *bʰ      *dʰ       *ǵʰ       *gʰ     *gʷʰ
Fricative                      *s                                      *h₁, *h₂, *h₃
Liquid                         *r, *l
Semivowel                                *y [j]            *w

The table gives the most common notation in modern publications. Variant transcriptions are given below. Raised ʰ stands for aspiration.


PIE */p/, */b/, */bʰ/ are conveniently grouped with the cover symbol P. The phonemic status of */b/ is disputed: it appears only in a handful of reconstructible roots that are themselves often disputed. The reconstructed roots containing */b/ are usually confined to a few Indo-European branches, likely representing a late PIE dialectalism.[citation needed]

Some[who?] have attempted to explain away the few roots with */b/ as the result of recent phonological developments. Suggested developments include:

  • *ml- > *bl-, connecting the root *bel- 'power, strength' (> Sanskrit bálam, Ancient Greek beltíōn) with mel- in Latin melior, and *h₂ebl-/*h₂ebōl 'apple' with a hypothetical earlier form *h₂eml-, which is in unmetathesized form attested in another reconstructible PIE word for apple, *méh₂lom (> Hittite maḫla-, Latin mālum, Ancient Greek mēlon).
  • In the PIE cluster *ph₃, the *p regularly gives *b; for example, the reduplicated present stem of *peh₃- 'to drink' > *pi-ph₃- > Sanskrit píbati.

At best, PIE */b/ remains a highly marginal phoneme.


The standard reconstruction identified three coronal/dental stops: */t/, */d/, */dʰ/. They are symbolically grouped with the cover symbol T.

In so-called "thorn clusters" of the form TK, a metathesis occurred in all branches except Anatolian and Tocharian, resulting in dorsal-coronal clusters of non-obvious phonetic makeup. Metathesized and unmetathesized forms survive in different ablaut grades of the root *dʰégʷʰ "burn" (whence also English day) in Sanskrit: dáhati "is being burnt" < *dʰégʷʰ-e- and kṣā́yat "burns" < *dʰgʷʰ-éh₁-. See the section on PIE phonological rules, below, for more discussion and examples.


According to the traditional reconstruction, such as the one laid out in Brugmann's Grundriss der vergleichenden Grammatik der indogermanischen Sprachen more than a century ago, three series of velars are reconstructed for PIE:

  • "Palatovelars" (or simply "palatals"), */ḱ/, */ǵ/, */ǵʰ/ (also transcribed */k'/, */g'/, */g'ʰ/ or */k̑/, */g̑/, */g̑ʰ/ or */k̂/, */ĝ/, */ĝʰ/).
  • "Plain velars" (or "pure velars"), */k/, */g/, */gʰ/.
  • Labiovelars, */kʷ/, */gʷ/, */gʷʰ/ (also transcribed */ku̯/, */gu̯/, */gu̯h/). Raised ʷ stands for labialisation (lip-rounding) accompanying the articulation of velar sounds.

The terms "palatovelar" and "plain velar" are in quotes because they are traditional terms but are unlikely to represent the actual pronunciation of these sounds in PIE.[citation needed] One current idea is that the "palatovelars" were in fact simple velars, i.e. *[k], *[g], *[gʰ], while the "plain velars" were pronounced farther back, perhaps actually uvular consonants, i.e. *[q], *[ɢ], *[ɢʰ].[citation needed] Meanwhile, the labiovelars were exactly like the "plain velars" but labialized, i.e. *[qʷ], *[ɢʷ], *[ɢʷʰ] in the current understanding.[citation needed] These conclusions are suggested by the following evidence:

  • The "palatovelar" series was the most common; meanwhile the "plain velar" was by far the least common, and never occurred in any affixes. In known languages with multiple velar series, the normal velar series is usually the most common.
  • There is no evidence of palatalization in the early history of the velars in any of the Centum branches. If the "palatovelars" were in fact palatalized in PIE, a single, very early, uniform depalatalization would have had to occur in all (and only) the Centum branches. Depalatalization is cross-linguistically far less likely than palatalization, so it is unlikely to have occurred separately in each Centum branch; had it done so, it would almost certainly have left evidence of prior palatalization in some of the branches. However, there is no evidence that the Centum branches ever formed a clade (i.e. shared a common ancestor later than PIE as a whole, in which the putative depalatalization could have occurred). On the contrary, some evidence indicates that Anatolian and Tocharian, two of the Centum branches, were the first and second branches, respectively, to split off from PIE, although this is uncertain and disputed. (Note, however, that the Luwian branch of Anatolian seems to have satem reflexes, and Melchert also reconstructs the traditionally assumed PIE palatal stops for Proto-Anatolian.)

Satem and Centum languages
Main article: Centum-Satem isogloss

The Satem group of languages merged the labiovelars *kʷ, *gʷ, *gʷʰ with the plain velar series *k, *g, *gʰ, while the palatovelars *ḱ, *ǵ, *ǵʰ became sibilant fricatives or affricates of various types, depending on the individual language. In some phonological conditions depalatalization occurred, yielding what appears to be a Centum reflex in a Satem language. For example, in Balto-Slavic and Albanian, palatovelars were depalatalized before resonants unless the latter were followed by a front vowel. The reflexes of the labiovelars are generally indistinguishable from those of the plain velars in Satem languages, but there are some words where the lost labialization has left a trace, such as by u-coloring a following vowel.

The Centum group of languages, on the other hand, merged the palatovelars *ḱ, *ǵ, *ǵʰ with the plain velar series *k, *g, *gʰ, while the labiovelars *kʷ, *gʷ, *gʷʰ were kept distinct. Analogous to the depalatalization of the Satem languages, the Centum languages show delabialisation of labiovelars when adjacent to *w (or its allophone *u), according to a rule known as the boukólos rule.

If the palatovelar and plain velar series were in fact velar and uvular respectively, then the split between the Centum and Satem groups would not have been a straightforward loss of an articulatory feature (palatalization or labialization). Instead, under this interpretation, the uvulars *q, *ɢ, *ɢʰ (the "plain velars" of the traditional reconstruction) were fronted to velars across all branches. In the Satem languages this caused a chain shift, where the existing velars (traditionally "palatovelars") were shifted further forward to avoid a merger, becoming palatal: /k/ > /c/; /q/ > /k/. In the Centum languages, no chain shift occurred, and the uvulars merged into the velars. The delabialization in the Satem languages would have occurred later, in a separate stage.
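
The traditional account of the two mergers described above can be summarized in a small lookup. The following is only a minimal sketch: the function name `reflex_series` and the string labels are illustrative assumptions, and conditioned developments (depalatalization before resonants, the boukólos rule, traces of lost labialization) are deliberately ignored.

```python
# Dorsal series of the traditional three-way reconstruction.
SERIES = ("palatovelar", "plain velar", "labiovelar")


def reflex_series(series: str, group: str) -> str:
    """Return the class a PIE dorsal series ends up in after the merger.

    `group` is "centum" or "satem". Only the unconditioned mergers
    stated in the text are modelled.
    """
    if series not in SERIES:
        raise ValueError(f"unknown dorsal series: {series!r}")
    if group == "centum":
        # Palatovelars merge with plain velars; labiovelars stay distinct.
        return "plain velar" if series == "palatovelar" else series
    if group == "satem":
        # Labiovelars merge with plain velars; palatovelars become
        # sibilant fricatives or affricates, depending on the language.
        if series == "labiovelar":
            return "plain velar"
        if series == "palatovelar":
            return "sibilant or affricate"
        return series
    raise ValueError(f"unknown group: {group!r}")


for s in SERIES:
    print(f"{s:12} centum: {reflex_series(s, 'centum'):12} "
          f"satem: {reflex_series(s, 'satem')}")
```

Either way, each group ends up with two dorsal classes where the traditional reconstruction has three.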

Three velar series

The existence of all three dorsal columns (series) has been disputed since the beginning of Indo-European studies. Today, most PIE linguists believe that all three series were distinct by the time of Late Proto-Indo-European, although a minority believe that the distinction between plain velar and palatovelar consonants was a later development of certain Satem languages; this belief was originally articulated by Antoine Meillet in 1894 and argued more recently by Frederik Kortlandt and others.[1] This argument contends that PIE had only two series, a simple velar and a labiovelar. The Satem languages palatalized the plain velar series in most positions, but the plain velars remained in some environments. These environments are typically reconstructed as before or after /u/, after /s/, and before /r/ or /a/; also apparently before /m/ and /n/ in some Baltic dialects. (This is conceptually similar to the change in Proto-Germanic whereby e.g. /t/ became /θ/ in most instances, but remained as /t/ after original /s/, /k/ or /p/.) The original allophonic distinction was disturbed when the labiovelars were merged with the plain velars. This produced a new phonemic distinction between palatal and plain velars, with an unpredictable alternation between palatal and plain in related forms of some roots (those from original plain velars) but not others (those from original labiovelars). Subsequent analogical processes generalized either the plain or palatal consonant in all forms of a particular root. Those roots where the plain consonant was generalized are those traditionally reconstructed as having "plain velars" in the parent language, in contrast to "palatovelars".

The basic arguments in favor of two velar series are:

  • The plain velar series is statistically rarer than the other two, is almost entirely absent from affixes, and appears most often in certain phonological environments (described above).
  • Alternations between plain velars and palatals are common in a number of roots across different Satem languages, where the same root appears with a palatal in some languages but a plain velar in others (most commonly Baltic or Slavic; occasionally Armenian, but rarely or never the Indo-Iranian languages). This is consistent with the analogical generalization of one or another consonant in an originally alternating paradigm, but difficult to explain otherwise.
  • The above explanation suggests that in Late PIE times the Satem languages were in close contact with each other. This is confirmed by independent evidence: The geographical closeness of current Satem languages and certain other shared innovations (the Ruki sound law and early palatalization of velars before front vowels).
  • The traditional explanation of a three-way dorsal split requires that all Centum languages share a common innovation that eliminated the palatovelar series. Unlike for the Satem languages, however, there is no evidence of any areal connection among the Centum languages, and in fact there is evidence against such a connection—the Centum languages are geographically noncontiguous. Furthermore, if such an areal innovation happened, we would expect to see some dialect differences in its implementation (cf. the above differences between Balto-Slavic and Indo-Iranian), and residual evidence of a distinct palatalized series (such evidence for a distinct labiovelar series does exist in the Satem languages; see below). In fact, however, neither type of evidence exists, suggesting that there was never a palatovelar series in the Centum languages.

The basic arguments in favor of three velar series are:

  • Many instances of plain velars occur in roots that have no evidence of any of the putative environments that trigger plain velars, and no obvious mechanism for the plain velar to have come in contact with any such environment; as a result, the comparative method requires us to reconstruct three series.
  • Evidence from the Anatolian language Luwian attests a three-way velar distinction *ḱ > z (probably [ts]); *k > k; *kʷ > ku (probably [kʷ]).[2] There is no evidence of any connection between Luwian and any Satem language (labiovelars are still preserved, Ruki sound law is absent), and the Anatolian branch split off very early from PIE. Hence, the three-way distinction must be reconstructed for the parent language. (This is a strong argument in favor of the traditional three-way system; in response, proponents of the two-way system have attacked the underlying evidence, claiming that it "hinges upon especially difficult or vague or otherwise dubious etymologies" (e.g. Sihler 1995).) Melchert originally claimed that the change *ḱ > z was unconditional, and subsequently revised the assertion to a conditional change occurring only before front vowels, /y/, or /w/; however, this does not fundamentally alter the situation, as plain-velar *k apparently remains as such in the same context. Melchert also asserts that, contrary to Sihler, the etymological distinction between *ḱ and *k in the relevant positions is well-established.[3]
  • According to Ringe (2006), there are root constraints that prevent the occurrence of a "palatovelar" and labiovelar, or two "plain velars", in the same root; but these do not apply to roots containing, e.g. a palatovelar and plain velar.

The Satem languages preserve residual evidence of various sorts for a former distinction between velar and labiovelar consonants:

  • In Sanskrit and Balto-Slavic, in some environments, resonant consonants (denoted by /R/) become /iR/ after plain velars but /uR/ after labiovelars.
  • In Armenian, some linguists assert that /kʷ/ is distinguishable from /k/ before front vowels.[4]
  • Some linguists assert that in Albanian, /kʷ/ and /gʷ/ are distinguishable from /k/ and /g/ before front vowels.[5]

This evidence shows that the labiovelar series was distinct from the plain velar series in PIE, and cannot have been a secondary development in the Centum languages; but it says nothing about the palatovelar vs. plain velar series.

In addition, modern proponents of the three-way distinction do not deny the final two points made in the arguments in favor of the two-way distinction, concerning the unity of the Satem group and lack of such unity in the Centum group. Rather, they claim that the Centum change did indeed occur independently in multiple Centum subgroups (at the very least, Tocharian, Anatolian and Western IE), but was a phonologically natural change given the current interpretation of the "palatovelar" series as plain-velar and the "plain velar" series as back-velar or uvular, and given the minimal functional load of the plain-velar/palatovelar distinction. Since there was never any palatalization in the IE dialects leading to the Centum languages, there is no reason to expect any palatal residues; furthermore, it is phonologically entirely natural that a former plain-velar vs. back-velar/uvular distinction would leave no distinctive residues on adjacent segments.

It is quite possible to use the traditional three-way distinction while remaining agnostic on the issue of whether it represents the actual state of the parent language or is an artifact of later developments in the Satem branch. It is also quite possible to take a compromise position asserting that the three-way distinction did indeed exist in late PIE but simultaneously did in fact develop from an earlier two-way distinction through the same mechanism and in the same environments traditionally claimed to have triggered the plain/palatal distinction in the Satem languages.


The only certain PIE fricative phoneme */s/ was a strident sound, whose phonetic realization could possibly range from [s] to palatalized [ɕ] or [ʃ]. It had a voiced allophone *z that emerged by assimilation in words such as *nisdós 'nest', and which later became phonemicized in some daughter languages. Some PIE roots have variants with *s appearing initially: such *s is called s-mobile.

The "laryngeals" may have been fricatives, but there is no consensus as to their phonetic realization.


Main article: Laryngeal theory

The phonemes */h₁/, */h₂/, */h₃/, with cover symbol H also denoting "unknown laryngeal" (or ə₁, ə₂, ə₃ and ə), stand for three "laryngeal" phonemes. One should note that the term laryngeal as a phonetic description is out of date, retained only because its usage has become standard in the field.

The phonetic value of the laryngeal phonemes is disputed. Suggestions range from cautious claims that all that can be said with certainty is that */h₂/ represented a velar fricative pronounced far back in the mouth and that *h₃ exhibited lip-rounding, to more definite proposals; e.g. Meier-Brügger writes that realizations of *h₁ = [h], *h₂ = [χ] and *h₃ = [ɣ] or [ɣʷ] "are in all probability accurate".[6] Other commonly cited speculations for *h₁ *h₂ *h₃ are ʔ ʕ ʕʷ (e.g. Beekes) and x χ~ħ xʷ.[who?] It is sometimes claimed[citation needed] that *h₁ may have been two consonants, ʔ and h, that fell together. A consensus seems to be emerging, however, that *h₁ is unlikely to have been a glottal stop /ʔ/, as all three laryngeals pattern similarly to each other and to fricatives in other languages (and similarly to PIE /s/, the only other fricative). It is possible, however, that all three laryngeals ultimately fell together as a glottal stop in some languages. Evidence for this development in Balto-Slavic comes from the eventual development of post-vocalic laryngeals into a register distinction commonly described as "acute" (vs. "circumflex" register on long vocalics not originally closed by a laryngeal) and marked in some fashion on all long syllables, whether stressed or not; furthermore, in some circumstances original acute register is reflected by a "broken tone" (i.e. glottalized vowel) in modern Latvian.

The schwa indogermanicum symbol ə is commonly used for a laryngeal between consonants, in a "syllabic" position.

Glottalic theory

Main article: Glottalic theory

The phonetic values of the three stop series are traditionally reconstructed as voiceless (e.g. */t/), voiced (e.g. */d/) and voiced aspirated (e.g. */dʰ/). However, this system is not found in any descendant language (Sanskrit still has all three, but has added a fourth series of voiceless aspirates, e.g. /tʰ/), and is vanishingly rare among recorded languages. The rarity of */b/ is also unusual. Additionally, PIE roots are subject to a constraint that prohibits roots mixing voiceless and voiced aspirate stops, as well as roots containing two voiced stops. These facts have led some scholars to reassess this part of the reconstruction, replacing the voiced stops with glottalized stops and the voiced aspirated stops with plain voiced stops. Direct evidence for glottalization is limited, but there is some indirect evidence, including Winter's law in Balto-Slavic and the fact that the voiceless consonants and the voiced aspirate consonants develop in parallel in Germanic, with both becoming fricatives while the glottalized (plain voiced in traditional theory) consonants remain stops.
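
The root constraint mentioned in this paragraph can be expressed as a simple check over the stops of a root. This is only a sketch under stated assumptions: roots are given as pre-segmented lists in the traditional notation, the name `root_allowed` is illustrative, and known complications not discussed here (for example roots beginning with *s) are not modelled.

```python
# Stop series in the traditional notation (see the consonant table).
VOICELESS = {"p", "t", "ḱ", "k", "kʷ"}
VOICED = {"b", "d", "ǵ", "g", "gʷ"}
ASPIRATED = {"bʰ", "dʰ", "ǵʰ", "gʰ", "gʷʰ"}


def root_allowed(segments):
    """Check a root against the two constraints stated in the text.

    A root is rejected if it mixes a voiceless stop with a voiced
    aspirate, or if it contains two plain voiced stops.
    """
    has_voiceless = any(s in VOICELESS for s in segments)
    has_aspirated = any(s in ASPIRATED for s in segments)
    voiced_count = sum(1 for s in segments if s in VOICED)
    if has_voiceless and has_aspirated:
        return False
    if voiced_count >= 2:
        return False
    return True


print(root_allowed(["bʰ", "e", "w", "dʰ"]))  # two voiced aspirates
print(root_allowed(["t", "e", "bʰ"]))        # voiceless + voiced aspirate
print(root_allowed(["d", "e", "g"]))         # two plain voiced stops
```

The first shape passes both constraints, while the second and third are hypothetical shapes of the kind the constraint rules out.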


In a phonological sense, sonorants in Proto-Indo-European were those segments that could appear both in the syllable nucleus (i.e. they could be syllabic) and out of it (i.e. they could be non-syllabic). PIE sonorants are the liquids, nasals and glides: */r/, */l/, */m/, */n/, */y/ (or *i̯), */w/ (or *u̯), all grouped with the cover symbol R.

All of them had allophones in a syllabic position, which is generally between consonants, word-initially before consonants and word-finally after a consonant. They are marked as: *r̥, *l̥, *m̥, *n̥, *i, *u. Note that, even though *i and *u were phonetically certainly vowels, phonologically they were syllabic sonorants.


  • Proto-Celtic, Albanian, Proto-Balto-Slavic and Proto-Iranian merged the voiced aspirated series */bʰ/, */dʰ/, */ǵʰ/, */gʰ/, */gʷʰ/ with the plain voiced series */b/, */d/, */ǵ/, */g/, */gʷ/. (In Proto-Balto-Slavic this postdated Winter's law. Proto-Celtic retains the distinction between */gʷʰ/ and */gʷ/: the former became */gw/ while the latter became */b/.)
  • Proto-Germanic underwent Grimm's law, changing voiceless stops into fricatives, devoicing unaspirated voiced stops, and fricativizing and deaspirating voiced aspirates.
  • Grassmann's law (Tʰ-Tʰ > T-Tʰ, e.g. dʰi-dʰeh₁- > di-dʰeh₁-) and Bartholomae's law (TʰT > TTʰ, e.g. budʰ-to- > bud-dʰo-) describe the behaviour of aspirates in particular contexts in some early daughter languages.
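
The two aspirate laws in the last item can be sketched as rewrites over a list of segments. This is a hedged illustration, not a full statement of either law: the function names and lookup tables are assumptions, words are pre-segmented by hand, syllable structure is ignored (Grassmann's law is simplified to "an aspirate followed anywhere later by another aspirate"), and the voicing of the second stop in Bartholomae's law is read off the example budʰ-to- > bud-dʰo-.

```python
# Voiced aspirate -> plain voiced counterpart.
DEASPIRATE = {"bʰ": "b", "dʰ": "d", "ǵʰ": "ǵ", "gʰ": "g", "gʷʰ": "gʷ"}
# Voiceless stop -> voiced aspirate counterpart (used by Bartholomae's law).
VOICED_ASPIRATE_OF = {"p": "bʰ", "t": "dʰ", "ḱ": "ǵʰ", "k": "gʰ", "kʷ": "gʷʰ"}


def grassmann(segments):
    """Tʰ-Tʰ > T-Tʰ: deaspirate an aspirate if another aspirate follows."""
    out = list(segments)
    for i, seg in enumerate(out):
        if seg in DEASPIRATE and any(s in DEASPIRATE for s in out[i + 1:]):
            out[i] = DEASPIRATE[seg]
    return out


def bartholomae(segments):
    """TʰT > TTʰ: aspiration (and voicing) moves onto the following stop."""
    out = list(segments)
    for i in range(len(out) - 1):
        if out[i] in DEASPIRATE and out[i + 1] in VOICED_ASPIRATE_OF:
            out[i], out[i + 1] = DEASPIRATE[out[i]], VOICED_ASPIRATE_OF[out[i + 1]]
    return out


# dʰi-dʰeh₁- > di-dʰeh₁-
print("".join(grassmann(["dʰ", "i", "dʰ", "e", "h₁"])))
# budʰ-to- > bud-dʰo-
print("".join(bartholomae(["b", "u", "dʰ", "t", "o"])))
```

Applying `bartholomae` and then `grassmann` to the segments of *bʰudʰ-to- gives b-u-d-dʰ-o, which matches the ordering implied by the Sanskrit form buddhá.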

Sanskrit, Greek, and Germanic, along with Latin to some extent, are the most important for reconstructing PIE consonants, as all of these languages keep the three series of stops (voiceless, voiced and voiced-aspirated) separate. In Germanic, Verner's law and changes to labiovelars (especially outside of Gothic) obscure some of the original distinctions; but on the other hand, Germanic is not subject to the assimilations of Grassmann's law, which affects both Greek and Sanskrit. Latin also keeps the three series separate, but largely obscures the distinctions among voiced-aspirated consonants in initial position (all except /gh/ become /f/) and collapses many distinctions in medial position. Greek is especially important for reconstructing labiovelars, as other languages tend to delabialize them in many positions.

Anatolian and Greek are the most important languages for reconstructing the laryngeals. Anatolian directly preserves many laryngeals, while Greek preserves traces of laryngeals in positions (e.g. at the beginning of a word) where they disappear in many other languages, and reflects each laryngeal differently from the others (the so-called triple reflex) in most contexts. Balto-Slavic languages are sometimes important in reconstructing laryngeals, since they are fairly directly represented in the distinction between "acute" and "circumflex" vowels. Old Avestan faithfully preserves numerous relics (e.g. laryngeal hiatus, laryngeal aspiration, laryngeal lengthening) triggered by ablaut alternations in laryngeal-stem nouns, but the paucity of the Old Avestan corpus prevents it from being more useful. Vedic Sanskrit preserves the same relics rather less faithfully, but in greater quantity, making it sometimes useful.



It is disputed how many vowels Proto-Indo-European (PIE) had, as well as what counts as a "vowel" in that language. It is generally agreed that at least four vowel segments existed, normally denoted as */e/, */o/, */ē/ and */ō/. All of these vowels are morphologically conditioned to varying extents. The two long vowels are less common than the short vowels and their morphological conditioning is especially strong, suggesting that in an earlier stage there may not have been a length opposition, and a system with as few as two vowels (or even only one vowel, according to some researchers) may have existed.

In addition, the surface vowels *i and *u were extremely common, and syllabic sonorants *r̥, *l̥, *m̥, *n̥ existed. All of these alternate in a syllabic position with sonorant consonants *y, *w, *r, *l, *m, *n. For example, the root of the PIE word *yugóm "yoke" with a *u also appears in the verb *yewg- "to yoke, harness, join" with *w. Similarly, the PIE word *dóru "tree, wood" is reconstructed with genitive singular *dréws and dative plural *drúmos. Some authors (e.g. Ringe (2006)) have argued that there is strong evidence for reconstructing a non-alternating phoneme *i in addition to an alternating phoneme *y, as well as weaker evidence for a non-alternating phoneme *u.

In addition, all daughter Indo-European languages have a segment */a/, and those languages with long vowels generally have long /aː/ /iː/ /uː/. Up until the mid-20th century, PIE was reconstructed with all of these vowels. Modern versions incorporating the laryngeal theory, however, tend to view these vowels as later developments of sounds that should be reconstructed in PIE as involving the laryngeals *h₁, *h₂, *h₃. For example, what used to be reconstructed as PIE *ā is now reconstructed as *eh₂; *ī, *ū are now reconstructed as */iH/ */uH/, where *H represents any laryngeal; and *a has various origins, among which are a "syllabic" [H̩] (i.e. any laryngeal when not adjacent to a vowel) or an *e next to the "a-coloring" laryngeal *h₂. Some researchers, however, have argued that a phoneme *a must be reconstructed that cannot be traced back to any laryngeal.
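
The correspondences between the laryngeal notation and the older vowel notation named in this paragraph can be illustrated with a small converter. It is a sketch only: the name `older_notation` is an assumption, input is a hand-segmented list without accent marks, and only the correspondences stated above are covered (*eh₂ as *ā, *h₂e as *a, *iH and *uH as *ī and *ū). Syllabic laryngeals, the other laryngeals next to *e, and positional conditions are left untouched.

```python
LARYNGEALS = {"h₁", "h₂", "h₃"}
LONG_HIGH = {"i": "ī", "u": "ū"}


def older_notation(segments):
    """Rewrite laryngeal sequences into the pre-laryngeal vowel notation."""
    out = []
    i = 0
    while i < len(segments):
        seg = segments[i]
        nxt = segments[i + 1] if i + 1 < len(segments) else None
        if seg == "e" and nxt == "h₂":
            out.append("ā")            # *eh₂ -> older *ā
            i += 2
        elif seg == "h₂" and nxt == "e":
            out.append("a")            # *h₂e -> older *a
            i += 2
        elif seg in LONG_HIGH and nxt in LARYNGEALS:
            out.append(LONG_HIGH[seg])  # *iH, *uH -> older *ī, *ū
            i += 2
        else:
            out.append(seg)            # everything else is copied unchanged
            i += 1
    return out


# *meh₂lom 'apple' (accent omitted); compare Latin mālum
print("".join(older_notation(["m", "e", "h₂", "l", "o", "m"])))
```

The sketch treats the correspondence as a notational one; it makes no claim about the order or conditions of the actual sound changes in any daughter language.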

Any of the sonorant consonants can comprise the second part of a complex syllable nucleus, i.e. they can form diphthongs with any of the vowels *e, *o, *ē, *ō; e.g. *ey, *oy, *ēy, *ōy, */ew/, */ow/, */em/, */en/, etc.

Lengthened vowels

In certain morphological conditions (e.g. as a result of Proto-Indo-European ablaut) and phonological conditions (e.g. in the last syllable of the nominative singular of a noun ending in a sonorant, in the root syllable of the sigmatic aorist, etc.; cf. Szemerényi's law, Stang's law), the vowels *e and *o would lengthen, yielding the respective lengthened-grade variants. The basic, lexical forms of words in PIE therefore contain only short vowels; forms with the long vowels *ē and *ō arise through well-established morphophonological rules.

Lengthening of vowels may have been a phonologically conditioned change in Early Proto-Indo-European. However, at the period just before the dissolution of the Proto-Indo-European speaking community, which is the stage usually reconstructed, it is not possible to predict phonologically the appearance of all long vowels, because the phonologically justified long vowels had begun to spread analogically to forms in which they were not phonologically justified. Hence, the prosodically long */e/ in *ph₂tḗr 'father' results from the application of Szemerényi's law, a synchronic phonological rule that operated within PIE, but the prosodically long */o/ in *pṓds 'foot' is analogically leveled.


It is possible that Proto-Indo-European had a few morphologically isolated words that contained the vowel *a, e.g. *dap- 'sacrifice' (Latin daps, Ancient Greek dapánē, Old Irish dúas), or with *a as the first part of a diphthong *ay, e.g. *laywos 'left' (Latin laevus, Ancient Greek laiós, OCS lěvъ). The phonemic status of *a has been fiercely disputed; for example, Beekes[7] expressly concludes: "There are thus no grounds for PIE phoneme *a", and the same conclusion is reached by his former student Alexander Lubotsky.[8] After the discovery of Hittite and the advent of laryngeal theory, basically every instance of previous *a could be reduced to the vowel *e either preceded or followed by the laryngeal *h₂ (rendering the previously reconstructed short and long *a, respectively). Several arguments can be made against a PIE phoneme *a, which some Indo-Europeanists still posit today. The vowel *a does not participate in ablaut alternations (i.e. it does not alternate with other vowels, as the "real" PIE vowels *e, *o, *ē, *ō do). It makes no appearance in suffixes and endings. It appears in a very confined set of positions (usually after initial *k, which could be the result of that phoneme being a-coloring, particularly likely if it was in fact uvular /q/). Finally, the reflexes of words for which *a is reconstructed are usually confined to a few Indo-European languages, which makes it possible to ascribe them to some late PIE dialectalism; or they are of expressive character and thus unsuitable for comparative analysis; or they are argued to have been borrowed from some other language that had phonemic *a (e.g. Proto-Semitic *θawru > PIE *táwros "wild bull, aurochs").

However, others, like Mayrhofer,[9] argue that PIE did in fact have *a and *ā phonemes independent of *h₂.


Ancient Greek reflects the original PIE vowel system most faithfully, with few changes to PIE vowels in any syllable; however, loss of certain consonants, especially */s/, */w/ and */y/, often triggers compensatory lengthening or contraction of vowels in hiatus, which can complicate reconstruction.

Sanskrit and Avestan merge */e/, */a/ and *o into a single vowel */a/ (with a corresponding merger in the long vowels), but reflect PIE length differences (especially due to ablaut) even more faithfully than Greek, and do not have the same issues with consonant loss that Greek does. Furthermore, /o/ can often be reconstructed through Brugmann's law, and /e/ through the "law of palatals" (see Proto-Indo-Iranian language).

Germanic languages show merger of long and short */a/ and */o/, as well as the merger of */e/ and */i/ in non-initial syllables, but (especially in the case of Gothic) are still important for reconstructing PIE vowels. Balto-Slavic languages are similar, again showing merger of short */a/ and */o/ (and for Slavic languages, also long */a/ and */o/).

Evidence from Anatolian and Tocharian can be important due to the archaism of these languages, but is often difficult to interpret; Tocharian, especially, has complex and far-reaching vowel innovations.

Italic languages and Celtic languages do not unilaterally merge any vowels, but have such far-reaching vowel changes (especially in the case of the Celtic languages) that they are somewhat less useful for PIE. Albanian and Armenian are least useful, as they are attested relatively late, have borrowed heavily from other languages, and have complex and ill-understood vowel changes.

In Proto-Balto-Slavic, short PIE vowels were preserved, with the change of */o/ > */a/ as in Proto-Germanic. A separate reflex of the original *o or *a is, however, argued to have been retained in some environments as a lengthened vowel, due to the effect of Winter's law. Subsequently, Early Proto-Slavic merged *ō and *ā, which were retained in Baltic languages. Additionally, accentual differences in some Balto-Slavic languages indicate whether a post-PIE long vowel originated from a genuine PIE lengthened grade or resulted from the "laryngeal coloring" mechanism.


PIE had a free pitch accent, which could appear on any syllable and whose position often varied among different members of a paradigm (e.g. between singular and plural of a verbal paradigm, or between nominative/accusative and oblique cases of a nominal paradigm). The location of the pitch accent is closely associated with ablaut variations, especially between normal-grade vowels (/e/ and /o/) and zero-grade vowels (i.e. lack of a vowel).

Generally, thematic nouns and verbs (those with a "thematic vowel" between root and ending, usually /e/ or /o/) had a fixed accent, which (depending on the particular noun or verb) could be either on the root or the ending. These words also had no ablaut variations within their paradigms. (However, accent and ablaut were still associated; for example, thematic verbs with root accent tended to have e-grade ablaut in the root, while those with ending accent tended to have zero-grade ablaut in the root.) On the other hand, athematic nouns and verbs usually had mobile accent, which varied between strong forms, with root accent and full grade in the root (e.g. the singular active of verbs, and the nominative and accusative of nouns), and weak forms, with ending accent and zero grade in the root (e.g. the plural active and all forms of the middle of verbs, and the oblique cases of nouns). Some nouns and verbs, on the other hand, had a different pattern, with ablaut variation between lengthened and full grade and mostly fixed accent on the root; these are termed Narten stems. Additional patterns exist for both nouns and verbs. For example, some nouns (so-called acrostatic nouns, one of the oldest classes of noun) have fixed accent on the root, with ablaut variation between o-grade and e-grade, while hysterodynamic nouns have zero-grade root with a mobile accent that varies between suffix and ending, with corresponding ablaut variations in the suffix.

The accent is best preserved in Vedic Sanskrit and (in the case of nouns) Ancient Greek. It is also reflected to some extent in the accentual patterns of the Balto-Slavic languages (e.g. Latvian, Lithuanian and Serbo-Croatian). It is indirectly attested in a number of phenomena in other Indo-European languages, especially the Verner's law variations in the Germanic languages. In other languages (e.g. the Italic languages and Celtic languages) it was lost without a trace. Other than in Modern Greek, the Balto-Slavic languages and (to some extent) Icelandic, few traces of the PIE accent remain in any modern languages.

Phonological rules

A number of phonological rules can be reconstructed for Proto-Indo-European. The validity of some of them for "PIE proper" is disputed; these are claimed to be later innovations in some of the daughter branches. Some of these laws are:

  1. Bartholomae's law: TʰT > TTʰ
    Passive participle of *bʰewdʰ- 'to learn, become aware of': *bʰudʰ-to- > *bʰud-dʰo- > (Grassmann's law) Sanskrit buddhá.
    The law has been preserved in the Indo-Iranian branch, where it operates as a synchronic rule. There are some traces of it in Ancient Greek and Germanic, and possibly in Latin.
  2. Dental assibilation: TT > TsT (a sequence of two dental stops had dental fricative */s/ inserted between them)
    *h₁ed-ti 'eats' > *h₁etsti > Hittite ezzi.
    This has been preserved in Hittite, where the cluster *tst is spelled as z (pronounced as [ts]). The cluster was often simplified to -ss- in the later descendants (Latin and Germanic among others).
  3. TK > KT > "Kþ" ("thorn clusters"): Dental stops that stood before PIE dorsals in the same syllable metathesized in all branches except Tocharian and Anatolian (the earliest ones to split from the PIE matrix). Subsequent outcomes were varied.
    *h₂ŕ̥tḱos 'bear' > *h₂ŕ̥ḱþos > Latin ursus, Ancient Greek árktos, Sanskrit ṛ́kṣas but Hittite ḫartaggas /ḫartkas/ without metathesis.
    *dʰgʷʰítis 'decaying, decline, ruin' > *gʷʰþítis > Ancient Greek phthísis, Sanskrit kṣítis, perhaps Latin sitis.
  4. Siebs' law: If s-mobile is added to a root that starts with a voiced or aspirated stop, that stop is devoiced.
    *bʰr̥Hg- > Latin fragor, but *sbʰr̥Hg- > *spr̥Hg- > Sanskrit sphūrjati
  5. Stang's law: *Vwm > *Vːm; i.e. */w/ disappears before word-final */m/ in the last syllable, and the preceding vowel lengthens. Some also add the rules *Vmm > *Vːm and *Vh₂m > *Vːm, as well as *Vyi > *Vːy.
    *dyéwm 'sky' (accusative singular) > *dyḗm > Sanskrit dyā́m, acc. sg. of dyaús
    *gʷowm 'cattle' (acc. sg.) > *gʷōm > Sanskrit gā́m, acc. sg. of gaús
    The accusative singular of *dom- 'house' is *dṓm, not **dómm̥.
  6. Szemerényi's law: -VRs > -VːR, -VRh₂ > -VːR; i.e. in word-final sequences of vowel, sonorant and */s/ or */h₂/, the fricative or laryngeal was dropped and the preceding vowel lengthened. This affected the nominative singulars of numerous masculine and feminine nouns, as well as the nominative-accusative of neuter collectives.
    *ph₂tér-s 'father' > *ph₂tḗr > Ancient Greek patḗr, Sanskrit pitā́
  7. Laryngeal deletion rules: See below.
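
Those rules above that have a simple string formulation can be sketched as rewrite rules. The following Python sketch is illustrative only and is not taken from the literature: it covers dental assibilation, Stang's law and Szemerényi's law, omits accent marks and morpheme boundaries, takes the sonorants R to be r, l, m, n, and assumes (on the basis of the example *h₁ed-ti > *h₁etsti) that a voiced dental is devoiced before the inserted *s. All function names are hypothetical.

```python
import re
import unicodedata


def lengthen(vowel: str) -> str:
    """Return the vowel with a macron (e.g. e -> long e) as one precomposed character."""
    return unicodedata.normalize("NFC", vowel + "\u0304")


def dental_assibilation(form: str) -> str:
    """TT > TsT: insert *s between two adjacent dental stops."""
    # First dental is t, d or dʰ; the lookahead checks for a second dental.
    form = re.sub(r"(t|dʰ|d)(?=[td])", r"\1s", form)
    # Assumption: a voiced dental devoices before the inserted *s.
    return re.sub(r"dʰ?(?=s[td])", "t", form)


def stang(form: str) -> str:
    """*Vwm > *Vːm word-finally: drop *w and lengthen the vowel."""
    return re.sub(r"([aeo])wm$", lambda m: lengthen(m.group(1)) + "m", form)


def szemerenyi(form: str) -> str:
    """-VRs > -VːR and -VRh₂ > -VːR word-finally (R = r, l, m, n here)."""
    return re.sub(
        r"([aeo])([rlmn])(?:s|h₂)$",
        lambda m: lengthen(m.group(1)) + m.group(2),
        form,
    )


# Unaccented versions of the examples given in the list above.
examples = [
    (dental_assibilation, "h₁edti"),
    (stang, "dyewm"),
    (stang, "gʷowm"),
    (szemerenyi, "ph₂ters"),
]
for rule, form in examples:
    print(f"{rule.__name__}: *{form} > *{rule(form)}")
```

Each function models only the environment stated in the corresponding rule; forms that do not end in the relevant sequence are returned unchanged.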

Thorn clusters

A problem in the reconstruction of PIE concerns some cognate sets in which Indo-Iranian sibilants in clusters with dorsals exceptionally correspond to coronal stops in certain other branches. 'Bear' and 'decaying' above are examples; others are Sanskrit tákṣan 'artisan' vs. Greek téktōn 'carpenter', and Sanskrit kṣā́ḥ vs. Greek khthṓn, both 'earth'. As was the case with the laryngeal theory, these cognate sets were first noted before Anatolian and Tocharian were connected to PIE, and early reconstructions posited a new series of consonants to explain these correspondences. Brugmann's systematic explanation (1897) augmented the PIE consonant system with a series of interdentals (nowhere directly attested) appearing only in clusters with dorsals, *kþ *kʰþʰ *gð *gʰðʰ. The use of the letter thorn led to the name "thorn cluster" for these groups.

Anatolian and Tocharian evidence suggests that the original form of the thorn clusters was in fact *TK: Hittite has tēkan, tagnās, dagān and Tocharian A tkaṃ, tkan- for case-forms of 'earth', so that the development outside Anatolian and Tocharian involved a metathesis. The conventional notations *þ *ðʰ for the second elements of these metathesised clusters are still found, and some, including Fortson,[10] continue to hold to the view that interdental fricatives were involved at some stage of PIE.

An alternative interpretation (e.g. Vennemann 1989, Schindler 1991 (informally and unpublished)[11]) identifies these segments as alveolar affricates [t͡s d͡z]. In this view, thorn clusters developed as TK > TsK > KTs and then variously in daughter languages; this has the advantage that the first change can be identified with the dental assibilation rule above, which is then broadened in application to affrication of dental stops before any stops. Melchert has interpreted Cuneiform Luvian īnzagan- 'inhumation', probably [ind͡zgan], from *en dʰǵʰōm 'in the earth', as preserving the intermediate stage of this process.[10]
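
The chain TK > TsK > KTs proposed under this interpretation can be written out mechanically. The short Python sketch below is only an illustration of the ordering of the two steps (affrication, then metathesis). The function name is hypothetical, segments are treated as plain strings, and the choice of "z" after a voiced dental is an assumption based on the affricates [t͡s d͡z] mentioned above.

```python
def thorn_cluster_stages(dental: str, dorsal: str) -> list[str]:
    """Return the stages TK > TsK > KTs for one dental + dorsal cluster."""
    # Assumption: voiced dentals (d, dʰ) yield a voiced sibilant element.
    sibilant = "z" if dental.startswith("d") else "s"
    original = dental + dorsal                  # *TK, the order kept in Anatolian and Tocharian
    affricated = dental + sibilant + dorsal     # *TsK: the dental is affricated before the stop
    metathesized = dorsal + dental + sibilant   # *KTs: the affricate and the dorsal change places
    return [original, affricated, metathesized]


print(" > ".join(thorn_cluster_stages("t", "k")))  # tk > tsk > kts
print(" > ".join(thorn_cluster_stages("d", "g")))  # dg > dzg > gdz
```

The later, branch-specific outcomes of the final stage are deliberately left out, since they differ from one daughter language to another.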

Laryngeal deletion rules

Once the laryngeal theory was developed, and the rules for sound change of laryngeals worked out, it was clear that there were a number of exceptions to the rules, in particular with regard to "syllabic" laryngeals (former "schwa indogermanicum") that occurred in non-initial syllables. It was long suggested that such syllabic laryngeals were simply deleted in certain of the daughters; this is based especially on the PIE word *dhugh₂tér- "daughter", which appears in a number of branches (e.g. Germanic, Balto-Slavic) with no vowel in place of the expected /a/ for "syllabic" /h₂/ (cf. English "daughter", Gothic daúhtar). With a better understanding of the role of ablaut, however, and a clearer understanding of which roots did and did not have laryngeals in them, it became clear that this suggestion cannot be correct. In particular, there are some cases where syllabic laryngeals in medial syllables delete in most or all daughter languages, and other cases where they do not delete even in Germanic and/or Balto-Slavic.

This has led to the more recent idea that PIE had a number of synchronic "laryngeal deletion" rules, where syllabic laryngeals in certain contexts were deleted even in the protolanguage. In the case of *dhugh₂tér-, for example, it appears that PIE had an alternation between a "strong" stem *dhugh₂tér- and a "weak" stem *dhugtr-, where a deletion rule eliminated the laryngeal in the latter context but not the former one. Forms in daughter languages with or without the laryngeal are due to analogical generalization of one or the other protoform.

This is a new area, and as a result there is no consensus on the number and nature of the deletion rules. A wide variety of rules have been proposed; Ringe (2006) identifies the following three as the most likely candidates (where C=any consonant, V=any vowel, H=any laryngeal, R=any resonant):

  1. A laryngeal in the sequence oRHC was dropped. Example: *tórmos "borehole" from *terh₁- "bore" (cf. Gk tórmos "socket", OE þearm "intestine"). This seems to have operated particularly in the thematic optative suffix -oy-h₁-, which was reduced to -oy- in most forms.
  2. A laryngeal in the sequence VCHy was dropped. Examples: *wérye- "say" (present tense) from *werh₁- (cf. Homeric Greek eírei "(he) says", not *eréei); *h₂érye- "plow" (present tense) from *h₂erh₃- "plow" (cf. Lith. ãria "(he) plows", not *ária).
  3. A laryngeal in the sequence CH-CC was dropped, where a syllable boundary follows the laryngeal (i.e. the following two consonants are capable of occurring at the start of a word, as in tr- but not rt-). An example is the weak stem *dhugtr- given above, compared to the strong stem *dhugh₂tér-.
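
The three rules above can be approximated with regular expressions. The Python sketch below is a simplification made for illustration, not Ringe's own formalization: accents are omitted, aspiration is written with ʰ, the consonant inventory is abbreviated, and the set of possible word-initial clusters used for rule 3 (`ONSETS`) is a small hypothetical sample rather than a complete list.

```python
import re

H = r"h[₁₂₃]"                  # any laryngeal
R = r"[rlmnyw]"                # any resonant
V = r"[aeiou]"                 # any vowel (accents omitted)
C = r"[bdgkptsrlmnyw]ʰ?"       # abbreviated consonant inventory (assumption)
ONSETS = r"(?:tr|pr|kr)"       # sample of clusters that can begin a word (assumption)


def delete_laryngeals(form: str) -> str:
    """Apply the three candidate deletion rules in the order listed."""
    # Rule 1: oRHC > oRC
    form = re.sub(rf"(o{R}){H}(?={C})", r"\1", form)
    # Rule 2: VCHy > VCy
    form = re.sub(rf"({V}{C}){H}(?=y)", r"\1", form)
    # Rule 3: CH-CC > C-CC, where the following CC is a possible onset
    form = re.sub(rf"({C}){H}(?={ONSETS})", r"\1", form)
    return form


for form in ["torh₁mos", "werh₁ye", "h₂erh₃ye", "dʰugh₂tr", "dʰugh₂ter"]:
    print(f"*{form} > *{delete_laryngeals(form)}")
```

In this sketch the weak stem loses its laryngeal because -tr- can begin a word, while the strong stem keeps it because the laryngeal is followed by a single consonant and a vowel.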

It seems unlikely that this is a correct and complete description of the actual phonological rules underlying laryngeal deletion. These rules do not account for all the potential cases of laryngeal deletion (hence the many other rules that have been proposed); for example, the laryngeal in the desiderative suffixes -h₁s- and -h₁sy- appears to delete after an obstruent but not a resonant. In any case it is difficult to determine when a particular laryngeal loss is due to a protolanguage rule vs. an instance of later analogy. In addition, as synchronic phonological rules the set of above rules is more complex than what is expected from a cross-linguistic standpoint, suggesting that some of the rules may have already been "morphologized" (incorporated into the morphology of certain constructions, such as the o-grade noun-forming rule or the rule forming y-presents); the above-mentioned laryngeal deletion in the desiderative suffixes may be an example of such morphologization.



Phonetic correspondences in daughter languages

The correlations among the Indo-European languages are for the most part straightforward, but there are some complications with the velar consonants. The languages divide into two groups, known as the Centum languages and Satem languages, based on the respective words for "hundred" in representative languages of each group (Latin and Avestan, respectively). Each group merges the PIE "plain velars" with one of the other two series, the Centum group merging "palatals" and "plain velars" while the Satem group merges labiovelars and "plain velars", removing the labialization in the process. The Satem group furthermore converts the "palatal" series into sibilant-type sounds. The following table summarizes the outcomes in the various daughters:

PIE        *ḱ           *ǵ           *ǵʰ          *kʷ           *gʷ            *gʷʰ
Celtic     k            g            g            kw, p[* 1]    b              gw
Italic     k            g            g, h[* 2]    kw, p[* 3]    gw, v, b[* 3]  f, v
Venetic    k            g            h            kw            ?              ?
Hellenic   k            g            kh           p, t, k[* 4]  b, d, g[* 4]   ph, kh, th[* 4]
Albanian   s,[* 5] (k)  z,[* 5] (g)  z,[* 5] (d)  k, s          g, z           g, z
Illyrian   s[* 5]       z[* 5]       z[* 5]       ?             ?              ?
Thracian   s[* 5]       z[* 5]       z[* 5]       k, kh         g, k           g
Armenian   s            c            dz           kh            k              g
Phrygian   k[* 6]       g[* 6]       g, k         k             b              g
Germanic   h            k            g ~ ɣ[* 7]   hw            kw             gw[* 8] ~ w[* 7]
Slavic     s            z            z            k             g              g
Baltic     š            ž            ž            k             g              g
Indic      ç[* 9]       j            h[* 10]      k, č          g              gh
Iranian    s            z            z[* 10]      k, č          g              g
Anatolian  k[* 11]      g[* 12]      g[* 12]      kw            gw[* 13]       gw[* 13]
Tocharian  k            k            k            k, kw         k, kw          k, kw
  1. Within Celtic, the "p-Celtic" and "q-Celtic" branches have different reflexes of PIE *kʷ: *ekwos → ekwos, epos. The Brythonic and Lepontic languages are p-Celtic; Goidelic and Celtiberian are q-Celtic; while different dialects of Gaulish had different realizations.
  2. PIE *ǵʰ → Latin /h/ or /g/, depending on its position in the word, and → Osco-Umbrian *kh → /h/.
  3. PIE *kʷ and *gʷ developed differently in the two Italic subgroups: /kw/, /gw/ in Latin (*kwis → kwis), and /p/, /b/ in Osco-Umbrian (*kwis → pis).
  4. PIE *kʷ, *gʷ, *gʷʰ have three reflexes in Greek dialects such as Attic and Doric:
    /t, d, th/ before /e, é, i/ (IE *kʷis → Greek tis)
    /k, g, kh/ before /u/ (IE *wĺ̥kʷos → Greek lukos)
    /p, b, ph/ before /a, o/ (IE *sekʷ- → Greek hep-)
    However, in Mycenaean Greek *kʷ remained, and in Aeolian it leveled to /p/.
  5. PIE *ḱ, *ǵ, *ǵʰ were originally reflected in Balkan languages as spirants /þ/ and /ð/, later turning into /s/, /z/ in Albanian.
  6. The Phrygian evidence is limited and often ambiguous, so the issue of kentum vs. satem reflexes is not completely settled; but the prevailing opinion is that Phrygian shows kentum reflexes along with secondary palatalisation of /k/ to /ts/ and /g/ to /dz/ before front vowels, as in most Romance languages; see Phrygian language#Phonology.
  7. Proto-Germanic reflexes of Indo-European voiced stops had spirant allophones, retained in intervocalic position in Gothic.
  8. This is the reflex after nasals. The outcome in other positions is disputed and may vary according to phonetic environment. See the note in Grimm's law#In detail.
  9. In Indic languages, ç represents a palatalized š sound.
  10. PIE *ǵʰ → Proto-Indo-Iranian *džjh → Indic /h/, Iranian /z/.
  11. In Luvic languages, yielding ts, at least under most circumstances.
  12. In Luvic languages, usually becoming *y initially, lost intervocalically.
  13. In Luvic languages, yielding simple w.
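
The merger pattern described before the table can be stated compactly as two mappings from the three PIE dorsal series to their outcome types. The Python sketch below is a schematic illustration of that description only. The names `Series`, `centum_outcome` and `satem_outcome` are hypothetical, and the outcome labels are coarse categories rather than the reflexes of any particular language.

```python
from enum import Enum


class Series(Enum):
    PALATAL = "palatal"
    PLAIN = "plain velar"
    LABIOVELAR = "labiovelar"


def centum_outcome(series: Series) -> str:
    """Centum type: palatals merge with plain velars; labiovelars stay distinct."""
    if series is Series.LABIOVELAR:
        return "labiovelar"
    return "velar"


def satem_outcome(series: Series) -> str:
    """Satem type: labiovelars merge with plain velars (losing labialization);
    palatals become sibilant-type sounds."""
    if series is Series.PALATAL:
        return "sibilant"
    return "velar"


for s in Series:
    print(f"{s.value:12} centum: {centum_outcome(s):11} satem: {satem_outcome(s)}")
```

Either way three series are reduced to two; the groups differ in which series is absorbed by the plain velars.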

In p-Celtic, Osco-Umbrian, and Aeolian Greek, *kʷ > /p/. This may be due to contact, perhaps in the Balkan region in the second millennium BC. The same /p/ also occurs in Hittite in a few pronominal forms (pippid "something, someone", cf. Latin quisquid).


  1. e.g. Szemerényi (1995), Sihler (1995)
  2. Craig Melchert (1987). "PIE velars in Luvian". Studies in Memory of Warren Cowgill. pp. 182–204. Retrieved 2008-10-27.
  3. Craig Melchert (to appear). "The Position of Anatolian". Handbook of Indo-European Studies. pp. 18–22. Retrieved 2010-06-29.
  4. Holger Pedersen, KZ 36 (1900) 277–340; Norbert Jokl, in: Mélanges linguistiques offerts à M. Holger Pedersen (1937) 127–161.
  5. Vittore Pisani, Ricerche Linguistiche 1 (1950) 165ff.
  6. Meier-Brügger, Michael (2003). Indo-European Linguistics. p. 107. ISBN 3-11-017433-2.
  7. Beekes 1995:139
  8. Alexander Lubotsky. "Against a Proto-Indo-European phoneme *a".
  9. Mayrhofer 1986: 170 ff.
  10. Fortson 2009:65
  11. Ringe 2009:9


External links