Proto-Indo-European language


Proto-Indo-European (PIE) is the linguistic reconstruction of the common ancestor of the Indo-European languages. PIE was the first proposed proto-language to be widely accepted by linguists. Far more work has gone into reconstructing it than any other proto-language, and it is by far the best understood of all proto-languages of its age. During the 19th century, the vast majority of linguistic work was devoted to reconstruction of Proto-Indo-European or its daughter proto-languages such as Proto-Germanic, and most of the current techniques of linguistic reconstruction in historical linguistics (e.g. the comparative method and the method of internal reconstruction) were developed as a result. These methods supply all of the knowledge concerning PIE, since there is no written record of the language.

Scholars estimate that PIE may have been spoken as a single language (before divergence began) around 3500 BCE, though estimates by different authorities can vary by more than a millennium. A number of hypotheses have been proposed for the origin and spread of the language, the most popular among linguists being the Kurgan hypothesis, which postulates an origin in the Pontic–Caspian steppe of Eastern Europe. Features of the culture of the speakers of PIE, known as Proto-Indo-Europeans, have also been reconstructed based on the shared Indo-European vocabulary of the early Indo-European attested languages.

PIE is thought to have had a complex system of morphology that included inflectional suffixes as well as ablaut (vowel alterations, as preserved in English sing, sang, sung). Nouns and verbs had complex systems of declension and conjugation respectively.

In the descriptions below, an asterisk is used to mark reconstructed PIE words, such as *wódr̥ 'water', *ḱwṓ 'dog' (English hound), or *tréyes 'three (masculine)'.

Development of the theory

Classification of Indo-European languages. Red: Extinct languages. White: categories or unattested proto-languages. Left half: centum languages; right half: satem languages


There is no direct evidence of PIE, and no evidence suggesting it was ever written. Historical linguistics and Indo-European sound laws have been used to reconstruct all PIE sounds and words from later Indo-European languages using the comparative method[1] and internal reconstruction. Many of the words in the modern Indo-European languages seem to have derived from such "protowords" via regular sound changes (e.g., Grimm's law).

As the Proto-Indo-European language broke up, its sound system diverged, according to various sound changes in the daughter languages. Notable among these are Grimm's law and Verner's law in Proto-Germanic, loss of prevocalic *p- in Proto-Celtic, reduction to h of prevocalic *s- in Proto-Greek, Brugmann's law and Bartholomae's law in Proto-Indo-Iranian, Grassmann's law independently in both Proto-Greek and Proto-Indo-Iranian, and Winter's law and Hirt's law in Balto-Slavic.
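
As a rough illustration of what a regular sound change looks like, the consonant shifts of Grimm's law can be written as one simultaneous substitution. The sketch below is a simplification, not a full model: the function name apply_grimm and the plain-text input forms are assumptions made for illustration, palatovelars and labiovelars are left out, and conditioning environments and later changes such as Verner's law are ignored.

```python
import re

# Grimm's law as three simultaneous shifts (simplified illustration).
GRIMM = {
    "bʰ": "b", "dʰ": "d", "gʰ": "g",  # voiced aspirates -> plain voiced stops
    "b": "p", "d": "t", "g": "k",     # plain voiced stops -> voiceless stops
    "p": "f", "t": "þ", "k": "h",     # voiceless stops -> fricatives
}

# Longest keys first, so that "bʰ" is matched before plain "b".
_PATTERN = re.compile(
    "|".join(sorted(map(re.escape, GRIMM), key=len, reverse=True))
)


def apply_grimm(form: str) -> str:
    """Apply all shifts in a single pass, so outputs are never shifted again."""
    return _PATTERN.sub(lambda m: GRIMM[m.group(0)], form)


for pie in ["ped", "treyes", "dekm", "bʰer"]:
    print(pie, "->", apply_grimm(pie))
```

Under these assumptions, apply_grimm("ped") gives "fet" (compare English foot) and apply_grimm("treyes") gives "þreyes" (compare English three). The single pass matters: applying the shifts one after another would wrongly turn *bʰ into b and then into p.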


The idea of a common root language was first postulated in the 18th century, when Sir William Jones observed similarities between Sanskrit, Ancient Greek, and Latin. In The Sanscrit Language (1786) he suggested that all three languages had a common root, and that they might in turn be related to Gothic and the Celtic languages, as well as to Persian. This common source came to be known as Proto-Indo-European.

The classical phase of Indo-European comparative linguistics runs from Franz Bopp's Comparative Grammar (1833) to August Schleicher's 1861 Compendium and up to Karl Brugmann's Grundriß der vergleichenden Grammatik der indogermanischen Sprachen, published in the 1880s. Brugmann's Neogrammarian re-evaluation of the field and Ferdinand de Saussure's development of the laryngeal theory may be considered the beginning of "contemporary" Indo-European studies.

By the early 20th century, well-defined descriptions of PIE had been developed that are still accepted today, with some refinements. The largest developments of the 20th century were the discovery of the Anatolian and Tocharian languages and the acceptance of the laryngeal theory. This theory, in its early forms discussed since the 1880s, became mainstream after Jerzy Kuryłowicz's 1927 discovery of the survival of at least some of these hypothetical phonemes in Anatolian. The Anatolian languages have also spurred a major re-evaluation of theories concerning the development of various shared Indo-European language features and the extent to which these features were present in PIE itself. Relationships to other language families, including the Uralic languages, have been proposed but remain controversial.

Julius Pokorny's magisterial Indogermanisches etymologisches Wörterbuch ("Indo-European Etymological Dictionary", 1959) gave a detailed overview of the lexical knowledge accumulated up until that time, but neglected contemporary trends of morphology and phonology (including the laryngeal theory), and largely ignored Anatolian. The generation of Indo-Europeanists active in the last third of the 20th century (such as Calvert Watkins, Jochem Schindler and Helmut Rix) developed a better understanding of morphology and, in the wake of Kuryłowicz's 1956 Apophonie, understanding of the ablaut. From the 1960s, knowledge of Anatolian became certain enough to establish its relationship to PIE; see also Indo-Hittite.

Historical and geographical setting

There are several competing hypotheses about when, where, and by whom PIE was spoken. In the most popular[2] model, the Kurgan hypothesis, the original speakers of PIE were the people of the Kurgan culture of the Pontic–Caspian steppe north of the Black Sea.[3][4] This theory was first put forward by Marija Gimbutas and is accepted by most linguists.[5] According to the theory, PIE became widespread because cultural advances, including the domestication of the horse, herding, and the use of wheeled vehicles, allowed its speakers to migrate into a large area of Europe and Asia.

The people of these cultures were nomadic pastoralists who, according to the model, had expanded throughout the Pontic–Caspian steppe and into Eastern Europe by the early 3rd millennium BC.[6]

Alternative theories include the Anatolian hypothesis,[7] the Armenia hypothesis, and the indigenous Aryans theory.

Mainstream linguistic estimates of the time between PIE and the earliest attested texts (c. nineteenth century BC; see Kültepe texts) range around 1,500 to 2,500 years, with extreme proposals diverging up to another 100% on either side. Historically, some proposed models postulate the major dispersion of branches in:

Renfrew's archaeological hypothesis assumes the Proto-Indo-Europeans brought agriculture to Europe long before the domestication of the horse, and is not accepted by most linguists.[5] The Out-of-India and Northern-European hypotheses are fringe theories past their vogue.


Proto-Indo-European phonology has been reconstructed to a large extent. Some uncertainties still remain, such as the exact nature of the three series of stops, and the exact number and distribution of the vowels. The Proto-Indo-European accent is usually reconstructed today as having had variable lexical stress, which could appear on any syllable and whose position often varied among different members of a paradigm (e.g. between singular and plural of a verbal paradigm, or between nominative/accusative and oblique cases of a nominal paradigm). Stressed syllables received a higher pitch, so it is often said that PIE had pitch accent; this is not to be confused with the other meaning of the term "pitch accent", which refers to one or two syllables per word having one of at least two tones (while the tones of any other syllables are predictable). The location of the stress ("the accent") is closely associated with ablaut variations, especially between normal-grade vowels (*/e/ and */o/) and zero-grade (i.e. lack of a vowel), but is not entirely predictable from them. The accent is best preserved in Vedic Sanskrit and (in the case of nouns) Ancient Greek, and is indirectly attested in a number of phenomena in other IE languages.

To account for mismatches between the accent of Vedic Sanskrit and Ancient Greek, as well as a few other phenomena, a few historical linguists prefer to reconstruct PIE as a tone language where each morpheme had an inherent tone; the sequence of tones in a word then evolved, according to that hypothesis, into the placement of lexical stress in different ways in different IE branches.



PIE was a fusional language, in which the grammatical relationships between words were signaled through inflectional morphemes (usually endings). The roots of PIE are basic morphemes carrying a lexical meaning. By addition of suffixes they form word stems, and by addition of desinences (usually endings), these form grammatically inflected words (nouns or verbs). PIE roots are understood to be predominantly monosyllabic with a basic shape CVC(C). This basic root shape is often altered by Indo-European ablaut. Roots which appear to be vowel-initial are believed by many scholars to have originally begun with one of a set of consonants called laryngeals, later lost in all but the Anatolian languages (written with a subscript number as *h₁, *h₂, *h₃, or as *H if ambiguous). Thus a verb form such as the one reflected in Latin agunt, Greek ἄγουσι (ágousi), Sanskrit ajanti would be reconstructed as *h₂eǵ-onti, with the element *h₂eǵ- constituting the root per se.



An important component of PIE morphophonology is the variation in vowels commonly termed ablaut, which occurred both within inflectional morphology (different grammatical forms of a noun or verb) and derivational morphology (between, for example, a verb and an associated verbal noun). Ablaut in PIE was closely associated with the position of the accent; for example, the alternation found in Latin est, sunt reflects PIE *h₁és-ti, *h₁s-ónti. However, it is not possible to derive either one directly from the other. The primary ablaut variation was between normal grade or full grade (*/e/ and */o/), lengthened grade (*/ē/ and */ō/), and zero grade (lack of a vowel, which affected nearby sonorant consonants such as l, m, n and r). The normal grade is often characterized as e-grade or o-grade depending on the particular vowel involved. Ablaut occurred both in the root and the ending. Often the zero grade appears where the word's accent has shifted from the root to one of the affixes.

Originally, all categories were distinguished both by ablaut and different endings, but the loss of endings in some later Indo-European languages has led them to use ablaut alone to distinguish grammatical categories, as in the Modern English words sing, sang, sung, originally reflecting a pre-Proto-Germanic sequence *sengw-, *songw-, *sngw-.
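
The grades described above can be illustrated with a small sketch that treats a root as a consonant frame around a single alternating vowel slot. The function name ablaut_grades and the onset/coda split are assumptions made for illustration; accent placement and the syllabification of sonorants in the zero grade are not modelled.

```python
def ablaut_grades(onset: str, coda: str) -> dict:
    """Return the ablaut grades of a root written as onset + vowel + coda."""
    return {
        "e-grade": onset + "e" + coda,
        "o-grade": onset + "o" + coda,
        "zero grade": onset + coda,  # no vowel at all
        "lengthened e-grade": onset + "ē" + coda,
        "lengthened o-grade": onset + "ō" + coda,
    }


# The consonant frame s_ngw behind English sing, sang, sung.
for grade, form in ablaut_grades("s", "ngw").items():
    print(f"{grade}: *{form}-")
```

For the frame s_ngw this yields *sengw-, *songw- and *sngw- for the e-grade, o-grade and zero grade, matching the pre-Proto-Germanic sequence cited above.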


Proto-Indo-European nouns were declined for eight or nine cases (nominative, accusative, genitive, dative, instrumental, ablative, locative, vocative, and possibly a directive or allative).[9] There were three genders: masculine, feminine, and neuter.

There are two major types of declension, thematic and athematic. Thematic nominal stems are formed with a suffixed vowel *-o- (in vocative *-e) and the stem does not undergo ablaut. The athematic stems are more archaic, and they are classified further by their ablaut behaviour (acrostatic, proterokinetic, hysterokinetic and amphikinetic, after the positioning of the early PIE accent in the paradigm).


PIE pronouns are difficult to reconstruct owing to their variety in later languages. This is especially the case for demonstrative pronouns. PIE had personal pronouns in the first and second person, but not the third person, where demonstratives were used instead. The personal pronouns had their own unique forms and endings, and some had two distinct stems; this is most obvious in the first person singular, where the two stems are still preserved in English I and me. According to Beekes,[10] there were also two varieties for the accusative, genitive and dative cases, a stressed and an enclitic form.

Personal pronouns[10]

Case | First person singular | First person plural | Second person singular | Second person plural
Nominative | *h₁eǵ(oH/Hom) | *wei | *tuH | *yuH
Accusative | *h₁mé, *h₁me | *nsmé, *nōs | *twé | *usmé, *wōs
Genitive | *h₁méne, *h₁moi | *ns(er)o-, *nos | *tewe, *toi | *yus(er)o-, *wos
Dative | *h₁méǵʰio, *h₁moi | *nsmei, *ns | *tébʰio, *toi | *usmei
Instrumental | *h₁moí | *nsmoí | *toí | *usmoí
Ablative | *h₁med | *nsmed | *tued | *usmed
Locative | *h₁moí | *nsmi | *toí | *usmi

As for demonstratives, Beekes tentatively reconstructs a system with only two pronouns: *so / *seh₂ / *tod "this, that" and *h₁e / *(h₁)ih₂ / *(h₁)id "the (just named)" (anaphoric). He also postulates three adverbial particles *ḱi "here", *h₂en "there" and *h₂eu "away, again", from which demonstratives were constructed in various later languages.[10]


The Indo-European verb system is complex and, like the noun, exhibits a system of ablaut. The most basic categorization for the Indo-European verb was grammatical aspect. Verbs were classed as stative (verbs that depict a state of being), imperfective (verbs depicting ongoing, habitual or repeated action) or perfective (verbs depicting a completed action or actions viewed as an entire process). Verbs have at least four moods (indicative, imperative, subjunctive and optative, as well as possibly the injunctive, reconstructible from Vedic Sanskrit), two voices (active and mediopassive), as well as three persons (first, second and third) and three numbers (singular, dual and plural). Verbs were also marked by a highly developed system of participles, one for each combination of tense and voice, and an assorted array of verbal nouns and adjectival formations.

The following table shows two possible reconstructions of the PIE verb endings. Sihler's reconstruction largely represents the current consensus among Indo-Europeanists, while Beekes' is a radical rethinking of thematic verbs; although not widely accepted, it is included to show an example of more far-reaching recent research.

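Number | Person | Sihler (1995)[11] athematic | Sihler (1995)[11] thematic | Beekes (1995)[10] athematic | Beekes (1995)[10] thematic
Singular | 1st | *-mi | *-oh₂ | *-mi | *-oH
Singular | 2nd | *-si | *-esi | *-si | *-eh₁i
Singular | 3rd | *-ti | *-eti/-ei | *-ti | *-e
Dual | 1st | *-wos | *-owos | *-ues | *-oues
Dual | 2nd | *-th₁es | *-eth₁es | *-tHes/-tHos | *-etHes/-etHos
Dual | 3rd | *-tes | *-etes | *-tes | *-etes
Plural | 1st | *-mos | *-omos | *-mes | *-omom
Plural | 2nd | *-te | *-ete | *-th₁e | *-eth₁e
Plural | 3rd | *-nti | *-onti | *-nti | *-o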
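
To show how the endings in the table attach to a stem, the following sketch joins Sihler's thematic endings to a bare verb root. The root *bʰer- 'carry' is not taken from the table and is used only as a commonly cited example; the function and variable names are likewise assumptions. Where the table lists variants (3rd singular), only the first is used, and accent and ablaut are ignored.

```python
# Sihler's thematic endings as given in the table (asterisks and leading
# hyphens dropped; the 3rd singular uses the first listed variant).
SIHLER_THEMATIC = {
    ("singular", "1st"): "oh₂",
    ("singular", "2nd"): "esi",
    ("singular", "3rd"): "eti",
    ("dual", "1st"): "owos",
    ("dual", "2nd"): "eth₁es",
    ("dual", "3rd"): "etes",
    ("plural", "1st"): "omos",
    ("plural", "2nd"): "ete",
    ("plural", "3rd"): "onti",
}


def conjugate(root: str, endings: dict = SIHLER_THEMATIC) -> dict:
    """Return reconstructed forms as '*root-ending' strings, one per slot."""
    return {slot: f"*{root}-{ending}" for slot, ending in endings.items()}


# Illustrative root only: *bʰer- 'carry' is an assumption, not from the table.
forms = conjugate("bʰer")
for (number, person), form in forms.items():
    print(f"{person} {number}: {form}")
```

Swapping in a different dictionary of endings, such as Beekes' thematic column, would show how the two reconstructions differ form by form.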


The Proto-Indo-European numerals are generally reconstructed as follows:

Numeral | Sihler[11] | Beekes[10]
one | *Hoi-no-/*Hoi-wo-/*Hoi-k(ʷ)o-; *sem- | *Hoi(H)nos
two | *d(u)wo- | *duoh₁
three | *trei- (full grade) / *tri- (zero grade) | *treies
four | *kʷetwor- (o-grade) / *kʷetur- (zero grade) (see also the kʷetwóres rule) |
five | *penkʷe | *penkʷe
six | *s(w)eḱs; originally perhaps *weḱs | *(s)uéks
seven | *septm̥ | *séptm
eight | *oḱtō, *oḱtou or *h₃eḱtō, *h₃eḱtou | *h₃eḱteh₃
nine | *(h₁)newn̥ | *(h₁)néun
ten | *deḱm̥(t) | *déḱmt
twenty | *wīḱm̥t-; originally perhaps *widḱomt- | *duidḱmti
thirty | *trīḱomt-; originally perhaps *tridḱomt- | *trih₂dḱomth₂
forty | *kʷetwr̥̄ḱomt-; originally perhaps *kʷetwr̥dḱomt- | *kʷeturdḱomth₂
fifty | *penkʷēḱomt-; originally perhaps *penkʷedḱomt- | *penkʷedḱomth₂
sixty | *s(w)eḱsḱomt-; originally perhaps *weḱsdḱomt- | *ueksdḱomth₂
seventy | *septm̥̄ḱomt-; originally perhaps *septm̥dḱomt- | *septmdḱomth₂
eighty | *oḱtō(u)ḱomt-; originally perhaps *h₃eḱto(u)dḱomt- | *h₃eḱth₃dḱomth₂
ninety | *(h₁)newn̥̄ḱomt-; originally perhaps *h₁newn̥dḱomt- | *h₁neundḱomth₂
hundred | *ḱm̥tom; originally perhaps *dḱm̥tom | *dḱmtóm
thousand | *ǵheslo-; *tusdḱomti | *ǵʰes-l-

Lehmann[12] believes that the numbers greater than ten were constructed separately in the dialect groups and that *ḱm̥tóm originally meant "a large number" rather than specifically "one hundred".


Many particles could be used both as adverbs and postpositions, like *upo "under, below". The postpositions became prepositions in most daughter languages. Other reconstructible particles include negators (*ne, *mē), conjunctions (*kʷe "and", *wē "or" and others) and an interjection (*wai!, an expression of woe or agony).


The syntax of the older Indo-European languages has been studied in earnest since at least the late nineteenth century, by such scholars as Hermann Hirt and Berthold Delbrück. In the second half of the twentieth century, interest in the topic increased and led to reconstructions of Proto-Indo-European syntax.[13]

Since all the early attested IE languages were inflectional, PIE is thought to have relied largely on morphological markers, rather than word order, to signal syntactic relationships within sentences.[14] Still, a default (unmarked) word order is thought to have existed in PIE. This was reconstructed by Jacob Wackernagel as being subject–verb–object (SVO), based on evidence in Vedic Sanskrit, and the SVO hypothesis still has some adherents, but as of 2015 the "broad consensus" among PIE scholars is that PIE would have been a subject–object–verb (SOV) language.[15]

The SOV default word order with other orders used to express emphasis (e.g., verb–subject–object to emphasize the verb) is attested in Old Indic, Old Iranian, Old Latin and Hittite, while traces of it can be found in the enclitic personal pronouns of the Tocharian languages.[14] A shift from OV to VO order is posited to have occurred in late PIE, since many of the descendant languages have this order: modern Greek, Romance and Albanian prefer SVO, Insular Celtic has VSO as the default order, and even the Anatolian languages show some signs of this word order shift.[16] The inconsistent order preference in Baltic, Slavic and Germanic can be attributed to contact with outside OV languages.[16]

Sample texts

Since PIE was spoken by a prehistoric society, no genuine sample texts are available, but since the 19th century, modern scholars have made various attempts to compose example texts for purposes of illustration. These texts are educated guesses at best; Calvert Watkins observed in 1969 that, in spite of its 150 years' history, comparative linguistics is not in a position to reconstruct a single well-formed sentence in PIE. Because of this and other similar objections based on Pratishakhyas, such texts are of limited use in getting an impression of what a coherent utterance in PIE might have sounded like.

Published PIE sample texts:

Relationships to other language families

Proposed genetic connections

Many higher-level relationships between Proto-Indo-European and other language families have been proposed, but these hypothesized connections are highly controversial. A proposal often considered to be the most plausible of these is that of an Indo-Uralic family, encompassing PIE and Uralic. The evidence usually cited in favor of this consists of a number of striking morphological and lexical resemblances. Opponents attribute the lexical resemblances to borrowing from Indo-European into Uralic. Frederik Kortlandt, while advocating a connection, concedes that "the gap between Uralic and Indo-European is huge", while Lyle Campbell denies that any such relationship exists.

Other proposals, further back in time (and proportionately less accepted), link Indo-European and Uralic with Altaic and the other language families of northern Eurasia, namely Yukaghir, Korean, Japanese, Chukotko-Kamchatkan, Nivkh, Kartvelian, Ainu, and Eskimo–Aleut, but excluding Yeniseian (the most comprehensive such proposal is Joseph Greenberg's Eurasiatic), or link Indo-European, Uralic, and Altaic to Afroasiatic and Dravidian (the traditional form of the Nostratic hypothesis), and ultimately to a single Proto-Human family.

A more rarely mentioned proposal associates Indo-European with the Northwest Caucasian languages in a group called the "Pontic languages".

Etruscan shows some similarities to Indo-European, such as a genitive in -s. There is no consensus on whether these are due to a genetic relationship, borrowing, chance and sound symbolism, or some combination of these.

Proposed areal connections

The existence of certain PIE typological features in Northwest Caucasian languages may hint at an early Sprachbund[17] or substratum that reached geographically to the PIE homelands.[18] This same type of languages, featuring complex verbs of which the current Northwest Caucasian languages might have been the sole survivors, was cited by Peter Schrijver to indicate a local lexical and typological reminiscence in western Europe pointing to a possible Neolithic substratum.[19]

Daughter language groupings

Generally accepted subfamilies (clades)

Marginally attested languages

These include languages that do not appear to be members of any of the above families but are so poorly attested that proper classification of them is not possible. Of them, Phrygian is by far the best attested.

All of the above languages except for Lusitanian (which occurs in the area of modern Portugal) occur in or near the Balkan peninsula, and have been collectively termed the "Paleo-Balkan languages". This is a purely geographic grouping and makes no claims about the relatedness of the languages to each other as compared with other Indo-European languages.

Hypothetical clades

In popular culture

PIE is used in dialogue between humans and aliens in Ridley Scott's movie Prometheus.[20] In one scene, an android studies Schleicher's fable.

Christopher Tin's song Water Prelude, from The Drop That Contained the Sea, is sung in PIE.

The vocabulary and much of the morphology and word order of the Atlantean language, created by Dr. Marc Okrand for Disney's 2001 film Atlantis: The Lost Empire, are based on PIE.

In Michael Z. Williamson's time-travel novel "A Long Time Until Now", the American translator uses PIE to create a dictionary for communicating with Stone Age people.

In the video game Far Cry Primal, the Wenja, Udam and Izila tribes speak their own dialects based on PIE.

See also


  1. "linguistics - The comparative method | science". Retrieved 2016-07-27.
  2. Mallory, J. P. (1991). In Search of the Indo-Europeans. Thames & Hudson. p. 185. ISBN 978-0500276167.
  3. Anthony, David W. (2007). The horse, the wheel, and language: how bronze-age riders from the Eurasian steppes shaped the modern world (8th reprint. ed.). Princeton, N.J.: Princeton University Press. ISBN 0-691-05887-3.
  4. Balter, Michael (13 February 2015). "Mysterious Indo-European homeland may have been in the steppes of Ukraine and Russia". American Association for the Advancement of Science. Retrieved 2015-02-17.
  5. Anthony, David W; Ringe, Don (2015). "The Indo-European Homeland from Linguistic and Archaeological Perspectives". Annual Review of Linguistics (1): 199–219.
  6. Gimbutas, Marija (1985). "Primary and Secondary Homeland of the Indo-Europeans: comments on Gamkrelidze-Ivanov articles". Journal of Indo-European Studies (Spring - summer).
  7. Bouckaert, Remco; Lemey, P.; Dunn, M.; Greenhill, S. J.; Alekseyenko, A. V.; Drummond, A. J.; Gray, R. D.; Suchard, M. A.; et al. (24 August 2012), "Mapping the Origins and Expansion of the Indo-European Language Family", Science 337 (6097): 957–960, doi:10.1126/science.1219669, PMC 4112997, PMID 22923579
  8. Gray, Russell D; Atkinson, Quentin D (27 November 2003), "Language-tree divergence times support the Anatolian theory of Indo-European origin" (PDF), Nature (NZ: Auckland) (426): 435–39, doi:10.1038/nature02029, PMID 14647380
  9. Fortson, Benjamin (2004). Indo-European language and culture: an introduction. Malden (USA): Blackwell. p. 102. ISBN 1-4051-0316-7.
  10. Beekes, Robert; Gabriner, Paul (1995). Comparative Indo-European linguistics: an introduction. Amsterdam: J. Benjamins Publishing Company. pp. 147, 212–217, 233, 243. ISBN 978-1556195044.
  11. Sihler, Andrew L. (1995). New comparative grammar of Greek and Latin. New York u.a.: Oxford Univ. Press. ISBN 0-19-508345-8.
  12. Lehmann, Winfred P (1993), Theoretical Bases of Indo-European Linguistics, London: Routledge, pp. 252–55, ISBN 0-415-08201-3
  13. Kulikov, Leonid; Lavidas, Nikolaos, eds. (2015). "Preface". Proto-Indo-European Syntax and its Development. John Benjamins.
  14. Mallory, J. P.; Adams, Douglas Q., eds. (1997). "Proto-Indo-European". Encyclopedia of Indo-European Culture. Taylor & Francis. p. 463.
  15. Hock, Hans Henrich (2015). "Proto-Indo-European verb-finality: Reconstruction, typology, validation". In Kulikov, Leonid; Lavidas, Nikolaos. Proto-Indo-European Syntax and its Development. John Benjamins.
  16. Lehmann, Winfred P. (1974). Proto-Indo-European Syntax. University of Texas Press. p. 250.
  17. Kortlandt, Frederik (1993), General linguistics and Indo-European reconstruction (PDF), NL
  18. Kortlandt, Frederik (1989), The spread of the Indo-Europeans (PDF), NL
  19. Peter Schrijver (March 2007), Keltisch en de buren: 9000 jaar taalcontact (PDF) (in Dutch), NL: University of Utrecht
  20. Language Log » Proto-Indo-European in Prometheus?, 2012-06-08, retrieved 2013-03-12

Further reading

Introductory works

Major technical handbooks on Proto-Indo-European

  • Pokorny, Julius (2005) [1948–59], Indogermanisches etymologisches Wörterbuch (5 ed.), Francke, ISBN 3-7720-0947-6 
  • Rix, Helmut (2001), Lexikon der indogermanischen Verben (2 ed.), Dr. Ludwig Reichert Verlag, ISBN 3-89500-219-4 
  • Wodtko, Dagmar; et al. (2008), Nomina im Indogermanischen Lexikon, Heidelberg: Universitätsverlag Winter 
  • Dunkel, George E. (2014), Lexikon der indogermanischen Partikeln und Pronominalstämme, Heidelberg: Universitätsverlag Winter 

Other major technical works on daughter languages


External links