User:Dragonoid76/sandbox/Phonological history of Hindustani

This page contains phonetic transcriptions in the International Phonetic Alphabet (IPA). For an introductory guide on IPA symbols, see Help:IPA. For the distinction between [ ], / / and ⟨ ⟩, see IPA § Brackets and transcription delimiters.

You may need rendering support to display the uncommon Unicode characters in this page correctly.

The inherited, native lexicon of the Hindustani language reflects a large number of sound changes from its Old Indo-Aryan and Middle Indo-Aryan ancestors. Many of these sound changes are shared with other Indo-Aryan languages such as Marathi, Punjabi, and Bengali.

Genealogically, Hindustani belongs to the Western Hindi continuum of languages, which evolved from the Apabhraṃśa (Sanskrit: अपभ्रंश, "corrupted") form of Shauraseni Prakrit, in the Central Zone of Indo-Aryan.^[1] Native words are classified as tadbhava (Sanskrit: तद्भव, "inherited") when they derive from Indo-Aryan^[2]^[3] and as deśaja (Sanskrit: देशज, "indigenous") when they derive from non-Indo-Aryan languages, primarily Austroasiatic (Munda) languages, as well as Dravidian and Tibeto-Burman languages.^[2]

Overview of Etymology

The history of the Hindustani language is marked by a large number of borrowings at all stages.^[4]^[5] When discussing the phonological history of Hindustani, it becomes relevant to separate the native vocabulary from borrowed vocabulary. A large number of borrowings into Hindustani occurred after the Middle Indo-Aryan stage, such as tatsama (Sanskrit: तत्सम, literally "same as that") loanwords from Sanskrit and other borrowings from Classical Persian and European languages.^[6] These borrowings often produce doublets in Hindustani; for example, karṇ (कर्ण کرن), a tatsama loanword from Sanskrit, and kān (कान کان), the native word, both meaning "ear". These loanwords are easily separable from the native words.

However, Indo-Aryan languages are marked by both dialectal borrowing and semi-learned adaptations to Sanskrit at all stages.^[7]^[8] These again produce some doublets in Hindustani; for example, makkhan (मक्खन مکھن), likely borrowed from a Northwestern Indo-Aryan language (compare Punjabi makkhaṇ), and mākhan (माखन ماکھن), the native word which has been largely displaced by the aforementioned loanword, both meaning "butter".^[9] Some words partially underwent sound changes, but were later re-adapted to Sanskrit through the process of Sanskritization. For example, following regular sound changes, one would expect Sanskrit सूर्य (sūrya, "sun") to produce Hindustani *sūj, but we instead find sūraj (सूरज سورج); the difference can be attributed to the re-introduction of -r- under the influence of the Sanskrit form.^[10] As these borrowings and adaptations occurred continuously and as early as the Middle Indo-Aryan stage itself, they can be crucial to determining both the date of a word's borrowing and the general chronology of sound changes.^[7]

Pre-Classical Sanskrit (ca. 700 BCE)

Hindustani preserves some conservative features lost in Vedic Sanskrit.^[11] For example, the Sanskrit kṣ cluster can arise from a large number of Proto-Indo-European sequences (such as *ks, *tḱ, or *ǵʰs), which have merged in Vedic Sanskrit, but in Middle Indo-Aryan we find that such sequences are partially distinguished (kh- vs ch- vs jh-).^[12]

From Sanskrit through Early Middle Indo-Aryan (ca. third century BCE)

The sound changes are sometimes shared with the attested Pali language and Ashokan Prakrit inscriptions, referred to here as "Early Prakrit".^[7]

Early retroflexion

A dental occasionally cerebralizes to a retroflex stop in the environment of a rhotic. In pre-Vedic, a regular sound change occurred in which sibilant + dental stop clusters cerebralized to a retroflex sibilant + retroflex stop, and the dental nasal /n/ also cerebralized to /ɳ/ in the environment of a retroflex. This rule is more a dialectal feature seen particularly in the east (later in the north and northwest, less so in the west).^[13] Some scholars like Wackernagel argue that the original cases (or borrowings from eastern dialects) with a retroflex stop in the environment of a rhotic, like prati- > paḍi- and mēḍra (already retroflex in Proto-Indo-Aryan *Hmáyẓḍʰram), influenced later analogical formations.

It is difficult to pinpoint exactly the conditions in which this change occurs due to a high degree of dialectal borrowing and semi-learned adaptation to Sanskrit. The vocalic liquid ṛ very often cerebralizes a following dental stop (Hindustani saṛak "road" < Sanskrit sṛti). But beyond this, even within related words we can have different outcomes: compare Hindustani ādhā "half" < Sanskrit ardha but Hindustani sāṛhe "and a half" < Sanskrit sārdha. Such alternations often go back to Prakrit (which attests both addha and aḍḍha for "half"). Some cerebralized variants of roots developed unique meanings in Middle Indo-Aryan; for example, paṭhati "studies, reads aloud" (whence Hindustani paṛhnā "to read") comes from Sanskrit pṛth- "to spread", and was reborrowed into Sanskrit.^[14]

Loss of the syllabic liquid

Before Pali, the syllabic liquid -ṛ- was lost. Non-initially, it is replaced by -a-, -i-, or -u-:

Sanskrit gṛha > Pali gaha "house"
Sanskrit mṛga > Pali miga or maga "animal, deer"
Sanskrit pṛccha- "to ask" > Pali and Prakrit puccha- > Hindustani pūchnā "to ask"

Initially, in Pali it is still replaced by a-, i-, or u-. In Prakrit, it is typically replaced by ri-.

Sanskrit ṛkṣa > Pali accha, Prakrit riccha > Hindustani rīch "bear"

Any further analysis of this change has to take into account several exceptions due to dialectal borrowing and dissimilation. Tentatively, u is found after or around labials, i is found when a neighboring syllable has an i or a sibilant, and a is found elsewhere. Dialectally, a was more common in the south-west (e.g. Maharashtri), while i was more common in the east and north.^[15]

Classical Sanskrit stress system

The stress system which came to characterize Classical Sanskrit was a Middle Indo-Aryan (MIA) innovation at this stage. Briefly, the new stress fell on the first long syllable found when counting backwards from the penult, going no further back than the fourth syllable from the end. In other words, it never fell on the final syllable, whereas the Vedic accent frequently did so. Already in Pali, this resulted in a weakening and confusion of the vowel in the post-accentual syllable (Vedic Sanskrit candramā́ḥ > Pali cándimā).
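As a rough illustration of this placement rule, the following Python sketch scans backwards from the penult for the first heavy (long) syllable. The function name stress_index, the "H"/"L" weight encoding, and the fallback to the fourth-from-last syllable when every candidate is light are assumptions made for the example, not claims taken from the cited sources.

```python
def stress_index(weights):
    """Return the 0-based index of the stressed syllable.

    weights: one "H" (heavy/long) or "L" (light) per syllable, in order.
    """
    n = len(weights)
    if n == 1:
        return 0  # a monosyllable has nowhere else to carry stress
    earliest = max(n - 4, 0)  # fourth syllable from the end, if it exists
    # Scan from the penult backwards; the final syllable is never a candidate.
    for i in range(n - 2, earliest - 1, -1):
        if weights[i] == "H":
            return i
    return earliest  # assumed fallback when every candidate is light


# can-dra-māḥ: light penult, heavy antepenult, so stress lands on "can"
print(stress_index(["H", "L", "H"]))  # 0
```

The result for candramāḥ agrees with the initial stress of Pali cándimā cited above.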

Cluster simplifications

This is the most sweeping change of the pre-Pali era. MIA phonotactics are such that:

Word-initial consonant clusters cannot occur. Only a single consonant may occur.
Word-medially, between two vowels we can only find:
- a single consonant
- a geminate unaspirated stop
- an unaspirated stop + the corresponding aspirated stop
- a nasal + a homorganic stop or non-stop consonant
No word-final consonants are tolerated.
Syllables can be at most two morae long (the two-mora rule).

The changes leading to this system are covered most thoroughly here. What follows is a briefer overview of the rules.

Regarding the assimilations of Old Indo-Aryan consonant conjuncts, the Jayadhavalā (ca. ninth century AD) writes:

dīsaṁti doṇṇi vaṇṇā saṁjuttā aha va tiṇṇi cattāri
tāṇaṁ duvvala-lōvaṁ kāūṇa kamō pajuttavvō
"When two, or three or four, consonants appear in combination, elide the weakest one, and continue the process"^[16]

Here, "weakest" refers to sounds of higher sonority, and "elide" generally refers to total assimilation of the weaker sound to the stronger sound. Specifically, the sonority scale of Prakrit is (weakest) y < v < r < l < sibilants and h < nasals < stops (strongest). Consider the case of Sanskrit kartavya "duty". In MIA, -rt- and -vy- clusters are not tolerated so the weaker sounds (r and y) totally assimilate to the stronger sounds (t and v), resulting in Pali kattavva. Other examples include Hindustani dūdh "milk" < Prakrit duddha < Sanskrit dugdha and Hindustani sāt "seven" < Prakrit satta < Sanskrit sapta.
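The sonority-based assimilation described above can be sketched in Python as follows. This is a minimal illustration only: words are given as pre-segmented lists, the names RANK, geminate, assimilate_pair and simplify_word are hypothetical, and the tie-break (the second consonant wins when both are equally strong) is an assumption read off the examples dugdha > duddha and sapta > satta. The sketch ignores aspiration transfer from sibilants, initial clusters, and clusters of three or more consonants.

```python
# Sonority classes from weakest to strongest, following the scale in the text.
CLASSES = [
    ["y"], ["v"], ["r"], ["l"],
    ["s", "ś", "ṣ", "h"],
    ["n", "ṇ", "m", "ñ", "ṅ"],
    ["k", "kh", "g", "gh", "c", "ch", "j", "jh", "ṭ", "ṭh", "ḍ", "ḍh",
     "t", "th", "d", "dh", "p", "ph", "b", "bh"],
]
RANK = {c: rank for rank, group in enumerate(CLASSES) for c in group}


def geminate(consonant):
    """Double a consonant; an aspirate doubles as plain + aspirate (dh > ddh)."""
    if len(consonant) > 1 and consonant.endswith("h"):
        return consonant[:-1] + consonant
    return consonant + consonant


def assimilate_pair(first, second):
    """The weaker consonant assimilates totally to the stronger one."""
    stronger = second if RANK[second] >= RANK[first] else first
    return geminate(stronger)


def simplify_word(segments):
    """Assimilate every non-initial two-consonant cluster in a segment list."""
    out, i = [], 0
    while i < len(segments):
        seg = segments[i]
        nxt = segments[i + 1] if i + 1 < len(segments) else None
        if i > 0 and seg in RANK and nxt in RANK:
            out.append(assimilate_pair(seg, nxt))
            i += 2
        else:
            out.append(seg)
            i += 1
    return "".join(out)


print(simplify_word(["k", "a", "r", "t", "a", "v", "y", "a"]))  # kattavva
```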

Palatalization and bilabialization

When y comes after a dental or, more rarely, a bilabial stop, it first palatalizes the stop into a palatal stop. For example, Hindustani sac "truth" < Prakrit sacca < Sanskrit satya and Prakrit accharā < Sanskrit apsarā "Apsara".

Similarly, v or m can labialize a dental stop into a bilabial before the total assimilation. This is much rarer and only seen in Hindustani numerals beginning with ba- like bārah "twelve" < Early Prakrit bādasa < Sanskrit dvādaśa and in the pronoun āp "you" < Prakrit appa < Sanskrit ātman "self" and related words.
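The two adjustments above can be summarized as a small lookup applied before gemination. The following Python sketch is illustrative only; the table and function names are hypothetical, and it shows the outcome when palatalization or labialization applies, not the (partly lexical) conditions under which labialization is chosen.

```python
# Dental stops and their palatal / bilabial counterparts.
PALATALIZED = {"t": "c", "th": "ch", "d": "j", "dh": "jh"}
LABIALIZED = {"t": "p", "d": "b"}


def geminate(stop):
    """Double a stop; an aspirate doubles as plain + aspirate (jh > jjh)."""
    if len(stop) > 1 and stop.endswith("h"):
        return stop[:-1] + stop
    return stop + stop


def assimilate_dental(stop, follower):
    """Return the medial outcome of dental stop + y/v/m, or None if not covered."""
    if follower == "y" and stop in PALATALIZED:
        return geminate(PALATALIZED[stop])
    if follower in ("v", "m") and stop in LABIALIZED:
        return geminate(LABIALIZED[stop])
    return None


print(assimilate_dental("t", "y"))  # cc, as in satya > sacca
print(assimilate_dental("t", "m"))  # pp, as in ātman > appa
```

Word-initially only a single consonant is allowed, so the geminate produced here would surface as a single stop there (dvādaśa > bādasa).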

Sibilants in conjunct

The sibilants are lost when they adjoin stops, but only after transferring aspiration to the adjoining consonant. Sibilants are therefore an important source of aspirate consonants in Prakrit. For example, Hindustani hāth "hand" < Prakrit hattha < Sanskrit hasta. When a sibilant comes after a dental stop, it not only transfers aspiration but like y, it palatalizes the stop. For example, Sanskrit vatsa > Prakrit vaccha > Old Hindi bāchā, Hindustani bachṛā "calf".
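A minimal Python sketch of this aspiration transfer is given below. The function name resolve_sibilant_cluster is hypothetical, the input is assumed to be exactly one sibilant and one stop, and the k(k)h outcome for kṣ is only the usual reflex mentioned in the following paragraph, not the only attested one.

```python
SIBILANTS = {"s", "ś", "ṣ"}
DENTAL_TO_PALATAL = {"t": "c", "th": "ch", "d": "j", "dh": "jh"}


def resolve_sibilant_cluster(first, second):
    """Return the medial Prakrit outcome of a sibilant + stop or stop + sibilant cluster."""
    if first in SIBILANTS and second not in SIBILANTS:
        stop = second  # sibilant + stop: the stop keeps its place of articulation
    elif second in SIBILANTS and first not in SIBILANTS:
        # stop + sibilant: a dental stop is additionally palatalized
        stop = DENTAL_TO_PALATAL.get(first, first)
    else:
        raise ValueError("expected exactly one sibilant and one stop")
    base = stop[:-1] if len(stop) > 1 and stop.endswith("h") else stop
    return base + base + "h"  # geminate carrying the transferred aspiration


print(resolve_sibilant_cluster("s", "t"))  # tth, as in hasta > hattha
print(resolve_sibilant_cluster("t", "s"))  # cch, as in vatsa > vaccha
```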

The sequence kṣ needs to be treated specially because it arises in Sanskrit from a variety of Proto-Indo-Aryan clusters which are partially distinguished in Middle Indo-Aryan. The usual pre-Hindi Prakrit reflex is k(k)h as expected (Hindustani khet "field" < Sanskrit kṣetra), but we also find j(j)h and c(c)h, and the outcome is also subject to dialectal variation.

The interaction between sibilants and liquids or nasals is more complicated. Old Indo-Aryan did not have aspirated nasals, and not all analyses of Middle Indo-Aryan agree on their phonemicity. Word-medially, where S represents a sibilant, Sm > mh and Sn, Sṇ > ṇh. Sm can also become mbh / ṃbh. Word-initially, and often word-medially, Prakrit is uncomfortable (at least orthographically so) with aspirated nasals and either assimilates the cluster to ss or inserts an epenthetic vowel.

Sanskrit vismara- > Prakrit vissara- > Hindustani bisarnā "to forget", but observe dialectal vacillation at the Prakrit stage between vissara-, vimhara-, viṃbhara-, visumara-, and more
Sanskrit kṛṣṇa > Prakrit kanha > Hindustani kānhā, but observe dialectal vacillation at the Prakrit stage between kanha, kinha, kasina, and kasana
Sanskrit smara- > Prakrit samara-, sumara- > Old/dialectal Hindi sãvarnā "to remember"

Anaptyxis is common in initial Sr or Sl sequences. For example, Sanskrit śloka > Pali siloka "hymn" and Sanskrit śleṣman > Prakrit seṃha, seṃbha, etc. Elsewhere, both ss and ṃs are common outcomes of Sr and Sl (Hindustani ā̃sū "tear" < Prakrit aṃsu < Sanskrit aśru).

Excrescence

Excrescence, or the insertion of a consonant in between two consonants, occurs with the sequences -mr- and -ml-, which first become *-mbr- and *-mbl- before undergoing assimilation to -ṃb-.

Sanskrit āmra > Prakrit aṃba > Old Hindi ā̃b, ām > Hindustani ām "mango"

Anaptyxis

Anaptyxis, or the insertion of a vowel in between two consonants, has already been seen in conjuncts with sibilants. It is otherwise a much rarer strategy of avoiding complex conjuncts. It is sometimes used in words involving the conjuncts -tn- or -dm-.

The vowel that is inserted is a by default, but in case one of the consonants is a labial, it is u. In a few cases, it is i. For example, Sanskrit ratna > Pali ratana "jewel" and Sanskrit kleśa > Prakrit kilesa > Hindustani kiles, kales "grief".

Two-Mora Rule

To fit the two-mora rule, the long vowel is shortened in Sanskrit sequences of a long vowel + a syllable coda. The Sanskrit vowels ai and au are themselves more than two morae in length and are thus always shortened to e and o in Prakrit (cases of ai and au in Hindustani are not reflexes of the Sanskrit equivalent and arose from later sound changes).

When assimilation would produce a sequence of three consonants in the middle of a word, geminates are simplified until there are only two consonants in sequence. Possible derivations could be:

Sanskrit rāṣṭra "land, country" > *rāṭṭhra (assimilation of weaker ṣ to stronger ṭ, triggering aspiration) > *rāṭṭha (assimilation of weaker r to stronger ṭ, but we cannot have three consonants in a row so it is simply deleted) > Pali and Prakrit raṭṭha (shortening of vowel according to two-mora rule).
Sanskrit vyāghra > *vāggha (assimilations) > Pali and Prakrit vaggha (shortening of vowel according to two-mora rule) > Hindustani bāgh "tiger"
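The final shortening step in these derivations can be sketched in Python. The sketch assumes pre-segmented input in which each consonant is a separate list item, uses the hypothetical name apply_two_mora_rule, and leaves e and o untouched because their shortening is not shown in the orthography.

```python
SHORT = {"ā": "a", "ī": "i", "ū": "u"}
VOWELS = {"a", "i", "u", "e", "o"} | set(SHORT)


def apply_two_mora_rule(segments):
    """Shorten a long vowel that is followed by two consonants (a closed syllable)."""
    out = list(segments)
    for i, seg in enumerate(out):
        closed = (
            i + 2 < len(out)
            and out[i + 1] not in VOWELS
            and out[i + 2] not in VOWELS
        )
        if seg in SHORT and closed:
            out[i] = SHORT[seg]
    return "".join(out)


print(apply_two_mora_rule(["r", "ā", "ṭ", "ṭh", "a"]))  # raṭṭha
print(apply_two_mora_rule(["v", "ā", "g", "gh", "a"]))  # vaggha
```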

Loss of final consonants

In Prakrit, only word-final m and n from Sanskrit are retained as the anusvara ṃ. All other final consonants are simply dropped. It is worth noting that the a-stem nominative masculine ending -aḥ (-as in Vedic and Proto-Indo-Aryan) has the sandhi variant -o in Sanskrit and thus instead becomes -o regularly in Prakrit.
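The treatment of word-final consonants can be sketched as follows. The function name adapt_final is hypothetical, and the sample words (phalam, devaḥ, tāvat) are illustrative inputs chosen for the example rather than forms quoted in this article.

```python
VOWELS = {"a", "ā", "i", "ī", "u", "ū", "e", "o"}


def adapt_final(segments):
    """Apply the Prakrit treatment of word-final consonants to a segment list."""
    segs = list(segments)
    if segs[-2:] == ["a", "ḥ"]:
        # a-stem nominative -aḥ continues the sandhi variant -o
        return "".join(segs[:-2] + ["o"])
    if segs[-1] in {"m", "n"}:
        # final m and n survive only as the anusvara
        return "".join(segs[:-1] + ["ṃ"])
    while segs and segs[-1] not in VOWELS:
        segs.pop()  # every other final consonant is dropped
    return "".join(segs)


print(adapt_final(["p", "h", "a", "l", "a", "m"]))  # phalaṃ
print(adapt_final(["d", "e", "v", "a", "ḥ"]))       # devo
print(adapt_final(["t", "ā", "v", "a", "t"]))       # tāva
```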

Orthography and other changes

By the start of Middle Indo-Aryan, the three sibilants s [s], ś [ɕ], and ṣ [ʂ] merge to s in the Western and Central dialects ancestral to Hindi. They become ś in Eastern dialects (whence Bengali).

Sanskrit deśa > Prakrit desa > Hindustani des "country"

In addition to the monophthongization of ai > e and au > o discussed above, the sequences aya and ayi both monophthongize to ē, and the sequences ava and ayū merge to ō.

Sanskrit avara "lower" > Pali/Prakrit ora, oraṃ "to this side" > Hindustani or "side"

Before a syllable coda, the vowels e and o (which are always long in Sanskrit) undergo the two-mora rule and become short. Prakrit orthography does not have a way to represent these short vowels, and vacillates between using their short high counterparts (i and u) and writing e and o as usual (sometimes romanized as ĕ and ŏ to indicate the length).^[17]

Initially and when geminated, y strengthens to j. In some cases after a front vowel, intervocalic single y becomes the geminate -jj- too:

Sanskrit yamya- > Prakrit jamma- > Hindustani jamnā "to freeze"
Sanskrit kāleya > Prakrit kāleya, kālĕjja, kālijja > Hindustani kalējā "liver"

As a matter of orthography, pre-consonant homorganic nasals in Prakrit and Pali are typically represented as the anusvara, or ṃ in romanization.

Lastly, some major differences between Early Prakrit and Pali are:

the sequences -ny- and -ṇy- become -ññ- in Pali, but -ṇṇ- in Prakrit. jñ tends to become -ññ- in Pali but is usually -jj- in Prakrit.
-vv- goes on to strengthen to -bb- in Pali, but does not in Prakrit

From Early Prakrit to Middle Prakrit (ca. third century AD)

These changes occur after Pali and Early Prakrit, but before the development of the dramatic regional Prakrits (like Maharashtri Prakrit and Shauraseni Prakrit).

Lenition

The main change of this stage is the loss of most single intervocalic stops through progressive weakening.

First, intervocalic unvoiced and voiced stops merge to voiced stops.

Sanskrit kapha > Pali and Early Prakrit kapha > Later kabha "phlegm"

Then, non-retroflex voiced stops are spirantized: g, gh, j, d, dh, b, bh > ɣ, ɣh, ʒ, ð, ðh, β, βh. This stage is marked in writing by vacillation between a voiced stop, a semivowel, or no consonant (Sanskrit bhāga "portion" > Prakrit bhāga, bhāa). At this point, intervocalic single retroflex voiced stops probably became flaps (e.g. /ɖ ɖʰ/ > /ɽ ɽʰ/), but this is not distinguished in the script.

Finally, in the west at first but later spreading to most dialects, the intervocalic spirants produced by the above rule (i.e. not including /s/ or /h/) are weakened to weakly-articulated glides /j/ or /w/, or deleted. The glide v is retained when it is between ā̆ vowels, but is otherwise also deleted. This produced a number of vowels in hiatus:

a + i > aï (Sanskrit pratijña > Prakrit païjja > Hindustani paij) and a + u > aü. These vowels are distinct from the Sanskrit overlong vowels ai and au, and are marked in romanization by diaeresis.
a + e or a + u can occasionally contract to aï and aü as well
Otherwise, vowels are allowed in hiatus or with an intervening weak y. A number of other contractions and metatheses may optionally occur at this stage (Sanskrit sthavira > Prakrit ṭhavira, ṭhera "old").

Prakrit reflexes of Sanskrit
Sanskrit	Prakrit	Meaning
nakula	naüla (> Hindustani naulā)	"mongoose"
triloka	tiloa	"three worlds"
tyāga	cāya ~ cāa	"abandonment"
markaṭa	makkaḍa (> Hindustani makṛī)	"spider"
kathayati	kahēi (> Hindustani kahnā)	"says, narrates"
sāgara	sāyara ~ sāara (> Hindustani sāyar)	"sea"
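The lenition chain of this section can be approximated in a few lines of Python. This is a simplified sketch under stated assumptions: it covers only single intervocalic unaspirated k, g, c, j, t, d (deleted) and ṭ (voiced to ḍ), collapses the intermediate spirant and glide stages into outright deletion, omits the labials and aspirates, and does not add the diaeresis used in romanization. The name lenite is hypothetical.

```python
VOWELS = {"a", "ā", "i", "ī", "u", "ū", "e", "o"}
VOICED = {"k": "g", "c": "j", "t": "d", "ṭ": "ḍ"}  # step 1: voicing merger
DELETABLE = {"g", "j", "d"}  # steps 2-3: non-retroflex voiced stops are lost


def lenite(segments):
    """Weaken single intervocalic stops in a pre-segmented Early Prakrit word."""
    out = []
    for i, seg in enumerate(segments):
        intervocalic = (
            0 < i < len(segments) - 1
            and segments[i - 1] in VOWELS
            and segments[i + 1] in VOWELS
        )
        if intervocalic:
            seg = VOICED.get(seg, seg)
            if seg in DELETABLE:
                continue  # leaves the neighbouring vowels in hiatus
        out.append(seg)
    return "".join(out)


print(lenite(["n", "a", "k", "u", "l", "a"]))  # naula (romanized naüla above)
print(lenite(["s", "ā", "g", "a", "r", "a"]))  # sāara
```

Geminates are untouched because neither half of a geminate stands between two vowels.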

Merging of nasals

Leading up to Prakrit, the nasals n and ṇ merge. This is indicated in writing by the retroflex nasal (ṇ). Some Indic linguists hold that this is partly artificial (and that the true pronunciation was merged to the dental nasal), while others hold that the nasal was in fact retroflex and was redentalized in Central dialects at the Apabhraṃśa stage.^[18] Regardless of this, both Sanskrit n and ṇ ultimately become the dental n in Hindustani.

Pleonastic suffixes

Another change worth noting here, one that becomes more prevalent by late MIA and early New Indo-Aryan (NIA), is the extension of Old Indo-Aryan nominals and roots with pleonastic suffixes. The consensus, implied by the name, is that these innovative suffixes have no semantic purpose and mainly serve to distinguish homophones (created by the sweeping sound changes of Early Prakrit). Some are recognizable as the reflexes of Old Indo-Aryan diminutive suffixes.^[19] The most common suffixes are:

Feminine -iā ~ -iyā (< earlier -iga, -ikā < Sanskrit -ikā, feminine diminutive) and masculine -a ~ -ya (< earlier -ga, -ka < Sanskrit -ka, masculine diminutive). The equivalent Sanskrit endings were already common in Old Indo-Aryan as diminutives, but become more general and common at this stage. These become the "marked" declension of nouns in Hindustani and other Indo-Aryan languages.
- Prakrit kappaḍa (< Sanskrit karpaṭa) + -a > kappaḍaa > Hindustani kapṛā "clothing"
- Prakrit kaḍa (< Sanskrit kaṭa "twist of straw") + -iā > kaḍiā > Hindustani kaṛī "chain link"
- Many Sanskrit words were already extended with this suffix. For example, Sanskrit prahelikā > Prakrit paheliā > Hindustani pahelī "riddle, puzzle"
-kka
- Prakrit jhala- "flash" + -kka > Late Prakrit jhalakka- "to burn" > Hindi jhalaknā "to sparkle"
- Prakrit *ḍhola (< Sanskrit ḍhola) + -kka- > *ḍholakka > Hindustani ḍholak "dholak"
-ḍa
- Prakrit dava- (< Sanskrit drava-) + -ḍa- > *davaḍa- > Hindustani dauṛnā "to run"
-illa, -la, -lla, or -ulla (in other Indo-Aryan languages, these ultimately become tied to the past tense), for which compare the Sanskrit -ila and -ula diminutives.
- Prakrit masa- (< Sanskrit maṣa-) + -lla > *masalla- > Hindustani masalnā "to crush"
- Prakrit pahia "spread" (< Sanskrit prathita) + -illa > *pahilla- > Hindustani phailnā "to spread" (with metathesis)
-ra-, for which compare the Sanskrit -ira diminutives.
- Prakrit paya (< Sanskrit pada) + -ra > *payara ~ *paara > Hindustani pair "foot"
-āve- (< rare Sanskrit -āpaya-) becomes a productive Prakrit causative suffix.
- Prakrit ucca "high" + -āvei > uccāve- "to raise" > Hindustani ucānā "to raise"

These suffixes are very often combined with each other:

Prakrit thova ~ thoa ~ thoga (< Sanskrit stoka "a drop") + -ḍa + -a > Hindustani thoṛā "a little"
Prakrit jaa (< Sanskrit yata "restrained") + -kka + -ḍa > Hindustani jakaṛnā "to tighten"
Prakrit maccha (< Sanskrit matsya) + -l(l)a + -iā > Hindustani machlī "fish"

From Middle Prakrit to Late Prakrit (Apabhraṃśa) (ca. sixth century AD)

Intervocalic single -m- is weakened to -w̃- (a nasalized glide), where the unstable nasal is typically transferred to the preceding vowel.^[8]

Sanskrit grāma > Pali/Prakrit gāma > Apabhraṃśa gā̃wa > Hindustani gā̃v "village"

Following an unstressed syllable, single intervocalic -s- is often reduced to -h-.

Sanskrit catúrdaśa > Prakrit caǘddasa > Apabhraṃśa caǘddaha > Hindustani caudah "fourteen", but not in Sanskrit dáśa > Prakrit dása > Hindustani das "ten"

Final long vowels are shortened: ā ē ō > a i u

Sanskrit sandhyā > Prakrit saṃjhā > Apabhraṃśa saṃjha > Hindustani sā̃jh "evening".

Final vowels in hiatus reduce to long vowels, i.e. -aa > -ā, -iā > -ia > -ī, -ua > -ū. One exception to this is the past participle in -iaa (< Sanskrit -ita + Prakrit -a), which becomes -ā.

Long ū is shortened to u before another vowel. Later, long ī is sometimes also shortened in this environment.

Sanskrit bhūta > Prakrit bhūa, bhūaa > Apabhraṃśa huā > Hindi huā "became"^[20]
Sanskrit dīpaka > Prakrit dīvaa > Hindustani dī̆vā, dī̆yā "lamp"

The positional stress accent throughout the nominal and verbal paradigm was mostly regularized in Apabhraṃśa through even more lenitions of the inflectional suffixes in order to force stress back onto the root. The changes relevant to Hindustani are:

Weakening of the singular genitive ending from -assa to -aha or -ahu.
Weakening of the plural genitive ending from -āṇaṃ to -ahuṃ (probably under influence of its singular counterpart or blending from the pronominal locative ending -amhi^[21]), which becomes Hindustani oblique -õ ending.
Weakening of the Prakrit verbal endings -āmi ~ -ēmi (1sg) and -āmo ~ -ēmo (1pl) to -ahuṃ, and -aṃti (3pl) to -ahiṃ. The origin of these endings is uncertain, but they are continued by the Hindi verbal suffixes -ū̃ and -ẽ.
The Prakrit participle -aṃtaa (< Sanskrit -a(n)ta-) is reduced to -atā by Old Hindi, where the nasal disappears in order to bring stress back to the root.

From Late Prakrit (Apabhraṃśa) to Old Hindi (ca. 13th century AD)

Old Hindi marks the transition from the MIA period to the New Indo-Aryan (NIA) era. Many of the changes at this stage start to distinguish Hindi from nearby languages like Marathi, Gujarati, and Punjabi.

Vowel coalescence

The vowels in hiatus produced by the earlier loss of many intervocalic consonants had already started to coalesce into long vowels and new diphthongs by the Apabhraṃśa period, but this process becomes more general by Old Hindi.

Vowels of like quality generally coalesce:

MIA duuṇaa > Old Hindi dūnā "twice"
MIA khaaṇaa > Old Hindi khānā "to eat"

Occasionally, the weak hiatus-filler -y- seems to have been more fully pronounced (only when the aa vowel is stressed), so that aa either contracted to the new diphthong ai or contracted further to e.

MIA maaṇaa > Old Hindi mainā, menā "myna"
MIA kaalaa > Old Hindi kelā "banana"

Similarly, -ava- contracted to either au or further to o.

MIA khavaṇaa > Old Hindi khonā "to lose"
MIA avara > Old Hindi aura "and"

Generally, MIA aï and aü also became the Old Hindi diphthongs ai and au.

MIA païjja > Old Hindi paija "vow"
MIA caükka > Old Hindi cauka "plaza"
MIA naüla > Old Hindi naul, naulā "mongoose" (but also forms like nevalā, nyaulā, etc. probably dialectal borrowings)

In the case of unlike vowels in succession:

If the first is unstressed i or u and the second vowel is stressed, the vowel becomes a new glide.
- MIA pi(v)āsa > Old Hindi pyāsa "thirst"
If the first is stressed ī̆, ū̆, e, or o and the second vowel is a, the a is lost.
- MIA thōa + -ḍa + -a > Old Hindi thōṛā "a little"
- MIA sīala > Old Hindi sīla "cold"

Turner explains the occasional further contraction of ai > e and au > o (at least for Gujarati) in terms of inherited words versus later loanwords: in the former the process has had time to go further. A similar explanation for the cases where -y- was more fully pronounced could appeal to word frequency, dialectal borrowing, and semi-learned borrowings.

Degemination

MIA geminates are degeminated, and the preceding vowel undergoes compensatory lengthening if it is short.

Sanskrit	Prakrit	Old Hindi
sapta	satta	sāta "seven"
dugdha	duddha	dūdha "milk"
nṛtya-	nacca-	nācanā "to dance"
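Degemination with compensatory lengthening is mechanical enough to sketch in Python. The sketch assumes pre-segmented Prakrit input, treats a plain stop followed by its aspirate (d + dh) as a geminate, and uses the hypothetical name degeminate; it does not model the sporadic nasalization or the later weakening of final vowels.

```python
LONG = {"a": "ā", "i": "ī", "u": "ū"}
VOWELS = {"a", "ā", "i", "ī", "u", "ū", "e", "o"}


def is_geminate(first, second):
    """True for tt-type geminates and for plain + aspirate pairs such as d + dh."""
    return first not in VOWELS and (first == second or first + "h" == second)


def degeminate(segments):
    """Simplify geminates and lengthen a preceding short vowel."""
    segs = list(segments)
    out, i = [], 0
    while i < len(segs):
        if i + 1 < len(segs) and is_geminate(segs[i], segs[i + 1]):
            if out and out[-1] in LONG:
                out[-1] = LONG[out[-1]]  # compensatory lengthening
            i += 1  # drop the first half of the geminate
            continue
        out.append(segs[i])
        i += 1
    return "".join(out)


print(degeminate(["s", "a", "t", "t", "a"]))   # sāta
print(degeminate(["d", "u", "d", "dh", "a"]))  # dūdha
```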

Degemination was sometimes accompanied by spontaneous (and regionally random) nasalization of the vowel; in some cases this goes back to Prakrit:

Sanskrit	Prakrit	Old Hindi
akṣi	akkhi	ā̃kha "eye"
mudga	mugga	mū̃ga "mung bean"

Notably, this change did not occur in the northwest (whence Punjabi and Sindhi), and many Hindustani words were borrowed at an early point from Northwestern dialects. This gives rise to many doublets, with the native word often becoming only dialectal or archaic:

Prakrit	Hindustani native term	Hindustani borrowed term	Meaning
makkhaṇa	mākhan	makkhan	"butter"
haḍḍa	hāṛ	haḍḍā	"bone"
acchaa	āchā	acchā	"clear, good"
sacca	sāc, sā̃cā	sac, saccā	"true"

Under influence of Sanskrit, ablaut was reintroduced to the verbal system to derive causatives. For example, Old Hindi tapanā "to get warm" and tāpanā "to warm (something) up" are built from the MIA verb tappa- "to get warm".

Nasal lengthening

Generally, vowels followed by MIA anusvara ṃ become nasalized and lengthened.

MIA baṃdha > Old Hindi bā̃dha "bind"
MIA saṃjha > Old Hindi sā̃jha "evening"

Occasionally, the anusvara is retained as a homorganic nasal consonant, likely under borrowing from dialects or semi-learned borrowing from Sanskrit.

MIA aṃdhaa > Old Hindi andhā "blind" (compare Sanskrit andha)

Rhythmic vowel shortening

In a pre-tonic position, heavy/long vowels are shortened.

ā > a
- MIA kappū́ra > *kāpū́ra > Old Hindi kapūr "camphor" (but compare Old Marathi kāpūr)
au, o, ū > u
- Late MIA cōrā́va- > Old Hindi curānā "to steal"
ai, e, ī > i
- Late MIA dĕkkhā́va- > Old Hindi dikhānā "to show"
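The three shortening correspondences can be expressed as a lookup applied to every vowel before the stressed one. In this Python sketch the position of the stress is supplied by the caller as an index into a pre-segmented word; the names PRETONIC and shorten_pretonic are hypothetical, and nasalized vowels are not handled.

```python
# Pre-tonic outcomes: ā > a; au, o, ū > u; ai, e, ī > i.
PRETONIC = {
    "ā": "a",
    "au": "u", "o": "u", "ō": "u", "ū": "u",
    "ai": "i", "e": "i", "ē": "i", "ī": "i",
}


def shorten_pretonic(segments, stressed_index):
    """Shorten every long or heavy vowel that precedes the stressed segment."""
    return "".join(
        PRETONIC.get(seg, seg) if i < stressed_index else seg
        for i, seg in enumerate(segments)
    )


print(shorten_pretonic(["c", "ō", "r", "ā", "v", "a"], 3))  # curāva
print(shorten_pretonic(["k", "ā", "p", "ū", "r", "a"], 3))  # kapūra
```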

Pre-tonic nasalized vowels can become short nasal vowels or lose nasalization.

- Old Hindi sā̃pa "snake" + -érā > sãpérā, sapérā "snake-charmer"

This becomes an important facet of the Hindi ablauting system, used to form causatives and derivations from verbs.

Other changes

By late Central Apabhraṃśa, the neuter and masculine genders had merged (but not in the west, as modern Marathi and Gujarati retain 3 genders).^[22]

A number of suffixes were weakened. This eliminated the last vestiges of the Old Indo-Aryan declension system and resulted in a system much more similar to that of Modern Hindustani.

Before the Old Hindi stage, the direct singular marked ending in Apabhraṃśa final -aü, -ayü (< Prakrit -ago < Pali/Early Prakrit -ako < Sanskrit -akaḥ) simplified to -ā (the modern direct nominal ending)^[23]^[24]
By Late Old Hindi, the final -aha ~ -ahu Apabhraṃśa genitive suffix (< Prakrit -assa < Sanskrit -asya) was completely reduced to -a (it is seen as -aha or -ahi in Old Hindi when emphasized or in some other contexts).^[25]
Attenuation of ultimately all final short vowels to /ə/, indicated by final -a (recall that at this point, short vowels in Hindi represent all of Sanskrit final a ā i ī u ū ē ō, as well as other sequences like -asya and -ā̆ḥ). Unlike in Modern Hindustani (where it will ultimately be deleted), this schwa was pronounced weakly in Old Hindi.^[26]
- A number of words are saved from this lenition by semi-learned lengthening of the final vowel. For instance, from Sanskrit guru > Prakrit guru > Old Hindi gura, but also the variant gurū "teacher, guide".

From Old Hindi to Modern Hindi

Schwa deletion

At some point in the Old Hindi stage, unstressed a was reduced to the schwa /ə/ and then deleted when flanked by a vowel + consonant on the left and a consonant + vowel on the right (ə → ∅ / VC_CV). Schwa is also lost at the end of most words, producing word-final consonants (including aspirated consonants) and many word-internal clusters. This change is not indicated in the Devanagari script for Hindustani.

Old Hindi rāta "night" > Hindustani rāt "night"

Unstressed (short) vowels are also lost in other positions, particularly initial vowels in words of 3 or more syllables or intertonic short vowels.

Old Hindi aḍhā́ī > Hindustani ḍhāī "two and a half"
Old Hindi sámujhā > Hindustani samjhā "understood"
Old Hindi gadahā > Hindustani gadhā "donkey"

This is the source of schwa ablaut in Hindi. For example, the infinitive utarnā "to descend" has the past participle utrā "descended", where the intertonic vowel in Old Hindi utarā has been lost.
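The schwa deletion rule (ə → ∅ / VC_CV, plus loss of final schwa) can be sketched in Python as below. This is a simplified illustration: it treats every short a in the relevant environment as an unstressed schwa, scans right to left, and ignores stress and morphological boundaries, which matter in the real language. The name delete_schwa and the pre-segmented input format are assumptions made for the example.

```python
VOWELS = {"a", "ā", "i", "ī", "u", "ū", "e", "o", "ai", "au"}


def delete_schwa(segments):
    """Delete word-final short a and medial short a in the environment VC_CV."""
    segs = list(segments)
    # Final schwa after a consonant is lost.
    if len(segs) > 1 and segs[-1] == "a" and segs[-2] not in VOWELS:
        segs.pop()
    # Medial schwa: scan right to left so earlier deletions do not shift indices.
    i = len(segs) - 3
    while i >= 2:
        if (
            segs[i] == "a"
            and segs[i - 1] not in VOWELS and segs[i - 2] in VOWELS
            and segs[i + 1] not in VOWELS and segs[i + 2] in VOWELS
        ):
            del segs[i]
        i -= 1
    return "".join(segs)


print(delete_schwa(["r", "ā", "t", "a"]))       # rāt
print(delete_schwa(["u", "t", "a", "r", "ā"]))  # utrā
```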

Other changes

During the Old Hindi stage, final unstressed -ai and -au monophthongized to -e and -o, respectively.^[27] Hence, the general third-person singular ending developed as Sanskrit -ati > Prakrit -adi > Apabhraṃśa -aï > Old Hindi -ai > Hindustani -e, but where it was stressed, as in the monosyllabic Old Hindi hai, it remains unsimplified in Hindustani hai "is".
The sounds /f, z, ʒ, q, x, ɣ/ are loaned into Hindi-Urdu from Persian, English, and Portuguese.
- In Hindi, /f/ and /z/ are the most well-established, while /q, x, ɣ/ are variably (by dialect) assimilated into /k, kʰ, g/, respectively, and /ʒ/ is almost never pronounced and is substituted by /ʃ/ or /dʒʰ/.^[28]
Monophthongization of ai to /ɛː/ and au to /ɔː/ in many dialects.^[29]
/pʰ/ is starting to merge into /f/ in a number of Hindustani dialects.

Examples of sound changes

The following table shows a possible sequence of changes for some basic vocabulary items, leading from Sanskrit to Modern Hindustani. All entries are romanized. An empty cell means no change at the given stage for the given item. Only sound changes that had an effect on one or more of the vocabulary items are shown.

	night	two and a half	vow	to complete	village	evening	riddle	understands	camphor	gambling	juhi flower	damp	tiger	comes
Romanized Sanskrit	rā́trī	ardhatṛtī́ya	pratíjña	nirvā́haṇa	grā́ma	sándhyā	prahḗlikā	sambúdhyatē	karpū́ra	dyū́ta	yū́thikā	śī́tala	vyā́ghra	ā́payati
Orthographic						sáṃdhyā		saṃbúdhyatē
Early retroflexion		arḍhatṛtī́ya
Loss of ṛ		arḍhatatī́ya
Palatalization (Cy)						sáṃjhyā		saṃbújhyatē		jyū́ta
Initial cluster simplification			patíjña		gā́ma		pahḗlikā			jū́ta			vā́ghra
Conjunct assimilations	rā́ttī	aḍḍhatatī́ya	patíjja	nivvā́haṇa		sáṃjjhā		saṃbújjhatē	kappū́ra				vā́ggha
Two-mora rule	ráttī					sáṃjhā							vággha
Merging of sibilants												sī́tala
y > j											jū́thikā
Monophthongizations														āpḗti
Loss of medio-passive voice								saṃbújjhati
Romanized Ashokan/Early Prakrit	ráttī	aḍḍhatatī́ya	patíjja	nivvā́haṇa, (Pali nibbā́haṇa)	gā́ma	sáṃjhā	pahḗlikā	saṃbújjhati	kappū́ra	jū́ta	jū́thikā	sī́tala	vággha	āpḗti
First intervocalic lenition		aḍḍhadadī́ya	padíjja				pahḗligā	saṃbújjhadi		jū́da	jū́dhigā	sī́dala		ābḗdi
Merging of nasals				ṇivvā́haṇa
Pleonastic suffix additions				ṇivvā́haṇaga						jū́daga
Second intervocalic lenition		aḍḍha(y)aī́a	paḯjja	ṇivvā́haṇa(y)a			pahḗliā	saṃbújjhaï		jū́a(y)a	jū́hiā	sī́ala		āvḗi
Dramatic Prakrit Stage	ráttī	aḍḍha(y)aī́a	paḯjja	ṇivvā́haṇa(y)a	gā́ma	sáṃjhā	pahḗliā	saṃbújjhaï	kappū́ra	jū́a(y)a	jū́hiā	sī́ala	vággha	āvḗi
Lenition of intervocalic -m-					gā̃́va
Final long vowels shortened	rátti					sáṃjha	pahḗlia				jū́hia
Shortening of u before a vowel										júa(y)a
Positional stress regularization		aḍḍhá(y)aia												ā́vaï
Apabhraṃśa Stage	rátti	aḍḍhá(y)aia	paḯjja	ṇivvā́haṇa(y)a	gā̃́va	sáṃjha	pahḗlia	saṃbújjhaï	kappū́ra	júa(y)a	jū́hia	sī́ala	vággha	ā́vaï
Dentalization of ṇ > n and ḷ > l				nivvā́hana(y)a
Coalescence of vowels in hiatus		aḍḍhā́ī	páijja	nivvā́hanā			pahḗlī	saṃbújjhai		júā	jū́hī	sī́la		ā́vai
-vv- > -bb- and initial v- > b-				nibbā́hanā									bággha
De-gemination	rā́ti	āḍhā́ī	páija	nībā́hanā				saṃbū́jhai	kāpū́ra				bā́gha
Nasalized vowel lengthening						sā̃́jha		sā̃bū́jhai
Pre-tonic vowel shortening		aḍhā́ī		nibā́hanā				sãbū́jhai	kapū́ra
Nasal vowel + b > m								samū́jhai
Attenuation of final short vowels to /ǝ/	rā́ta
Intervocalic -v- can shift to -y- or -h-														ā́yai
Analogical or morphological change								sámujhai
Old Hindi	rāta	aḍhāī	paija	nibāhanā	gā̃va	sā̃jha	pahelī	samujhai	kapūra	juā	jūhī	sīla	bāgha	āyai
Final -ai, -au > -e, -o								samujhe						āye
Schwa deletion	rāt		paij		gā̃v	sā̃jh			kapūr			sīl	bāgh
Intertonic or pre-tonic schwa deletion		ḍhāī		nibāhnā				samjhe
Hindustani	rāt	ḍhāī	paij	nibāhnā	gā̃v	sā̃jh	pahelī	samjhe	kapūr	juā	jūhī	sīl	bāgh	āye
Devanagari	रात	ढाई	पैज	निबाहना	गाँव	साँझ	पहेली	समझे	कपूर	जुआ	जूही	सील	बाघ	आये, आए
Urdu	رات	ڈھائی	پیج	نباہنا	گاؤں	سانجھ	پہیلی	سمجھے	کپور	جوا	جوہی	سیل	باگھ	آئے
	night	two and a half	vow	to complete	village	evening	riddle	understands	camphor	gambling	juhi flower	damp	tiger	comes

References

^ Shapiro (2003), p. 305.
^ ^a ^b Grierson, George (1920). "Indo-Aryan Vernaculars (Continued)". Bulletin of the School of Oriental Studies. 3 (1): 51–85. doi:10.1017/S0041977X00087152. S2CID 161798254. at pp. 67-69.
^ [1]
^ "A Guide to Hindi". BBC - Languages - Hindi. BBC. Retrieved 11 December 2015.
^ Kumar, Nitin (28 June 2011). "Hindi & Its Origin". Hindi Language Blog. Retrieved 11 December 2015.
^ Anwer, Syed Mohammed (November 13, 2011). "Language: Urdu and the borrowed words". dawn.com.
^ ^a ^b ^c Masica, Colin P. (1993). The Indo-Aryan Languages. Cambridge University Press. p. 430 (Appendix I). ISBN 978-0-521-29944-2.
^ ^a ^b J. Bloch (1970). Formation of the Marathi Language. Motilal Banarsidass. pp. 33, 180. ISBN 978-81-208-2322-8.
^ https://dsal.uchicago.edu/cgi-bin/app/soas_query.py?qs=mrak%E1%B9%A3a%E1%B9%87a&searchhws=yes&matchtype=exact
^ https://dsal.uchicago.edu/cgi-bin/app/mcgregor_query.py?qs=%E0%A4%B8%E0%A5%82%E0%A4%B0%E0%A4%9C&searchhws=yes&matchtype=exact
^ Masica, Colin P. (1991). The Indo-Aryan Languages. p. 156.
^ Kobayashi, Masato (2004). Historical Phonology of Old Indo-Aryan Consonants. Study of Languages and Cultures of Asia and Africa Monograph Series. Vol. 42. pp. 60–65. ISBN 4-87297-894-3.
^ J. Bloch (1970). Formation of the Marathi Language. Motilal Banarsidass. p. 6. ISBN 978-81-208-2322-8.
^ Bloch (1970), pp. 129, 130.
^ Bloch (1970), pp. 48, 49.
^ https://prakrit.info/prakrit/grammar.html?r=phonology
^ Suniti Kumar Chatterji (1926). The Origin and Development of the Bengali Language. Calcutta University Press. p. 249.
^ Masica, Colin P. (1991). The Indo-Aryan Languages. p. 182.
^ https://aryaman.io/posts/2022-05-03-kk/#:~:text=After%20the%20fragmentation%20of%20Sanskrit,ikā%2D%20(f.).
^ Mishra, Madhusudan (1992). A Grammar of Apabhraṃśa. Vidhyanidhi Prakashan. p. 12.
^ Oberlies (2005), p. 5.
^ Mishra (1992), p. 17.
^ Jaroslav Strnad (2013). Morphology and syntax of Old Hindī: edition and analysis of one hundred Kabīr vānī poems from Rājasthān. Brill. p. 191.
^ Thomas Oberlies (2005). A Historical Grammar of Hindi. Leykam. p. 5.
^ Jaroslav Strnad (2013). Morphology and syntax of Old Hindī: edition and analysis of one hundred Kabīr vānī poems from Rājasthān. Brill. p. 162.
^ Jaroslav Strnad (2013). Morphology and syntax of Old Hindī: edition and analysis of one hundred Kabīr vānī poems from Rājasthān. Brill. p. 159.
^ Jaroslav Strnad (2013). Morphology and syntax of Old Hindī: edition and analysis of one hundred Kabīr vānī poems from Rājasthān. Brill. p. 384.
^ Shapiro (2003), p. 260.
^ Shapiro (2003), p. 258.
