= Substratum in Munda languages =

The Munda languages, spoken mainly in eastern and central India (e.g., Santali, Mundari, Ho, Kharia, Sora), belong to the Austroasiatic family, which also includes Khmer and Vietnamese in Southeast Asia. However, the Munda languages and their reconstructed ancestor, proto-Munda, differ significantly from their Southeast Asian relatives, especially in phonology, morphology, and lexicon, likely because of prolonged language contact within the South Asian linguistic area. While areal convergence due to influence from Dravidian and Indo-Aryan languages is highly visible and not a matter of debate, the Munda languages and proto-Munda also possess some unusual typological features and lexical roots that cannot be decisively attributed to Austroasiatic, Indo-Aryan, or Dravidian. Scholars tentatively believe that these features and roots derive from an extinct language or group of languages spoken in South Asia before the Munda, Indo-Aryan, and Dravidian migrations. Some linguists propose that the Munda languages may belong to a 'subtype' of a peripheral South Asian areal complex that is distinct from the mainstream Indo-Aryan-Dravidian linguistic sphere.

==Evidence==
Linguists have worked on the reconstruction of proto-Munda for decades, among them Pinnow (1959), Zide & Zide (1976), Anderson & Zide (2001), Donegan & Stampe, and Sidwell & Rau (2015, 2019). Their work shows that the Munda languages are structurally and typologically atypical and divergent from their ancestor, proto-Austroasiatic. In "The Munda Maritime Hypothesis", Sidwell & Rau (2019) hypothesize that after arriving in the Mahanadi River Delta region, in the modern-day states of Odisha and West Bengal, around 2,000 to 1,500 BCE, the proto-Munda speakers might have encountered a local South Asian population who spoke a distinct language or languages. The local inhabitants in turn adopted the Austroasiatic speech and lexicon of the new immigrants while retaining some of their own phonological and lexical traits, which led to substantial restructuring, including of many critical features that Munda had inherited from proto-Austroasiatic. The most persuasive evidence for Sidwell & Rau's argument comes from the phonology and lexicon of proto-Munda.

===Typological evidence===
| Feature | Proto-Austroasiatic | Proto-Munda | pre-Munda Eastern Indian substratum |
| Consonants | 24 | 21 | Fewer than 21 (approximate) |
| Cardinal vowels | 9 | 6 | Few |
| Affixation | Prefixing and infixing | Prefixing, infixing, and suffixing | Exclusively suffixing |
| Syllable canon | *C(CC)V(C) | *CV(C) | *CVC |
| Initial clusters | Yes, numerous | No, but reemerged in Gtaʔ and Remo | No |
| Root prosody | Monosyllabic and disyllabic iambs | Mono- or disyllabic with bimoraic constraint and L-H patterns | Polysyllabic with unknown prominence |
| Noun incorporation | Yes | Yes | ? |
| Syntax | Head-initial | Transitional | Head-final |

===Lexical evidence===
The reconstruction of proto-Munda agricultural vocabulary by Zide & Zide (1976) showed that, besides Austroasiatic etyma for rice, millet, and a few agricultural tools, proto-Munda speakers apparently borrowed substrate words, mostly terms for local South Asian crops, flora, and fauna. These words so far lack credible Indo-Aryan, Dravidian, Nihali, Kusunda, or Burushaski sources.

**Proto-Munda agricultural and fauna terms**

| Proto-Austroasiatic | unknown substratum/a |

===Evidence from other languages: Kurmali===
Kurmali is an Indo-Aryan Sadani language spoken by the Kurmi and Kudmi Mahato peoples, mostly in Jharkhand, in areas contiguous with Santali and other Munda lects. Although the language is now Indo-Aryan in nature, it displays some distinctive features, such as lexical items, grammatical markers, and categories, that are found neither in Indo-Aryan nor in Dravidian, nor even in the Munda languages. Paudyal & Peterson (2021) demonstrate that Kurmali has been influenced by Santali; more notably, many common Kurmali words do not seem to be of Indo-Aryan origin, yet are apparently not of Munda or Dravidian origin either. Some of them are listed in the paper:

- ankhai ‘very’
- bĩɽa ‘inspect’
- bɔrɔncɔ ‘instead of’
- ɖula ‘cut (esp. trees)’
- gijɔɽek ‘to laugh’
- gucek ‘open’
- jalaĩ ‘for’
- nuɽ- ‘to eat’
- sakar ‘pile of dust’
- sɔ̃ɽgek ‘sleep’
- tuɽek ‘write’
- ʈhomkek ‘to wait’
- thanau- ‘to see’
- usas ‘easy’

Paudyal & Peterson (2021) note that "Although Kurmali is certainly an Indo-Aryan language, the ethnic background of the Kurmi people is currently a matter of intense debate [...] during the rule of the East India Company (or 'British Raj'), the Kurmi people were classified and listed as tribals (adivasi). However, just a few years after independence in 1950 they were withdrawn from the list of tribes and placed in the category of "non-tribal people"." They further observe that "Further possible influence from Santali (or perhaps the unknown language presumed to have once been spoken by the Kurmis) is found in the demonstrative system. Most Indo-Aryan languages, including the other Sadani languages, have a two-way distinction between ‘proximal’ and ‘distal’ in demonstrative forms. However, Kurmali makes a three-way distinction..." The authors posit that "there are also signs that the ancestors of the present-day Kurmis once spoke a different language which was not Indo-Aryan, Dravidian or Munda."

==Language X hypothesis==
An extinct, unknown language or group of languages that might have been spoken in South Asia prior to the Indo-Aryan migration has been proposed. Colin Masica could not find etymologies from Indo-European, Dravidian, or Munda, or as loans from Persian, for 31 percent of the agricultural and flora terms of Hindi. He proposed an origin in an unknown language "X". Southworth also notes that the flora terms did not come from either Dravidian or Munda. He found only five terms that are shared with Munda, leading to his suggestion that "the presence of other ethnic groups, speaking other languages, must be assumed for the period in question".

==See also==
- Ancient Ancestral South Indians (AASI)
- Peopling of India
- David Reich
- Franklin Southworth
- Harappan language
