= Munda languages =

Munda
- Altname: Mundaic
- Region: Indian subcontinent
- Ethnicity: Munda peoples
- Speakers: 9–11 million
- Date: 2010s
- Familycolor: Austroasiatic
- Child1: North Munda
- Child2: Sora–Gorum
- Child3: Juang
- Child4: Kharia
- Child5: Gutob–Remo
- Child6: Gtaʼ
- Protoname: Proto-Munda
- Iso2: mun
- Iso5: mun
- Glotto: mund1335
- Glottorefname: Mundaic
- Map: Munda languages map.svg
- Mapcaption: Map of areas with significant concentration of Munda speakers

The Munda languages are a group of closely related languages spoken by about eleven million people in India, Bangladesh and Nepal. Historically, they have been called the Kolarian languages. They constitute a branch of the Austroasiatic language family, which means they are distantly related to languages such as the Mon and Khmer languages, to Vietnamese, as well as to minority languages in Thailand and Laos and the minority Mangic languages of South China. Bhumij, Ho, Mundari, and Santali are notable Munda languages.

The family is generally divided into two branches: North Munda, spoken in the Chota Nagpur Plateau of Jharkhand, Chhattisgarh, Bihar, Odisha and West Bengal, as well as in parts of Bangladesh and Nepal, and South Munda, spoken in central Odisha and along the border between Andhra Pradesh and Odisha.

North Munda has twice as many speakers as South Munda. Its most widely spoken member, Santali, is recognised as an official language in India. After Santali, the Mundari and Ho languages rank next in number of speakers, followed by Korku and Sora. The remaining Munda languages are spoken by small isolated groups and are poorly described.

Characteristics of the Munda languages include three numbers (singular, dual and plural), two genders (animate and inanimate), a distinction between inclusive and exclusive first-person plural pronouns, the use of suffixes or auxiliaries to indicate tense, and partial, total, and complex reduplication, as well as switch-reference. The Munda languages are generally synthetic and agglutinating. In Munda sound systems, consonant sequences are infrequent except in the middle of words.

The Munda languages are often interpreted as prime examples of father tongues, since the majority of native speakers of these languages tend to display the Y-chromosome haplogroup of the original linguistic founding population at higher frequencies. On this interpretation, it is the paternal Y-haplogroup, rather than a maternal haplogroup, that signals the linguistic origin.

==Origin==
Many linguists suggest that the Proto-Munda language probably split from Proto-Austroasiatic somewhere in Indochina. Studies by Chaubey et al. (2011), Arunkumar et al. (2015), Metspalu et al. (2018), and Tätte et al. (2019) all show that the Munda branch of the Austroasiatic family was created as the result of a male-biased linguistic intrusion into the Indian subcontinent from Southeast Asia during the Late Neolithic period (Sidwell & Rau 2019, citing Tätte et al. 2019, estimate a date of formation between 3,800 and 2,000 years ago), which carried the paternal lineage O1b1a1a into India from either Meghalaya or the sea. These studies and analyses confirm George van Driem's Munda father tongue hypothesis. Paul Sidwell (2018) suggests they arrived on the coast of modern-day Odisha about 4000–3500 years ago and spread after the Indo-Aryan migration to the region.

Rau and Sidwell (2019), along with Blench (2019), suggest that Pre-Proto-Munda arrived in the Mahanadi River Delta around 1500 BCE from Southeast Asia via a maritime route, rather than overland. The Munda languages subsequently spread up the Mahanadi watershed. Studies from 2021 suggest that Munda languages influenced the Eastern Indo-Aryan languages.

==Classification==
Munda consists of five uncontroversial branches (Korku as an isolate, Remo, Savara, Kherwar, and Kharia-Juang). However, their interrelationship is debated.

===Diffloth (1974)===
The bipartite Diffloth (1974) classification is widely cited:

- Munda
  - North Munda
    - Korku
    - Kherwarian
      - Kherwari branch: Birjia, Koraku
      - Mundari branch: Mundari, Bhumij, Asuri, Koda, Ho, Birhor, Kol, Turi
      - Santal branch: Santali, Mahali
  - South Munda
    - Kharia–Juang: Kharia, Juang
    - Koraput Munda
      - Remo branch: Gata (Gta), Bondo (Remo), Bodo Gadaba (Gutob)
      - Savara branch [Sora–Juray–Gorum]: Parengi (Gorum), Sora (Savara), Juray, Lodhi

===Diffloth (2005)===
Diffloth (2005) retains Koraput (rejected by Anderson, below) but abandons South Munda and places Kharia–Juang with the northern languages:

===Anderson (1999)===
Anderson's 1999 proposal is as follows.

- Munda
  - North Munda
    - Korku
    - Kherwarian: Santali, Mundari
  - South Munda (3 branches)
    - Kharia–Juang: Juang, Kharia
    - Sora–Gorum: Sora, Gorum
    - Gutob–Remo–Gtaʔ
      - Gutob–Remo: Gutob, Remo
      - Gtaʼ: Plains Gtaʔ, Hill Gtaʔ

However, in 2001, Anderson split Juang and Kharia apart from the Juang-Kharia branch and also excluded Gtaʔ from his former Gutob–Remo–Gtaʔ branch. Thus, his 2001 proposal included five branches for South Munda.

===Anderson (2001)===
Anderson (2001) follows Diffloth (1974) apart from rejecting the validity of Koraput. He proposes instead, on the basis of morphological comparisons, that Proto-South Munda split directly into Diffloth's three daughter groups, Kharia–Juang, Sora–Gorum (Savara), and Gutob–Remo–Gtaʼ (Remo).

His South Munda branch contains the following five branches, but the North Munda branch is the same as those of Diffloth (1974) and Anderson (1999).

- Note: "↔" = shares certain innovative isoglosses (structural, lexical). In Austronesian and Papuan linguistics, this has been called a "linkage" by Malcolm Ross.

===Sidwell (2015)===
Paul Sidwell (2015:197) considers Munda to consist of 6 coordinate branches, and does not accept South Munda as a unified subgroup.

- Munda
  - North Munda
    - Korku
    - Kherwarian (Santali, Munda)
  - Sora–Gorum
  - Juang
  - Kharia
  - Gutob–Remo
  - Gtaʼ
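The tree above can also be written down as a small data structure, which makes the "six coordinate branches" claim easy to check mechanically. The following is only an illustrative sketch: the nested-dictionary layout and the helper names `coordinate_branches` and `find_path` are assumptions made for this example and do not come from Sidwell (2015).

```python
# Sidwell's (2015) proposal as listed above, encoded as a nested dict.
# Assumption: leaves are empty dicts; the layout itself is illustrative only.
SIDWELL_2015 = {
    "Munda": {
        "North Munda": {"Korku": {}, "Kherwarian": {}},
        "Sora–Gorum": {},
        "Juang": {},
        "Kharia": {},
        "Gutob–Remo": {},
        "Gtaʼ": {},
    }
}


def coordinate_branches(tree, root="Munda"):
    """Return the names of the branches directly below the root."""
    return list(tree[root])


def find_path(tree, target, path=()):
    """Return the path from the top of the tree to `target`, or None."""
    for name, children in tree.items():
        here = path + (name,)
        if name == target:
            return here
        found = find_path(children, target, here)
        if found is not None:
            return found
    return None


if __name__ == "__main__":
    print(coordinate_branches(SIDWELL_2015))
    print(find_path(SIDWELL_2015, "Korku"))
```

In this encoding there is no "South Munda" node at all, which mirrors the statement that Sidwell does not accept South Munda as a unified subgroup.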

==Phonology==
===Consonants, vowels, and syllable===
The Munda languages share similar sets of phonemes with the regional languages of their respective areas. What may stand out in South Asia are the inherited Austroasiatic "checked" (pre-glottalised) stops and nasalised final consonants found in some Munda languages, such as Mundari (e.g. ub ("hair") is realised as [uˀb̥ᵐ]) and Kharia (e.g. oreˀdʒ ("ox") is realised as [ɔrɛˀɟ˺ⁿ]). One key feature of Munda consonant systems is the lexical distinction between dental and retroflex stops (/t̪/ vs /ʈ/, /d̪/ vs /ɖ/), which exists in most Munda languages except Sora and Gorum and reflects a general characteristic of South Asian (Indosphere) phonology. Because of South Asian areal convergence, Munda languages generally have fewer vowels (between 5 and 10) than their Eastern Austroasiatic relatives. Additionally, Sora has glottalised vowels. Like other Austroasiatic languages, the Munda languages make extensive use of diphthongs and triphthongs. Longer vowel sequences can also be found, an extreme example being Santali kɔeaeae, meaning ‘he will ask for him’. Most Munda languages have registers but lack tones, with the exception of Korku, which has acquired two contrastive tones within the South Asian linguistic area: an unmarked high and a marked low. The general syllable shape is (C)V(C), and the preferred structure for disyllables is CVCV. South Munda displays a tendency toward initial clusters, a CCVC word shape, and diphthong reflexes, best exemplified by Gtaʔ.

As stated above, tonogenesis in Korku and the continuing CCVC/sesquisyllabic development in Gtaʔ, both of which unfolded inside the South Asian linguistic area, seem to be unrelated to contact-driven restructuring in the subcontinent. It is also unclear whether they were directly connected to areal convergences in the Eastern Austroasiatic languages. Munda word shape is governed by a general phonotactic phenomenon called the bimoraic constraint, which requires free-standing nominal stems to be disyllabic or to carry weight on the stressed syllable; that is, monosyllabic free forms of nouns must be expanded so that they remain heavy (Anderson & Zide 2001). See the Vocabulary section for comparison.
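The (C)V(C) syllable template and the preferred CVCV disyllable shape can be expressed as regular expressions. The sketch below is a simplification under stated assumptions: the segment classes `VOWELS` and `CONSONANTS` are a hypothetical, reduced sample (one character per segment, no affricates, length or glottalisation marks), and the function names are invented for illustration.

```python
import re

# Assumption: a reduced, one-character-per-segment sample of the inventories;
# real Munda inventories are larger and include multi-character segments.
VOWELS = "aiueo"
CONSONANTS = "pbtdkgmnrlshjwʔŋɲʈɖɽ"

# (C)V(C): optional onset, obligatory vowel nucleus, optional coda.
SYLLABLE = re.compile(f"[{CONSONANTS}]?[{VOWELS}][{CONSONANTS}]?")

# CVCV: the preferred disyllabic shape mentioned in the text.
DISYLLABLE = re.compile(f"([{CONSONANTS}][{VOWELS}]){{2}}")


def is_cvc_syllable(form):
    """True if the whole form fits the (C)V(C) template."""
    return SYLLABLE.fullmatch(form) is not None


def is_preferred_disyllable(form):
    """True if the whole form has the CVCV shape."""
    return DISYLLABLE.fullmatch(form) is not None


if __name__ == "__main__":
    for form in ["hon", "ti", "daʔ", "tti", "bana", "seta"]:
        print(form, is_cvc_syllable(form), is_preferred_disyllable(form))
```

Forms with an initial cluster, such as Gtaʔ tti 'hand' from the vocabulary table, fall outside the (C)V(C) template, in line with the remark that South Munda tends toward initial clusters.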

===Munda phonemes===
The following table compiles the consonant and vowel inventories of several Munda languages, transcribed in the International Phonetic Alphabet.

| | Plosives | Retroflex Stops | Affricates | Fricatives | Nasals | Rhotics | Laterals | Glides | Vowels |
| Santali | p b t d k ɡ ʔ | ʈ ɖ | t͡ʃ d͡ʒ | s h | m n ɲ ŋ | r ɽ | l | w j | a i u o e ɔ ɛ ə |
| Ho | p b t d k ɡ ʔ | ʈ ɖ | t͡ʃ d͡ʒ | s h | m n [ɲ] ŋ | r | l | w j | a i u o e (ɔ) (ɛ) |
| Mundari | p b t d k ɡ ʔ | ʈ ɖ | t͡ʃ d͡ʒ | s h | m n ɳ ɲ ŋ | r ɽ | l | w j | a i u o e |
| Keraʔ Mundari | p b t d k ɡ ʔ | ʈ ɖ | t͡ʃ d͡ʒ | s h | m n ɲ ŋ | r ɽ | l | w j | a i u o e |
| Asuri | p b t d k ɡ | ʈ ɖ | t͡ʃ d͡ʒ | s h | m n ŋ | r ɽ | l | w j | a i u o e |
| Kɔɖa | p b t d k ɡ ʔ | ʈ ɖ | t͡ʃ d͡ʒ | ʃ h | m n ŋ | r | l | | a i u ɔ ɛ |
| Turi | p b t d k ɡ ʔ | ʈ ɖ | t͡ʃ d͡ʒ | s h | m n ɳ ɲ ŋ | ɾ ɽ | l | ʋ j | a i u ɔ ɛ ə |
| Korku | p b t d k ɡ ʔ | ʈ ɖ | t͡ʃ d͡ʒ | v s h | m n ɳ ɲ ŋ | r ɽ | l | w j | a i u o e |
| Kharia | p b t d k ɡ ʔ | ʈ ɖ | t͡ʃ d͡ʒ | f v s h | m n ɲ ŋ | r ɽ | l | w j | a i u o e |
| Juang | p b t d k ɡ | ʈ ɖ | t͡ʃ d͡ʒ | s | m n ɳ ɲ ŋ | r | l ɭ | j | a i u o e ɔ |
| Sora | p b t d k ɡ ʔ | | t͡ʃ d͡ʒ | s z | m n ɲ ŋ | r ɽ | l | w j | a i u o e ɔ ɛ ə ɨ |
| Gorum | p b t d k ɡ ʔ | ʈ ɖ | | s z | m n ŋ | r ɽ | l | j | a i u e ɔ |
| Remo | p b t d k ɡ ʔ | ʈ ɖ | t͡s d͡z t͡ʃ d͡ʒ | v s z | m n ɳ ɲ ŋ | r ɽ | l | w j | a i u o e |
| Gutob | p b t d k ɡ ʔ | ʈ ɖ | t͡s d͡z t͡ʃ d͡ʒ | s z h | m n ɲ ŋ | r ɽ | l | j | a i u o e |
| Gtaʔ | p b t d k ɡ ʔ | ʈ ɖ | t͡ʃ d͡ʒ | s h | m n ŋ | r ɽ | l | w j | a i u o e æ/ɛ (ɨ) |

===Word prominence===
Donegan & Stampe posited the overarching assumption that all Munda languages have completely redesigned their word prosodic structure, from the Proto-Austroasiatic pattern of rising intonation, iambic rhythm, reduced vowels, and sesquisyllabic structure to the Indic norms of trochaic, falling rhythm, stable or assimilationist consonants, and harmonised vowels. On this view, they differ from Eastern Austroasiatic languages at almost every level. Critics of Donegan & Stampe have pointed out that the overall picture appears much more complicated and diverse and that their generalisations are not supported by instrumental data from the various Munda languages. Kharia has been described as showing a rising contour in monosyllables and second-syllable prominence in content words; even the presence of clitics and affixes does not drive Kharia word prosodic structure toward a trochaic, falling system. Mundari is reported to have final-syllable stress in all but CVC.CV stems. Horo (2017) and others found that Sora disyllables are always iambic, with a reduced vowel space in the first syllable and prominence on the second syllable. Even CV.CCə words show final-syllable prominence. The vowels of Sora first syllables are centralised, whereas vowels in second syllables are more representative of the canonical vowel space.

Santali prosody has been described in these terms: "stress is always released in the second syllable of the word regardless of whether it is an open or a closed syllable". This analysis has been confirmed by acoustic data, which clearly show that the second syllable in Santali is always the prominent syllable, with greater intensity and a rising contour.

In Korku, the final syllable is reported to be heavier than the initial syllable, and within a disyllable, stress is preferentially placed on the final syllable. The analyses inferred from databases show that, despite some variation, most Munda prominence alignments are in line with other Austroasiatic languages, with predictable final-syllable prominence in the prosodic word. Again, Donegan & Stampe's claim of rhythmic holism does not conform with the data presented by individual Munda languages.

==Morphology==
Morphologically, both the North and South Munda subgroups concentrate their marking on the head, that is, the verb, and so are primarily head-marking, in contrast to the Indo-European and Dravidian languages, which are mainly dependent-marking. As a result, nominal morphology is less complex than verbal morphology. Case markers on nominals that show syntactic alignment (nominative-accusative or ergative-absolutive) are largely absent or not systematically developed in the Munda languages, except in Korku. The relation between subject and object in a clause is conveyed mainly through verbal referent indexation and word order. At the clause/sentence level, Munda languages are head-final, but they are internally head-first in referent indexation, compounds, and noun-incorporating verb complexes.

Munda head-first, bimoraic constraint-free noun incorporation is also found in Khasian, Nicobaric, and other Mon-Khmer languages. In word derivation, besides their own innovative methods, the Munda languages maintain Austroasiatic methods in forms of reduplication, compounding, and derivational infixation and prefixation.

One unusual characteristic that appears to be pervasive among the Munda languages is lexical flexibility: a large part, or almost all, of their lexicons is precategorial, i.e. lexically underspecified for categories such as noun, verb, and adjective. Thus, these languages may rely more on syntax and on affixes and clitics to distinguish parts of speech. Pinnow summarised the issue as early as the 1960s.

The following elicited examples from Simdega Kharia illustrate the problem:

Lexical flexibility is extremely prominent in North Munda and Kharia, where the lexicons may contain only an open class of contentives, whereas in the other South Munda languages the phenomenon tends to be much weaker.

===North Munda===
The North Munda subgroup is split between Korku and the 14 Kherwarian languages.

====Kherwarian languages====
Kherwarian is a large language continuum whose speakers extend west to east from the Indian state of Uttar Pradesh to Assam, and north to south from Nepal to Odisha. It comprises fourteen languages: Asuri, Birhor, Bhumij, Koda, Ho, Korwa (Korowa), Mundari, Mahali, Santali, Turi, Agariya, Bijori, Koraku, and Karmali, with the total number of speakers surpassing ten million (2011 census). The Kherwarian languages are often highlighted for their elaborate templatic and pronominalised predicate structures, which are so pervasive that the verb must encode tense–aspect–mood, valency, voice, possession, transitivity, and a clear distinction between exclusive and inclusive first persons, and must index two arguments, including external arguments such as possessors.

| Kherwarian languages | Examples |
| Santali | |
| Mundari | |
| Ho | |
| Asuri | |
| Bhumij | |
| Koda | |
| Korwa | |
| Turi | |
| Birhor | |

Noun incorporation is often described as an ancestral Munda morphological feature and is essential to the grammar of South Munda languages such as Sora, but the Kherwarian languages appear to have lost noun incorporation altogether. Nevertheless, rare instances of noun incorporation may be found in some archaic Kherwarian registers and oral literature.

====Korku====
Unlike the Kherwarian languages, with their complex verbal morphology, Korku verbs are moderately simple, with a modest amount of synthesis. Korku lacks person/number indexing of subject(s)/actor (except third persons of locative copulas and nominal predicates in the locative case) and independent present/future tense markers. Korku present/future tenses rely on the finitising suffix -bà. Present or future tense negation can be located in preverbs or postverbs, but past tense negation is marked by the suffix -ᶑùn.

Many Korku auxiliary verbs are borrowed from Indo-Aryan. The auxiliary predicate takes tense–aspect–mood, voice, and finitising suffixes for the verb. An example is ghaʈa-, which means 'to manage to, to find a way to' and serves as the acquisitive.

===South Munda===
Compared to North Munda languages, South Munda languages are even more divergent and have fewer shared morphological traits. Even the classification of Munda languages is controversial, and South Munda does not seem to exist as a valid taxon. However, South Munda languages retain many notable characteristics of the original Proto-Munda, such as prefix slots and scope-ordering of referent indexation, and so they represent the less restructured morphology of Munda and reflect older Proto-Munda and Proto-Austroasiatic structures.

====Kharia====
In Kharia, subject markers index not only dual/plural exclusive/inclusive but also honorific status. Objects are not marked in the verb but instead by the oblique case: -te.

There is a reduplicated free-standing form of finite verbs that behaves differently from the simple verb stem. In the predicate, the reduplicated free-standing form never marks tense–aspect–mood or person. As a result, the free-standing form is used in subordination, in an attributive function corresponding more or less to relative clauses. The infinitive verb form is marked by =na. The infinitive can also serve as a nominaliser: jib=na=te ‘touching’.

Non-finite class

| | Simple verb root | Free-standing form |
| live | borol | borol |
| open | ruʔ | ruʔruʔ |
| see | yo | yoyo |

As in Hindi and Sadani, Kharia has calqued a marker for sequential converbs (conjunctive participles), kon (derived from ikon, ‘do’). These converbs denote the completion of one action before another begins.

The negation particle um attaches to, or fuses with, the person/number/honorific marking of the subject argument.

====Juang====
Juang exhibits nominative-accusative alignment with unmarked subject/agents and marked objects or patients.

In Juang, a pro-drop language, verbs can index both core arguments of a transitive predicate, but they do not do so frequently. If the arguments are not omitted, referent indexation is largely optional. Juang has a fairly complex tense–aspect–mood system, which is often divided into two sets: I for transitive verbs and II for intransitive verbs. The verb "be" is usually omitted in the present tense and in sentences with a predicate adjective.

There are two types of negation markers. Pronominal negation markers are specific for person/number of subject or object arguments. General negation markers such as -jena make up for the lack of a first-person singular negative. Negatives are ambifixative but usually precede the verb stem. There are double negations: combinations of two negatives. The negated verb may reduplicate itself.

Noun incorporation is fossilised in lexical compounds and in expressions such as body-part nouns combined with the verb "wash". Note that the head precedes the incorporated object, as opposed to the head-final position in normal clauses.

====Gtaʔ-Remo-Gutob====
The southernmost Gtaʔ and Remo-Gutob subgroups of South Munda exhibit significant morphological convergence towards Dravidian languages. Auxiliary verb constructions are heavily employed. Doubly inflected auxiliary verb constructions are common in Gutob and Gorum, which reflects Dravidian influence. Gtaʔ-Remo-Gutob have apparently either lost object indexation altogether or never developed it. Thus, they can only employ the dependent-marking strategy in differential argument marking. Examples from each language:

1. Remo (Anderson, field notes)

2. Gutob

3. Hill Gtaʔ (Anderson, field notes)

Negation in Gutob is the most complex among the Munda languages. Like other Munda languages, Gtaʔ-Remo-Gutob have lexical noun incorporation. Like Juang, Gtaʔ retains some instances of unproductive incorporation of body parts into the verb "wash", which may fit type II incorporation in Mithun's (1984) typology.

====Sora-Gorum====
The Sora-Gorum languages consist of Sora, Gorum, and the lesser-known Juray. They display many features that are considered archaic and can be dated to Proto-Munda. In mainstream South Asian language families such as Indo-Aryan and Dravidian, the latter of which is exclusively suffixing, prefixes and infixes are unusual, whereas they are quite common in Austroasiatic languages; Sora-Gorum has a prefix domain that can host several pre-stem markers. The indexation paradigm in Sora and Gorum preserves the fullest form of Proto-Munda predicate structure and syntax. In practice, Sora is inclined to index only one argument. Within a transitive predicate, the object argument is ranked higher than the subject, and pronouns are required.

Gorum:

Sora:

In Sora, noun incorporation is a valency-reducing device, close to what Mithun describes as type III incorporation. Each noun has a combining form (CF), a compact, compressed monosyllabic form of the free-standing noun that has been stripped of its functional morphology (weak suppletion) and does not adhere to the bimoraic constraint. Only CFs are allowed in compounds with the verb stem. The resulting verb-noun incorporated compound is syntactically distinct from phrases. Unlike in North Munda, where it is restricted to oral literature, noun incorporation in Sora is pervasive in daily conversation: all nouns other than loanwords have a possible CF, which allows the creation of sequences of complex verb phrases.

While the most salient effect of object noun incorporation in most polysynthetic languages is to narrow the scope of the verb and to convert transitive verbs into intransitive ones, incorporation of the transitive subject/agent is considered atypical and occupies the lowest position of the hierarchy. Some linguists therefore once considered the incorporation of transitive subjects to be theoretically impossible. Apart from Sora, few attested languages permit this type of incorporation; they include some Athabaskan languages such as Koyukon and South Slavey.

==Munda lexicon and lexical relation with other Indian language families==
Despite some influence from neighbouring languages, the Munda languages generally maintain a solid Austroasiatic and Munda base vocabulary. The most extreme case is Sora, which has no foreign phonemes. Agriculture-related words from Proto-Austroasiatic are widely shared (Zide & Zide 1976). Words for domesticated animal and plant species such as dog, millet, chicken, goat, pig, and rice are shared or semantically alternated. There are even specific terms for husked uncooked rice, cooked rice, and the rice plant, as well as shared words used in rice production and processing, such as 'mortar', 'pestle', 'paddy', 'sow', and 'grind/ground'. The majority of loanwords from Indo-Aryan into Munda are quite recent and mostly come from Hindi. Southern languages such as Gutob have received considerable Dravidian lexical influence. A very small number of lexemes seem to be shared between Munda and Tibeto-Burman, probably reflecting earlier contact between the two groups.

Careful analysis has led to the rejection of hundreds of the non-Indo-European words in Vedic Sanskrit that Kuiper (1948) attributed to Munda. There is a surprising absence of ancient Sanskrit and medieval Indian borrowings of animal and plant names from Munda. Scholars believe that the Munda tribes typically occupied a marginalised and lowly socioeconomic position in the Hinduized society of Vedic South Asia, or did not participate in the Hindu caste system and had barely any contact with Hindus at all. Witzel, and also Southworth (2005), proposed that the early non-Indo-European words with the prefixes k-, ka-, ku-, cər- in Vedic Sanskrit belonged to a hypothetical 'Para-Munda substratum', which they believed to be part of the Harappan language. That would imply that Austroasiatic speakers might have penetrated as far as the Panjab and Afghanistan in the early 2nd millennium BC. However, Osada (2009) refuted Witzel and considered that those words might in fact have been Dravidian compounds.

===Vocabulary===
Munda basic words

| gloss | Santali | Mundari | Ho | Bhumij | Korwa | Korku | Kharia | Juang | Sora | Gorum | Remo | Gutob | Gtaʔ |
| "hand" | ti | tīi | tī | ti | tiʔi: | ʈi | tiʔ | iti | si:ʔ | siʔ | titi | titi | tti, nti |
| "foot" | janga | janga | – | janga | dʒaŋg | nanga | -dʒuŋ | idʒiɲ/ŋ | dʒe:ˀŋ | zḭŋ | tiksuŋ | susuŋ | nco |
| "eye" | mẽ̠t' | med' | meɖ | med | meɖ | med | moˀɖ | ɛmɔɖ | mo:ˀd/mad | maˀd | moʔ | moʔ | mwaʔ |
| "water" | daˀk | da: | daʔ | daʔ | da:ʔ | ɖa | daʔ | dag | da:ʔ | ɖaʔ | dak' | ɖaʔ | nɖiaʔ |
| "child" | hon | hon | hon | hon | hon | kon | konon | kɔn | oˀo:n | aŋon | ɔ̃ʔɔ̃ | oʔn | ūhuŋo |
| "bear" | bana | bana | bana | bana | – | bana | bane/ai | banae | kəmbud | kibud | gibɛ | gubɔn | gbɛ |
| "tiger" | kul | kula: | kula | kula | ku:l | kula | kiɽoʔ | kiɭog | kɨna | kulaʔ | kukusa | gikil, kilɔ | nku |
| "dog" | seta | seta | se:ta | seta | sɛit̪a | sita | soloʔ | selog/ˀk | kənsod | kusɔˀd | gusɔd | gusɔʔ | gsuʔ |

==Distribution==
| Language name | Number of speakers (2011) | Location |
| Korwa | 28,400 | Chhattisgarh, Jharkhand |
| Birjia | 25,000 | Jharkhand, West Bengal |
| Mundari (inc. Bhumij) | 1,600,000 | Jharkhand, Odisha, Bihar |
| Asur | 7,000 | Jharkhand, Chhattisgarh, Odisha |
| Ho | 1,400,000 | Jharkhand, Odisha, West Bengal |
| Birhor | 2,000 | Jharkhand |
| Santali | 7,400,000 | Jharkhand, West Bengal, Odisha, Bihar, Assam, Bangladesh, Nepal |
| Turi | 2,000 | Jharkhand |
| Korku | 727,000 | Madhya Pradesh, Maharashtra |
| Kharia | 298,000 | Odisha, Jharkhand, Chhattisgarh |
| Juang | 30,400 | Odisha |
| Gtaʼ | 4,500 | Odisha |
| Bonda | 9,000 | Odisha |
| Gutob | 10,000 | Odisha, Andhra Pradesh |
| Gorum | 20 | Odisha, Andhra Pradesh |
| Sora | 410,000 | Odisha, Andhra Pradesh |
| Juray | 25,000 | Odisha |
| Lodhi | 25,000 | Odisha, West Bengal |
| Koda | 47,300 | West Bengal, Odisha, Bangladesh |
| Kol | 1,600 | West Bengal, Jharkhand, Bangladesh |
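The figures in the table can be cross-checked against the statement in the Kherwarian section that Kherwarian speakers surpass ten million. The sketch below sums the relevant rows. The figures are copied from the table; the assignment of rows to Kherwarian is an assumption pieced together from the language lists given in the Classification and Kherwarian sections, and the names `SPEAKERS_2011`, `KHERWARIAN_ROWS` and `total_speakers` are invented for this example.

```python
# Speaker figures (2011) copied from the distribution table above.
SPEAKERS_2011 = {
    "Korwa": 28_400, "Birjia": 25_000, "Mundari (inc. Bhumij)": 1_600_000,
    "Asur": 7_000, "Ho": 1_400_000, "Birhor": 2_000, "Santali": 7_400_000,
    "Turi": 2_000, "Korku": 727_000, "Kharia": 298_000, "Juang": 30_400,
    "Gtaʼ": 4_500, "Bonda": 9_000, "Gutob": 10_000, "Gorum": 20,
    "Sora": 410_000, "Juray": 25_000, "Lodhi": 25_000, "Koda": 47_300,
    "Kol": 1_600,
}

# Assumption: these rows are treated as Kherwarian, following the language
# lists in the Classification and Kherwarian sections of this article.
KHERWARIAN_ROWS = [
    "Korwa", "Birjia", "Mundari (inc. Bhumij)", "Asur", "Ho",
    "Birhor", "Santali", "Turi", "Koda", "Kol",
]


def total_speakers(rows):
    """Sum the 2011 speaker figures for the given table rows."""
    return sum(SPEAKERS_2011[name] for name in rows)


if __name__ == "__main__":
    print(total_speakers(KHERWARIAN_ROWS))
```

Under this grouping the Kherwarian rows add up to 10,513,300, which is consistent with the "surpassing ten million" figure; a different assignment of the smaller languages would change the total only marginally.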

==Reconstruction==

The proto-forms have been reconstructed by Sidwell & Rau (2015: 319, 340–363). Proto-Munda reconstruction has since been revised and improved by Rau (2019).

== Writing systems ==
The following alphabets are currently used for Munda languages:

- Mundari Bani (Mundari alphabet)
- Ol Chiki (Santali alphabet)
- Ol Onal (Bhumij alphabet)
- Sorang Sompeng (Sora alphabet)
- Warang Citi (Ho alphabet)

==See also==
- Nihali language
- Munda peoples
