Sotho grammar


  • The orthography used in this and related articles is that of South Africa, not Lesotho. For a discussion of the differences between the two see the notes on Sotho orthography.

This article presents a brief overview of the grammar of the Sotho language and provides links to more detailed articles.


The Sotho language may be described in several ways depending on the aspect being considered.

  • It is an agglutinative language. It constructs whole words by joining together discrete roots and morphemes with specific meanings, and may also modify words by similar processes.
  • Its basic word order is SVO. However, because the verb is marked with the subject and sometimes the object, this order may be changed to emphasise certain parts of the predicate.
  • It is a tonal language; more specifically, a complex grammatical tone language. See Sotho tonology.
  • It has no grammatical case marking on the noun. Nominal roles are indicated by a combination of word order and agreement markers on the verb, with no change to the nouns themselves.
  • It has a complex grammatical gender system, but this does not include natural gender. See Sotho nouns.
  • It has head-first order, though it may be changed for emphasis. If an inflected qualificative is placed before the head, then it is technically a qualificative pronoun.
  • It is a pro-drop language. Verbs may be used without explicitly specifying the subject or the object with substantives (nouns or pronouns).


Bantu languages are agglutinative — words are constructed by combining discrete formatives (a.k.a. "morphemes") according to specific rules, and sentences are constructed by stringing together words according to somewhat less strict rules. Formatives alone cannot constitute words; formatives are the component parts of words.

These formatives may be classed generally into roots, stems, prefixes, concords, suffixes, verbal auxiliaries, enclitics, and proclitics.

Roots are the most basic irreducible elements of words and are immutable (except under purely phonetic changes). Entire words are built from roots by affixing other formatives around the root as appendages;[1] every word (except contractions and compounds) contains exactly one root, from which it derives its most basic meaning (though, technically speaking, the root by itself does not really have any meaning). Roots are the basis of the Sotho parts of speech.

The following words:

  1. ho ruta to teach
  2. ba le rutile they taught you (plural)
  3. re a rutana we teach one another
  4. ha ba le rutisise they do not teach you (plural) properly/intensely
  5. morutehi an academic
  6. thuto education
  7. moithuti learner (lit. "one who teaches herself")

are all formed from the root -rut-.

Although in some cases various phonetic processes may ultimately change the root's form in predictable ways (such as the nasalization in the last two examples above) the root itself is considered to be unchanged.
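
As a rough illustration of how the seven words above could be grouped under one headword, the following Python sketch tests each word for the root -rut- or its nasalized alternant -thut-. The name `shares_root` and the hard-coded alternant table are assumptions made for this example only. A plain substring test is a stand-in for real morphological analysis and would give false matches on a larger vocabulary.

```python
# Hypothetical helper for illustration; not a real morphological analyser.
# The alternants are taken from the examples above: the root -rut- surfaces
# as -thut- under nasalization (thuto, moithuti).
ROOT_ALTERNANTS = {"rut": ("rut", "thut")}

WORDS = [
    "ho ruta", "ba le rutile", "re a rutana", "ha ba le rutisise",
    "morutehi", "thuto", "moithuti",
]


def shares_root(word: str, root: str) -> bool:
    """Return True if any listed surface form of `root` occurs in `word`."""
    forms = ROOT_ALTERNANTS.get(root, (root,))
    return any(form in word for form in forms)


# Collect every example word in which the root (or its alternant) is found.
matches = [w for w in WORDS if shares_root(w, "rut")]
print(matches == WORDS)
```

The alternant table has to be supplied by hand here; predicting the nasalized form from the root would require encoding the phonetic rules themselves, which this sketch does not attempt.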

There can be no doubt that words never emerged simply as roots. The root is a dead thing — the study of roots is primarily to aid the compilation of dictionaries, to further the study of comparative Bantu linguistics, and to help trace the evolution and connections of different languages. Many roots are shared by a wide range of Bantu languages.[2]

Some further examples of roots:

  • -tho (Proto-Bantu *-jîntu) ⇒ motho person (especially a member of a Bantu language speaking culture), botho Ubuntu
  • -itsi (Proto-Bantu *-jîgî) ⇒ metsi water (note the vowel coalescence: class 6 ma- + i ⇒ me-)
  • -rwa (Proto-Bantu *-tua) ⇒ morwa a Khoisan person, Borwa South
  • -j- (Proto-Bantu *-di-) ⇒ ho ja to eat, dijo food, sejeso a magical poison
  • -holo (Proto-Bantu *-kudu) ⇒ -holo large, boholo size, lekgolo one hundred, moholo an older person, moholwane elder brother
  • -rithi ⇒ morithi shade/shadow, serithi shadow of a human being (also their spirit, which becomes one of the ancestors when they die, or dignity/reputation; this is a very important concept in African Traditional Religion)
  • -re (Proto-Bantu *-ti) ⇒ ho re to say
  • -dimo (Proto-Bantu *-dîmu) ⇒ Modimo God (traditionally never used in the plural[3]), Badimo Ancestors (does not exist in the singular), Bodimo African Traditional Religion, ledimo cannibal/ogre, Dimo the name of an ogre character found in many tales
  • -edi (Proto-Bantu *-jedî) ⇒ ngwedi moonlight, kgwedi moon/month
  • -ja (Proto-Bantu *-bua) ⇒ ntja dog
  • -hlano (Proto-Bantu *-caanu) ⇒ -hlano five

Note that although it is often true that the common root of a number of words may be defined as having some inherent meaning, very often the connection between words sharing common roots is tenuous, and this is further evidence that prefix-less noun roots and stems are ultimately meaningless. Roots from a common source help to connect nouns with certain meanings, and often the class prefixes are merely incidental.

  • bosiu night, and tshiu 24-hour day
  • leloko family/lineage/clan, and moloko generation
  • boroko sleep, and dithoko rheum
  • boko brain matter, and moko bone marrow

Figure: Pg. 47 of the 1950 edition of Mabille & Dieterlen's Southern Sotho-English Dictionary, showing terms derived from the verb stem -etsa (do, act, make). Note that this edition of the dictionary uses the Lesotho orthography, modified to remove vowel ambiguity. The dictionary lists close to 70 terms under this single headword.

Stems are not much different from roots, and the difference between them is fairly arbitrary. Though all roots are also stems, stems often include derivational suffixes, which roots never include. Additionally, the ending -a is included in the verb stem but not in the root (if it were truly part of the core root then it would not be replaced in verb derivations and conjugations).

For example, from the verb root -rar- one may derive several words, including the following:

ho rara – to entangle, entwine
morara (nom. 3) – (a bunch of) grapes; the grape plant; any vine or climbing plant
lerara (nom. 5) – a single grape; a berry
ho rarabolla – to solve
ho rarahana (ass. vb.) – to be entangled together
ho rarahanela (app. ass. vb.) – to spiral (intransitive)
ho rarana (recip. vb.) – to entangle each other
mararane (nom. rel.) – entangled, complex, intricate
ho rarela (app. vb.) – to twist, wind; (idiomatic) to wander in speech
ho rarolla (rev. vb.) – to untangle; to solve
tharollo (nom. 9; pl. 10 di-) – solution

and these may all be listed under the same headword in a dictionary.

Note how, in the above example, not only do many of the words have slightly unexpected/expanded meanings, but the form ho rarabolla uses an irregular derivation pattern.

Prefixes are affixes attached to the fronts of words (noun class prefixes are called such by convention, even though bare roots are not independent words). These are distinct from concords, since changing the prefix of a word may radically alter its meaning, while changing the concord attached to a stem does not change that stem's meaning.

Ke lenaneo It is a programme

Concords are similar to prefixes in that they appear before the word stem. Verbs and qualificatives used to describe a noun are brought into agreement with that noun by using the appropriate concords.

There are seven basic types of concords in Sesotho. In addition, there are two immutable prefixes used with verbs that function similarly to concords.

Ba tla e rala They shall design it

Suffixes appear at the ends of words. There are numerous suffixes in Sesotho serving varied functions. For example, verbs may be derived from other verbs through the employment of several verbal suffixes. Diminutives, augmentatives, and locatives may all be derived from nouns through the use of several suffixes. Most suffixes, except the noun locative suffix and verb inflexional suffixes, are derivational and create new stems.

Strictly speaking the final vowel -a in verb stems is a suffix, as it is often regularly replaced by other vowels in the derivation and inflexion of verbs and nouns.

Ha a a bua nyeweng She did not speak at the court trial

Verbal auxiliaries are not to be confused with auxiliary verbs or deficient verbs. They may appear as prefixes or as infixes.[4] Basically, all formatives that may be affixed to the verb root, excluding suffixes and the objectival and subjectival concords, are verbal auxiliaries.

These include prefixes such as ha- used to negate verbs, and infixes such as -ka- used to form potential tenses.

The infix -a- used to form the past subjunctive (not to be confused with the infix -a- used to form the present indicative positive and the perfect indicative negative; and also used as a "focus marker") merges with the subjectival concord resulting in what is often termed the "auxiliary concord."

Ke a tla I am coming
Ha ke no tla I shall not come

Infix verbal auxiliaries may be further divided into simple infixes and verbal infixes. The main difference lies in the fact that, when forming the relative construction (participial sub-mood) of a verbal complex employing the infix, the verbal infixes may be detached from the main verb and carry the -ng suffix with the main verb converted to an infinitive object,[5] while a verb using a simple infix has to carry the suffix itself.

Ba ka bona They might see (simple infix used) ⇒ Ba ka bonang Those who might see
Ba tla bona They shall see (verbal infix used) ⇒ Ba tlang ho bona Those who shall see
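
The contrast between the two kinds of infix can be sketched in Python as follows. This is a minimal illustration that covers only the two infixes shown in the examples above (-ka- and -tla-); the function name `relative_form` and the two lookup sets are assumptions introduced here, and no tonal or phonetic adjustment is modelled.

```python
# Only the two infixes from the examples above are classified here.
SIMPLE_INFIXES = {"ka"}    # the main verb carries the -ng suffix
VERBAL_INFIXES = {"tla"}   # the infix carries -ng; the verb becomes an infinitive


def relative_form(concord: str, infix: str, verb: str) -> str:
    """Build the relative construction for `concord infix verb`."""
    if infix in SIMPLE_INFIXES:
        # Simple infix: the suffix stays on the main verb.
        return f"{concord} {infix} {verb}ng"
    if infix in VERBAL_INFIXES:
        # Verbal infix: detach it, suffix it, and add the infinitive prefix ho.
        return f"{concord} {infix}ng ho {verb}"
    raise ValueError(f"unclassified infix: {infix!r}")


print(relative_form("Ba", "ka", "bona"))   # Ba ka bonang
print(relative_form("Ba", "tla", "bona"))  # Ba tlang ho bona
```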

Enclitics (leaning-on words) are usually suffixed to verbs and convey a definite meaning. They were probably once separate words.

They may be divided into two categories: those that draw forward the stress (as normal suffixes), and those that don't alter the word's stress. The second type may result in words that don't have the stress on the penult (as is usual with Sesotho words).

Ha a sa le yo He is no longer there (stress on the penult)
Thola bo! Please keep quiet! (stress on the antepenultimate syllable)

Proclitics are clitics that appear at the fronts of words. There is only one regular proclitic in Sesotho, le-, which is normally prefixed to nouns, pronouns, qualificatives, and adverbs as a conjunction, to convey the same meaning as English "and" when used between substantives. Some Indo-European languages have a post-clitic with a similar meaning (for example Latin -que[6] and Sanskrit -ca).

It may also be used to express the idea of "together with" and "even."

Ntate le mme My father and mother
Ke kopane le yena I met with her
Le bona ha ba kgolwe Even they do not believe

There are also a number of curious utterances where the proclitic is used to express emphatic negatives.

Le kgale Never (lit. "And a long time")
Le letho Nothing (lit. "And something")
Le ho ka Never (lit. "And to be able")

This is similar to the use of the Latin "et" ("and") to mean "even" or "not", as in the supposed last words of Caesar, "Et tu, Brute?", meaning "Not (or even) you, Brutus?".

The Sesotho word

The Sotho language is spoken conjunctively yet written disjunctively (that is, the spoken phonological words are not the same as the written orthographical words).[7] In the following discussion, the natural conjunctive word division will be indicated by joining the disjunctive elements with the symbol • in the Sesotho and the English translation.

Batho ba•lelapa la•hae ba•a•mo•ahlola People of•family of•his they•judge•him (His family members judge him)
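
The two ways of dividing the same sentence can be made concrete with a short Python sketch. It assumes only the notation just described, in which the symbol • joins disjunctively written elements that belong to one spoken word; the function names are hypothetical.

```python
SEP = "•"  # joins written elements that form a single spoken word


def phonological_words(text: str) -> list[str]:
    """Conjunctive division: split on spaces only, keeping joined elements together."""
    return text.split()


def orthographic_words(text: str) -> list[str]:
    """Disjunctive division: every joined element is written as its own word."""
    return text.replace(SEP, " ").split()


example = "Batho ba•lelapa la•hae ba•a•mo•ahlola"

print(len(phonological_words(example)))   # 4
print(len(orthographic_words(example)))   # 9
print(" ".join(orthographic_words(example)))
```

The same sentence therefore counts as four words when spoken but nine when written in the disjunctive orthography.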

Certain observations about the Sesotho word (and those of many other Bantu languages in general) may be made:

  • Each word has one part of speech, which can usually be determined from the root. Since Sesotho is predominantly prefixing, the root is usually the last morpheme of the word, unless enclitics follow.

Not counting compounds and contractions, the word begins with zero or more proclitics, infixes,[4] and prefixes, followed by a stem, followed by zero or more suffixes (which extend the stem) and enclitics.

For example, in the word Ke•a•le•dumedisa (I•greet•y'all) the stem is the verb stem -dumel(a) (agree) surrounded by the subjectival concord ke- (first person singular), the present definite positive indicative infix marker -a-, the objectival concord -le- (second person plural), and the verb extension -isa (causative, but in this case it gives the idiomatic meaning of "greet").

The phonological interactions can be quite complex:

O•a•mpontsha (He•shows•me) subject concord o- + present indicative positive marker -a- + objectival concord -N- + verb stem -bon(a) (see) + causative extension -isa

Here the formatives are distorted by two instances of nasalization.
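
The word template described above (pre-stem formatives, exactly one stem, then suffixes and enclitics) can be written down as a small Python check. This is only a sketch: the class `Formative`, the slot names, and the decision to count concords among the pre-stem formatives are assumptions made for the example, and the phonetic changes that turn the listed formatives into the surface word are not modelled.

```python
from dataclasses import dataclass

# Slot names are illustrative labels, not established terminology.
PRE_STEM = {"proclitic", "prefix", "concord", "infix"}
POST_STEM = {"suffix", "enclitic"}


@dataclass(frozen=True)
class Formative:
    form: str
    slot: str
    gloss: str


def fits_template(formatives: list[Formative]) -> bool:
    """True if the sequence is pre-stem items, exactly one stem, then post-stem items."""
    slots = [f.slot for f in formatives]
    if slots.count("stem") != 1:
        return False
    i = slots.index("stem")
    before_ok = all(s in PRE_STEM for s in slots[:i])
    after_ok = all(s in POST_STEM for s in slots[i + 1:])
    return before_ok and after_ok


# Analysis of Ke•a•le•dumedisa as given in the text above.
KE_A_LE_DUMEDISA = [
    Formative("ke-", "concord", "subjectival concord (I)"),
    Formative("-a-", "infix", "present indicative positive marker"),
    Formative("-le-", "concord", "objectival concord (y'all)"),
    Formative("-dumel(a)", "stem", "agree"),
    Formative("-isa", "suffix", "causative extension"),
]

print(fits_template(KE_A_LE_DUMEDISA))                  # True
print(fits_template(list(reversed(KE_A_LE_DUMEDISA))))  # False
```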

  • Each word has one main stressed syllable.

No matter how many prefixes, suffixes, enclitics, and proclitics are appended to the word stem the complete word only has one main stressed syllable. This stress is most prominent on the final word in the sentence or "prosodic phrase."[8]

Ha•re•a•kgona ho•mo•eletsa hobane o•ne a•le manganga (We•failed to•advise•him because he•PAST he•COPULATIVE stubborn "he was stubborn")
Re•tla•ya ha o•tjho (We•shall•go if you•

Note the monosyllabic conjunctive ha.

Note that, unlike the Nguni languages, Sesotho does not have rules against juxtaposing strings of vowels:

Ha•a•a•apara (He•is•not•dressed) although the sequence -a•a- (class 1 negative subjectival concord followed by present definite positive indicative marker) is usually pronounced as a long a with a high falling tone, or simply as a short high tone a.

Certain situations may make the word division complex. This can happen with contractions (especially with deficient verb constructions), and in some complex verb conjugations. In all these situations, however, each proper word has exactly one main stressed syllable.

Parts of speech

Each complete Sesotho word belongs to some part of speech.

In form, some parts of speech (adjectives, enumeratives, some relatives, and all verbs) are radical stems, which need affixes to form meaningful words; others (possessives and copulatives) are formed from full words by the employment of certain formatives; the rest (nouns, pronouns, adverbs, ideophones, conjunctives, and interjectives) are complete words themselves, which may or may not be modified with affixes to form new words.

The difference between the four types of qualificatives is merely in the concords used to associate them with the noun or pronoun they qualify. Since the simplest copulatives do not use any verbs whatsoever (zero copula), entire predicative sentences in Sesotho may be formed without the use of verbs.


  1. ^ Bantuists do it with multiple appendages.
  2. ^ Including the root *-ntu whence the name "Bantu languages" comes. Current work on Proto-Bantu has it that no true roots began with prenasalized consonants, and that the form of this root was actually *-jîntu, as in *mu-jîntu and *ba-jîntu.
  3. ^ It is interesting to note that although there has historically always been a general belief among Westerners that African religions are polytheist, the plural of this word, medimo, was specifically invented by Christian missionaries to aid in translating the Bible (which regularly speaks of "gods", a concept foreign to Sesotho ATR). Additionally, the noun is traditionally in class 1, but is used in class 3 by Christians and the Bible. There is not, and has never been, any confusion among Basotho that the class 2 Badimo may be the plural of the class 1 Modimo since, in the same way that Modimo was never used in the plural, Badimo is never used in the singular (an ancestor is referred to as "one of the ancestors").
  4. ^ a b The use of this term in Bantu linguistics means "formatives placed in the middle of a word" and not the more common "formatives placed in the middle of a morpheme." Bantu languages, being agglutinative, construct words by placing affixes around a stem, and if an affix is always placed after other affixes but before the stem (such as in the verbal complex) then it is usually called an "infix."
  5. ^ This is exactly the same as the behaviour of deficient verbs, and it is very likely that these infixes are grammaticalized contractions using originally Group VI deficient verbs. Additionally, in the negative (and sometimes in the positive) these infixes change to a form ending in the vowel o, which obviously comes from some coalescence with the vowel o (in the infinitive prefix ho-) and the vowel of the original deficient verb (/ɛ/ or /ɑ/ in the positive, and /ɪ/ in the negative). A possible (pre-contraction and grammaticalization) example would be:
    (pre-)Proto-Sotho–Tswana *kɪt͡ɬɑ  xʊdʒɑ I come to/shall eat, *xɑkɪt͡ɬɪ  xʊdʒɑ I do not come to/shall not eat,
    which in modern Sesotho appear as
    Ke tla ja, and Ha ke tlo ja
  6. ^ Senatus Populusque Romanus.
  7. ^ This is a common situation in many (written) Bantu languages, as their orthographies were invented by Europeans who spoke isolating languages. Notice how the class 15 prefix ho- is written separated from the verb stem (contrary to how the other class prefixes are indicated) because this is how infinitives are indicated in their languages. IsiZulu and other Nguni languages are written conjunctively, primarily due to the efforts of Doke and others. Consider the following example:
    Ke tla o thusa
    I will help you (I•FUT.+VE.INDIC•you•help)
    This would be Ngizakusiza in isiZulu. The English free morphemes may usually be moved around to make valid statements, with some change in meaning:
    Help you I will
    Will I help you(?)
    But this is absolutely impossible to do with the Sesotho bound morphemes.
    *Thusa o ke tla
    *Tla ke o thusa
    When compared with other word division schemes, the orthographies used to write the non-Nguni South African languages are extremely disjunctive, since many Bantu language orthographies at least write the verbal complex (such as the example above) as a single orthographical word, but may write prefixes, concords, and clitics as separate words.
  8. ^ Some researchers completely reject the notion that those Southern Bantu languages claimed to have word stress really do, and instead view it as phrasal stress (that is, the penultimate syllable in the prosodic phrase — not the word — is stressed). Although it is true that in normal speech it is usually the penultimate syllable of the prosodic phrase that is stressed, the existence of words with irregular stress patterns suggests that, in Sesotho at least, it is not entirely incorrect to say that stress is a lexical property of the word itself, not just the phrase, and that the word's inherent stress pattern is most prominent when the word is phrase-final.

