The Czech orthographic system is diacritic. The caron is added to standard Latin letters for expressing sounds which are foreign to the Latin language (but some digraphs have been kept - ch, dž). The acute accent is used for long vowels.
The Czech orthography is considered the model for many other Slavic languages using the Latin alphabet; the Slovenian and Slovak orthographies as well as Gaj's Latin Alphabet are all based on the Czech.
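The diacritic principle described above maps naturally onto Unicode, where each accented Czech letter can be decomposed into a base Latin letter plus a combining mark. The following is a minimal sketch using Python's standard `unicodedata` module; the helper name `describe` is made up for illustration and is not part of any standard.

```python
import unicodedata

def describe(letter: str) -> list[str]:
    """Names of the code points in the canonical decomposition (NFD) of a letter."""
    return [unicodedata.name(ch) for ch in unicodedata.normalize("NFD", letter)]

# Caron (háček), ring (kroužek) and acute (čárka) each decompose
# into a base Latin letter followed by one combining mark.
for letter in ["č", "ů", "á"]:
    print(letter, describe(letter))
```

Working with the decomposed form is one way to strip or compare diacritics, although most Czech text is stored in the precomposed form.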
The Czech alphabet consists of 42 letters (including the digraph Ch, which is considered a single letter in Czech).
|Á á||dlouhé á|
|É é||dlouhé é|
|Ě ě[n 1]||ije, é s háčkem|
|F f[n 2]||ef|
|G g[n 2]||gé|
|Í í||dlouhé í, dlouhé měkké í|
|Ó ó[n 2]||dlouhé ó|
|Ú ú||dlouhé ú, ú s čárkou|
|Ů ů[n 1]||ů s kroužkem|
|W w||dvojité vé|
|Y y||ypsilon, krátké tvrdé í|
|Ý ý||dlouhé ypsilon, dlouhé tvrdé í|
- [n 1] The letters Ě and Ů are practically never capitalized, because they cannot occur at the beginning of any word. The capital forms are used only in all-caps or small-caps writing, e.g. in newspaper headlines.
- [n 2] The letters F, G, and Ó represent the sounds /f/, /ɡ/, and /oː/, which occur almost exclusively in words and names of foreign origin, except where [f] and [ɡ] arise as allophones of /v/ and /k/. They are now common enough in Czech, however, that few Czechs have problems pronouncing them.
The letters Q and W are used exclusively in foreign words, and are replaced with Kv and V once the word becomes "naturalized"; the digraphs dz and dž are also used mostly for foreign words and do not have a separate place in the alphabet.
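Because ch counts as a single letter while dz and dž do not, counting or sorting the letters of a Czech word differs from counting characters. The sketch below shows one simple way to split a word into alphabet letters. The function name `czech_letters` is hypothetical, and the approach assumes that every written sequence c + h is the digraph.

```python
def czech_letters(word: str) -> list[str]:
    """Split a word into alphabet letters, treating the digraph ch as one letter."""
    letters = []
    i = 0
    while i < len(word):
        # Compare case-insensitively so that Ch and CH are recognized too.
        if word[i:i + 2].lower() == "ch":
            letters.append(word[i:i + 2])
            i += 2
        else:
            # dz and dž fall through here and stay two separate letters.
            letters.append(word[i])
            i += 1
    return letters

print(czech_letters("chyba"))
print(len(czech_letters("Čech")))
```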
Czech orthography is primarily phonemic (rather than phonetic) because an individual grapheme usually corresponds to an individual phoneme (rather than a sound). However, some graphemes and letter groups are remnants of historical phonemes which were used in the past but have since merged with other phonemes. Some changes in the phonology have not been reflected in the orthography.
|ě||/ɛ/, /jɛ/||Marks palatalization of preceding consonant; see usage rules below|
|i||/ɪ/||Palatalizes preceding ⟨d⟩, ⟨t⟩, or ⟨n⟩; see usage rules below|
|í||/iː/||Palatalizes preceding ⟨d⟩, ⟨t⟩, or ⟨n⟩; see usage rules below|
|ó||/oː/||Occurs mostly in words of foreign origin.|
|ú||/uː/||See usage rules below|
|ů||/uː/||See usage rules below|
|y||/ɪ/||See usage rules below|
|ý||/iː/||See usage rules below|
|d||/d/||Represents /ɟ/ before ⟨i í ě⟩; see below|
|f||/f/||Occurs mostly in words of foreign origin.|
|g||/ɡ/||Occurs mostly in words of foreign origin.|
|n||/n/||Represents /ɲ/ before ⟨i í ě⟩; see below|
|t||/t/||Represents /c/ before ⟨i í ě⟩; see below|
|x||/ks/||Occurs mostly in words of foreign origin; pronounced /ɡz/ in words with the prefix 'ex-' before vowels.|
- Unofficial ligatures are sometimes used to transcribe the affricates /ts/, /dz/, /tʃ/, /dʒ/. The current IPA instead uses two separate letters, which can be joined by a tie bar.
- The "long-leg R" ⟨ɼ⟩ is sometimes used, unofficially, to transcribe voiced ⟨ř⟩. This character was withdrawn from the IPA and replaced by the lower-case r with the "up-tack" diacritic, which denotes a raised alveolar trill.
All the obstruent consonants are subject to voicing (before voiced obstruents except ⟨v⟩) or devoicing (before voiceless consonants and at the end of words); spelling in these cases is morphophonemic (i.e. the morpheme has the same spelling as before a vowel). An exception is the cluster ⟨sh⟩, in which the /s/ is voiced to /z/ only in Moravian dialects, while in Bohemia the /ɦ/ is devoiced to /x/ instead (e.g. shodit /sxoɟɪt/, in Moravia /zɦoɟɪt/). Devoicing /ɦ/ changes its articulation place: it becomes [x]. After unvoiced consonants ⟨ř⟩ is devoiced. Written voiced / voiceless counterparts are kept according to the etymology of the word, e.g. odpadnout [ˈotpadnoʊ̯t] (to fall away) - od- is a prefix; written /d/ is devoiced here because of the following voiceless /p/.
For historical reasons, the consonant [g] is written k in Czech words like kde ('where', < Proto-Slavic *kъdě) or kdo ('who', < Proto-Slavic *kъto). This is because the letter g was historically used for the consonant [j]. The original Slavic phoneme /g/ changed into /h/ in the Old-Czech period. Thus, /g/ is not a separate phoneme (with a corresponding grapheme) in words of domestic origin; it occurs only in foreign words (e.g. graf, gram, etc.).
- led [ˈlɛt] – ledy [ˈlɛdɪ] (ice – ices)
- let [ˈlɛt] – lety [ˈlɛtɪ] (flight – flights)
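The word-final devoicing shown in led/let can be approximated on the level of spelling. The sketch below is a deliberate simplification: the function name and the table of voiced/voiceless pairs are assumptions made for illustration, only the last letter is considered, and assimilation inside consonant clusters is ignored.

```python
# Voiced obstruent letters and their voiceless counterparts (assumed pairs).
# h -> ch reflects the change of /ɦ/ to [x] described in the text.
DEVOICE = {"b": "p", "d": "t", "ď": "ť", "g": "k",
           "v": "f", "z": "s", "ž": "š", "h": "ch"}

def final_devoice(word: str) -> str:
    """Return a rough 'as pronounced' spelling with the final obstruent devoiced."""
    # The digraph ch is already voiceless; its h must not be treated as /ɦ/.
    if not word or word.endswith("ch"):
        return word
    last = word[-1]
    return word[:-1] + DEVOICE.get(last, last)

print(final_devoice("led"))   # sounds like "let"
print(final_devoice("ledy"))  # final vowel, so nothing changes
```

The output illustrates why the spelling is called morphophonemic: led and let sound alike, but the written d is kept because it resurfaces before a vowel in ledy.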
"Soft" I and "Hard" Y
The letters i/y and í/ý are pronounced [ɪ] and [iː], respectively. Y was originally pronounced [ɨ] as in contemporary Polish. However, in the 14th century, this difference in standard pronunciation disappeared (it has been preserved in some dialects in Ostrava and its surroundings). In words of domestic origin, "soft" i is written only after "soft" or "ambiguous" consonants while "hard" y follows "hard" or "ambiguous" consonants.
|Soft||ž, š, č, ř, c, j, ď, ť, ň|
|Ambiguous||b, f, l, m, p, s, v, z|
|Hard||h, ch, k, r, d, t, n|
The sounds [ɟɪ/ɟiː, cɪ/ciː, ɲɪ/ɲiː] are written di/dí, ti/tí and ni/ní instead of ďi/ďí, ťi/ťí and ňi/ňí, e.g. čeština [ˈt͡ʃɛʃcɪna]. The sounds [dɪ/diː, tɪ/tiː, nɪ/niː] are written dy/dý, ty/tý and ny/ný, respectively.
In words of foreign origin, di, ti, ni are pronounced [dɪ, tɪ, nɪ], that is, like inherited dy, ty, ny, e.g. diktát (dictation).
Ambiguous consonants can be followed by both i and y. In some cases, they distinguish different meanings of words, e.g. být (to be) vs. bít (to beat), mýt (to wash) vs. mít (to have). At school pupils must memorise word roots and prefixes where y is written; i is written in other cases.
Historically the letter c was hard, but this changed in the 19th century. However, in some words it is still followed by the letter y: tác (plate) – tácy (plates).
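The interaction between consonant class and the choice of i or y can be summarized in a small lookup. This is only a sketch for words of domestic origin: the function name `spell_with_i` is hypothetical, the consonant is given as a sound (so ď, ť, ň are passed explicitly), and lexical exceptions such as tácy are not modeled.

```python
SOFT = {"ž", "š", "č", "ř", "c", "j", "ď", "ť", "ň"}
AMBIGUOUS = {"b", "f", "l", "m", "p", "s", "v", "z"}
HARD = {"h", "ch", "k", "r", "d", "t", "n"}
# The palatals ď, ť, ň are spelled with plain d, t, n before i/í.
PLAIN = {"ď": "d", "ť": "t", "ň": "n"}

def spell_with_i(consonant: str, long: bool = False) -> list[str]:
    """Possible spellings of consonant + [ɪ] (or [iː] when long=True)."""
    soft_vowel, hard_vowel = ("í", "ý") if long else ("i", "y")
    if consonant in SOFT:
        return [PLAIN.get(consonant, consonant) + soft_vowel]
    if consonant in HARD:
        return [consonant + hard_vowel]
    if consonant in AMBIGUOUS:
        # Both spellings exist; the right one must be memorized per word.
        return [consonant + soft_vowel, consonant + hard_vowel]
    raise ValueError(f"not a Czech consonant in this table: {consonant!r}")

print(spell_with_i("ď"))  # palatal stop, written with plain d
print(spell_with_i("b"))  # ambiguous consonant, two candidates
```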
The letter ě can never appear in the initial position, and its pronunciation depends on the preceding consonant:
- [ɟɛ, cɛ, ɲɛ] are written dě, tě, ně instead of ďe, ťe, ňe (analogous to di, ti, ni).
- Bě, pě, vě, fě are written instead of bje, pje, vje, fje. In some words, however, bje and vje are written because -je- is preceded by the prefix v- or ob-: vjezd (entry, drive-in), objem (volume).
- [mɲɛ] is written mě instead of mňe, except for morphological reasons in some words (jemný, soft -> jemně, softly). The only exceptions are the homophonous forms of the first person singular personal pronoun, mě (for the genitive and accusative cases) and mně (for the dative and locative) - see Czech declension.
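The three rules above amount to a mapping from the consonant written before ě to the pronounced syllable. A minimal sketch follows; the function name is an assumption, and other preceding consonants are rejected because the rules above do not cover them.

```python
PALATAL = {"d": "ɟ", "t": "c", "n": "ɲ"}  # dě, tě, ně
LABIAL = {"b", "p", "v", "f"}             # bě, pě, vě, fě

def e_caron_syllable(preceding: str) -> str:
    """Rough IPA for consonant + ě, following the rules listed above."""
    if preceding in PALATAL:
        return PALATAL[preceding] + "ɛ"
    if preceding in LABIAL:
        return preceding + "jɛ"
    if preceding == "m":
        return "mɲɛ"
    raise ValueError(f"no rule given for {preceding!r} before ě")

for consonant in ["d", "b", "m"]:
    print(consonant + "ě", "->", e_caron_syllable(consonant))
```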
There are two ways in Czech to write long [uː]: ú or ů.
Historically, long ⟨ú⟩ changed into the diphthong ⟨ou⟩ [oʊ] (a similar change happened in the English Great Vowel Shift in words such as "house"). In 1848, ou at the beginning of word roots was changed into ú in words like ouřad (now úřad). Thus, the letter ú is written only at the beginning of words and word roots: úhel (angle), trojúhelník (triangle). Loanwords are an exception: skútr (scooter).
Long ⟨ó⟩ [oː] changed into the diphthong ⟨uo⟩ [ʊo]. The o of the diphthong was sometimes written as a ring above the letter u, giving ů, e.g. kóň > kuoň > kůň (horse). Later, the pronunciation changed into [uː] (again similar to the English shift in "moon"), but the grapheme ⟨ů⟩ has remained. The development resembles the German change from ue to ü. The letter ů never occurs at the beginning of words: dům (house), domů (home, homeward).
Agreement between the subject and the predicate
The predicate must always agree with the subject of the sentence in number and person (personal pronouns) and, for past and passive participles, also in gender. This grammatical principle affects the orthography (see also "Soft" I and "Hard" Y above): it is especially important for choosing the correct plural endings of the participles.
|Gender||Singular||Plural||English translation|
|masculine animate||pes byl koupen||psi byli koupeni||a dog was bought/dogs were bought|
|masculine inanimate||hrad byl koupen||hrady byly koupeny||a castle was bought/castles were bought|
|feminine||kočka byla koupena||kočky byly koupeny||a cat was bought/cats were bought|
|neuter||město bylo koupeno||města byla koupena||a town was bought/towns were bought|
The examples above show both past participles (byl, byla ...) and passive participles (koupen, koupena ...). Gender agreement applies in the past tense and the passive voice, not in the present and future tenses of the active voice.
If a compound subject combines nouns of different genders, the masculine animate gender takes precedence over all others, and the masculine inanimate and feminine genders take precedence over the neuter.
- muži a ženy byli - men and women were
- kočky a koťata byly - cats and kittens were
- my jsme byli (my = we all/men) vs. my jsme byly (my = we women) - we were
Priority of genders:
- masculine animate > masculine inanimate & feminine > neuter
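The priority of genders lends itself to a small lookup that picks the plural participle ending for a compound subject. The sketch below is an illustration under simplifying assumptions: the names `PRIORITY`, `ENDING` and `plural_participle` are invented here, and only the plural endings shown in the table above (-i, -y, -a) are covered, not any finer points of the codified rules.

```python
# Higher number = higher priority, as in the ordering given above.
PRIORITY = {"masculine animate": 3, "masculine inanimate": 2,
            "feminine": 2, "neuter": 1}
# Plural participle endings taken from the table (byli, byly, byla).
ENDING = {"masculine animate": "i", "masculine inanimate": "y",
          "feminine": "y", "neuter": "a"}

def plural_participle(stem: str, genders: list[str]) -> str:
    """Plural participle for a subject whose nouns have the given genders."""
    dominant = max(genders, key=PRIORITY.__getitem__)
    return stem + ENDING[dominant]

print(plural_participle("byl", ["masculine animate", "feminine"]))  # muži a ženy
print(plural_participle("byl", ["feminine", "neuter"]))             # kočky a koťata
```

Masculine inanimate and feminine share both priority and ending, so it does not matter which of the two `max` returns when they tie.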
The use of the full stop (.), the colon (:), the semicolon (;), the question mark (?) and the exclamation mark (!) is similar to their use in other European languages. The full stop is placed after a number if it stands for ordinal numerals (as in German), e.g. 1. den (= první den) – the 1st day.
The comma is used to separate individual parts of complex and compound sentences, items in lists, isolated parts of sentences, etc. Its use in Czech differs from English. Subordinate (dependent) clauses, for instance, must always be separated from their principal (independent) clauses. A comma is not placed before a (and), i (as well as), ani (nor) and nebo (or) when they connect parts of sentences or clauses in a copulative relation (on the same level). It must be placed when the relation is non-copulative (consequence, emphasis, exclusion, etc.). A comma can, however, occur in front of a (and) if the comma closes a comma-delimited parenthesis: Jakub, můj mladší bratr, a jeho učitel Filip byli příliš zabráni do rozhovoru. Probírali látku, která bude u zkoušky, a též, kdo na ní bude. A comma also separates clauses joined by the composite conjunctions a proto (and therefore) and a tak (and so).
- otec a matka – father and mother, otec nebo matka – father or mother (coordinate relation – no commas)
- Je to pravda, nebo ne? – Is it true, or not? (exclusion)
- Pršelo, a proto nikdo nepřišel. – It was raining, and this is why nobody came. (consequence)
- Já vím, kdo to je. – I know who he is. Myslím, že se mýlíš. – I think (that) you are wrong. (subordinate relation)
- Jak se máš, Anno? – How are you, Anna? (addressing a person)
- Karel IV., římský císař a český král, založil hrad Karlštejn. – Charles IV, Holy Roman Emperor and Bohemian king, founded the Karlštejn Castle. (comma-delimited parenthesis)
Quotation marks. The opening mark, which precedes the quoted text, is placed on the baseline:
- Petr řekl: „Přijdu zítra.“ – Peter said: "I'll come tomorrow."
Other types of quotation marks: ‚‘ »«
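In Unicode text, the low opening mark and the raised closing mark of the example above are distinct characters from the ASCII quotation mark. The following sketch wraps a string in them; the constant and function names are made up for illustration.

```python
import unicodedata

OPEN_QUOTE = "\u201e"   # low opening mark, sits on the baseline
CLOSE_QUOTE = "\u201c"  # raised closing mark

def czech_quote(text: str) -> str:
    """Wrap text in Czech-style double quotation marks."""
    return f"{OPEN_QUOTE}{text}{CLOSE_QUOTE}"

print("Petr řekl:", czech_quote("Přijdu zítra."))
print(unicodedata.name(OPEN_QUOTE), "/", unicodedata.name(CLOSE_QUOTE))
```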
Apostrophes are used rarely in Czech. They can denote a missing sound in non-standard speech, but their use is optional, e.g. řek' or řek (= řekl, he said).
The first word of every sentence and all proper names are capitalized. Special cases are:
- Expressions of respect – optional: Ty (you sg.), Tvůj (your sg.), Vy (you pl.), Váš (your pl.); Bůh (God), Mistr (Master), etc.
- Headings – The first word is capitalized.
- Cities, towns and villages – All words are capitalized, except for prepositions: Nové Město nad Metují (New-Town-upon-Metuje).
- Geographical or local names – The first word is capitalized; common nouns such as ulice (street), náměstí (square) or moře (sea) are not capitalized: ulice Svornosti (Concordance Street), Václavské náměstí (Wenceslas Square), Severní moře (North Sea). Since 1993, the initial preposition and the first word following it are capitalized: lékárna U Černého orla (Black Eagle Pharmacy).
- Official names of institutions – The first word is capitalized: Městský úřad v Kolíně (The Municipal Office in Kolín) vs. městský úřad (a municipal office).
- Names of nations and nationality nouns are capitalized: Anglie (England), Angličan (Englishman), Německo (Germany), Němec (German). Adjectives derived from geographical names and names of nations, such as anglický (English – adjective) and pražský (Prague – adjective, e.g. pražské metro, Prague subway), are not. Names of languages are not capitalized: angličtina (English language).
- Possessive adjectives derived from proper names are capitalized: Pavlův dům (Paul's house).
There are five periods in the development of the Czech orthographic system:
Primitive orthography. For writing sounds which are foreign to the Latin alphabet, letters with similar sounds were used. This system was very ambiguous at times, with c, for example, being used for c, č, and k. The oldest known written notes in Czech originate from the 11th century; literature in this period was written predominantly in Latin.
Digraphic orthography. Various digraphs were used for non-Latin sounds. The system was not consistent and it also did not distinguish long and short vowels. It had some features that Polish orthography has kept, such as cz, rz instead of č, ř, but was still crippled by ambiguities, such as spelling both s and š as s/ss, z and ž as z, and sometimes even c and č both as cz, only distinguishing by context. Long vowels such as á were sometimes (but not always) written double as aa. Other features of the day included spelling j as g and v as w, as the early modern Latin alphabet had not by then distinguished j from i or v from u.
Diacritic orthography by Jan Hus. Using diacritics for long vowels ("virgula", an acute, "čárka" in Czech) and "soft" consonants ("punctus rotundus", a dot above a letter, which has survived in Polish ż) was suggested for the first time in "De orthographia Bohemica" around 1406. Diacritics replaced digraphs almost completely. It was also suggested that the Prague dialect should become the standard for the Czech language. Jan Hus is considered to be the author of that work but there is some uncertainty about this.
Brethren orthography. The Bible of Kralice (1579–1593), the first complete Czech translation of the Bible from the original languages by the Czech Brethren, became the model for the literary form of the language. The punctus rotundus was replaced by the caron ("háček"). There were some differences from the current orthography, e.g. the digraph ſſ was used instead of š; ay, ey, au instead of aj, ej, ou; v instead of u (at the beginning of words); w instead of v; g instead of j; and j instead of í (gegj = její, hers). Y was written always after c, s and z (e.g. cizí, foreign, was written cyzý) and the conjunction i (as well as, and) was written y.
Modern orthography. During the period of the Czech National Renaissance (end of the 18th century and the first half of the 19th century), Czech linguists (Josef Dobrovský et al.) codified some reforms in the orthography. These principles have been effective up to the present day. The later reforms in the 20th century mostly referred to introducing loanwords into the Czech language and their adaptation to the Czech orthography.
In computing, several different character encoding standards have existed for this alphabet, among them:
- ISO 8859-2
- Microsoft Windows code page 1250
- IBM PC code page 852
- Kamenický encoding (devised by the Kamenický brothers), also known as KEYBCS2, used on early DOS PCs and on Fidonet.
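The practical consequence of having several encodings is that the same Czech letter is stored as different bytes depending on the standard. The sketch below encodes a string with three of the encodings listed above, using the codec names found in Python's standard library. The Kamenický encoding is left out, on the assumption that the standard library ships no codec for it, and the helper name `encode_all` is invented for this example.

```python
# Python codec names for ISO 8859-2, Windows code page 1250 and IBM code page 852.
ENCODINGS = ["iso8859_2", "cp1250", "cp852"]

def encode_all(text: str) -> dict[str, bytes]:
    """Encode the same text with each legacy Czech encoding."""
    return {name: text.encode(name) for name in ENCODINGS}

for name, raw in encode_all("š").items():
    print(f"{name}: {raw.hex()}")
```

The letter š is a convenient test case because it sits at different positions in ISO 8859-2 (0xB9) and Windows-1250 (0x9A), so text decoded with the wrong one of the two shows a wrong character in its place.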
See also
- Czech language
- Czech phonology
- Orthographia bohemica
- Czech declension
- Czech verb
- Czech word order
- International Phonetic Alphabet
- Phonemic orthography
- Non-English usage of quotation marks