= German orthography =

German orthography is the orthography used in writing the German language. It is largely phonemic, but it shows many instances of spellings that are historic or analogous to other spellings rather than phonemic. The pronunciation of almost every word can be derived from its spelling once the spelling rules are known, but the opposite is not generally the case.

Today, Standard High German orthography is regulated by the Rat für deutsche Rechtschreibung (Council for German Orthography), composed of representatives from most German-speaking countries.

== Alphabet ==

The modern German alphabet consists of the twenty-six letters of the ISO basic Latin alphabet plus four special letters.

=== Basic alphabet ===
| Capital | Lowercase | Name | Name (IPA) | Spelling alphabet |
| A | a | A | /aː/ | Anton |
| B | b | Be | /beː/ | Berta |
| C | c | Ce | /t͡seː/ | Cäsar |
| D | d | De | /deː/ | Dora |
| E | e | E | /eː/ | Emil |
| F | f | Ef | /ɛf/ | Friedrich |
| G | g | Ge | /ɡeː/ | Gustav |
| H | h | Ha | /haː/ | Heinrich |
| I | i | I | /iː/ | Ida |
| J | j | Jott, Je | /jɔt/, /jeː/ | Julius |
| K | k | Ka | /kaː/ | Kaufmann, Konrad |
| L | l | El | /ɛl/ | Ludwig |
| M | m | Em | /ɛm/ | Martha |
| N | n | En | /ɛn/ | Nordpol |
| O | o | O | /oː/ | Otto |
| P | p | Pe | /peː/ | Paula |
| Q | q | Qu, Que | /kuː/, /kveː/ | Quelle |
| R | r | Er | /ɛʁ/ | Richard |
| S | s | Es | /ɛs/ | Samuel, Siegfried |
| T | t | Te | /teː/ | Theodor |
| U | u | U | /uː/ | Ulrich |
| V | v | Vau | /faʊ̯/ | Viktor |
| W | w | We | /veː/ | Wilhelm |
| X | x | Ix | /ɪks/ | Xanthippe, Xavier |
| Y | y | Ypsilon | /ˈʏpsilɔn/, /ʏˈpsiːlɔn/ | Ypsilon |
| Z | z | Zett | /t͡sɛt/ | Zacharias, Zürich |

=== Special letters ===
German has four special letters; three are vowels accented with an umlaut sign (ä, ö, ü) and one is derived from a ligature of ſ (long s) and z (ß; called Eszett "ess-zed/zee" or scharfes S "sharp s"). They have their own names separate from the letters they are based on.

| Capital | Lowercase | Name | Name (IPA) | Spelling alphabet |
| Ä | ä | Ä | /ɛː/ | Ärger |
| Ö | ö | Ö | /øː/ | Ökonom, Österreich |
| Ü | ü | Ü | /yː/ | Übermut, Übel |
| ẞ | ß | Eszett, scharfes S | /ɛsˈt͡sɛt/, /ˈʃaʁfəs ɛs/ | Eszett, scharfes S |

- Capital ẞ was declared an official letter of the German alphabet on 29 June 2017. Previously, it was represented as SS (or, in earlier times, SZ).
- Historically, long s (ſ) was used as well, as in English and many other European languages.

While the Council for German Orthography considers ä, ö, ü and ß distinct letters, disagreement on how to categorize and count them has led to a dispute over the exact number of letters in the German alphabet, with counts ranging between 26 (considering the special letters as variants of a, o, u and s) and 30 (counting all special letters separately).

== Use of special letters ==
===Umlaut diacritic usage===

The accented letters ä, ö, ü are used to indicate the presence of umlauts (fronting of back vowels). Before the introduction of the printing press, fronting was indicated by placing an e after the back vowel to be modified, but German printers developed the space-saving typographical convention of replacing the full e with a small version placed above the vowel to be modified. In German Kurrent writing, the superscripted e was simplified to two vertical dashes (as the Kurrent e consists largely of two short vertical strokes), which have further been reduced to dots in both handwriting and German typesetting. Although the two dots of the umlaut look like those of the diaeresis (trema), the two have different origins and functions.

When it is not possible to use the umlauts (for example, when using a restricted character set), the characters Ä, Ö, Ü, ä, ö, ü should be transcribed as Ae, Oe, Ue, ae, oe, ue respectively, following the earlier postvocalic-e convention; simply using the base vowel (e.g. u instead of ü) would be wrong and misleading. However, such transcription should be avoided if possible, especially with names. Names often exist in different variants, such as Müller and Mueller, and with such transcriptions in use one could not work out the correct spelling of the name.

Automatic back-transcribing is wrong not only for names. Consider, for example, das neue Buch ("the new book"). This should never be changed to das neü Buch, as the second e is completely separate from the u and does not even belong in the same syllable; neue (/ˈnɔʏ̯ə/) is neu (the root for "new") followed by e, an inflection. The word neü does not exist in German.
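The asymmetry described above can be illustrated with a minimal Python sketch. The function names `transcribe_umlauts` and `naive_back_transcribe` are hypothetical and chosen only for this illustration; the second function is deliberately naive, to show why blind reversal produces non-words.

```python
# Replacement table following the postvocalic-e convention described above.
UMLAUT_TO_DIGRAPH = {
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue",
    "ä": "ae", "ö": "oe", "ü": "ue",
}


def transcribe_umlauts(text: str) -> str:
    """Replace each umlauted vowel with base vowel plus e (safe direction)."""
    return "".join(UMLAUT_TO_DIGRAPH.get(ch, ch) for ch in text)


def naive_back_transcribe(text: str) -> str:
    """Blindly turn ae/oe/ue back into umlauts (unsafe, for demonstration only)."""
    for umlaut, digraph in UMLAUT_TO_DIGRAPH.items():
        text = text.replace(digraph, umlaut)
    return text


print(transcribe_umlauts("Müller"))            # Mueller
print(naive_back_transcribe("das neue Buch"))  # das neü Buch (not a German word)
```

The forward direction loses information (Müller and Mueller both become Mueller), which is why the reverse direction cannot be automated without a dictionary or knowledge of the individual name.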

Furthermore, in northern and western Germany, there are family names and place names in which e lengthens the preceding vowel (by acting as a Dehnungs-e), as in the former Dutch orthography, such as Straelen, which is pronounced with a long a, not an ä. Similar cases are Coesfeld and Bernkastel-Kues.

In proper names and ethnonyms, there may also appear a rare ë and ï, which are not letters with an umlaut but with a diaeresis, used as in French and English to distinguish what could otherwise be read as a digraph, for example, ai in Karaïmen, eu in Alëuten, ie in Piëch, oe in von Loë and Hoëcker (although Hoëcker added the diaeresis himself), and ue in Niuë. Occasionally, a diaeresis may be used in some well-known names, e.g. Italiën (usually written as Italien).

Swiss keyboards and typewriters do not allow easy input of uppercase letters with umlauts (nor ß) because their positions are taken by the most frequent French diacritics. Uppercase umlauts were dropped because they are less common than lowercase ones (especially in Switzerland). Geographical names in particular are supposed to be written with A, O, U plus e, except Österreich. The omission can cause some inconvenience, since the first letter of every noun is capitalized in German.

Unlike in Hungarian, the exact shape of the umlaut diacritics – especially when handwritten – is not important, because they are the only ones in the language (not counting the tittle on i and j). They will be understood whether they look like dots, acute accents or vertical bars. A horizontal bar (macron), a breve, a tiny superscript letter, a tilde, and similar variations are often used in stylized writing (e.g. logos). However, the breve – or the ring – was traditionally used in some scripts to distinguish a u from an n. In rare cases, the n was underlined. The breved u was common in some Kurrent-derived handwritings; it was mandatory in Sütterlin.

===Sharp s===

Eszett or scharfes S (ß) represents the "s" sound. In the current orthography, the letter is used only after long vowels and diphthongs. Prior to the German spelling reform of 1996, it was used additionally whenever the letter combination ss occurred at the end of a syllable or word. It is not used in Switzerland and Liechtenstein.

As ß derives from a ligature of lowercase letters, it is used exclusively in the middle or at the end of a word. The proper transcription when it cannot be used is ss (sz and SZ in earlier times). This transcription can give rise to ambiguities, albeit rarely; one such case is in Maßen "in moderation" vs. in Massen "en masse". In all-caps, ß is replaced by SS or, optionally, by the uppercase ẞ. The uppercase ẞ was included in Unicode 5.1 as U+1E9E in 2008. Since 2010 its use has been mandatory in official documentation in Germany when writing geographical names in all-caps. The option of using the uppercase ẞ in all-caps was officially added to the German orthography in 2017.
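As a small illustration of the all-caps rules above, the following Python sketch relies on the fact that Python's built-in `str.upper()` maps ß to SS. The helper `to_all_caps` and its parameter `use_capital_eszett` are hypothetical names introduced here, not part of any standard.

```python
CAPITAL_ESZETT = "\u1E9E"  # ẞ, the uppercase letter added in Unicode 5.1


def to_all_caps(text: str, use_capital_eszett: bool = False) -> str:
    """Upper-case German text, optionally keeping ß as the capital ẞ."""
    if use_capital_eszett:
        # Swap in the capital form first; str.upper() leaves ẞ unchanged.
        text = text.replace("ß", CAPITAL_ESZETT)
    # Without the swap, str.upper() expands ß to SS.
    return text.upper()


print(to_all_caps("Straße"))                            # STRASSE
print(to_all_caps("Straße", use_capital_eszett=True))   # STRAẞE
# The SS variant merges "in Maßen" and "in Massen"; the ẞ variant keeps them apart.
print(to_all_caps("in Maßen") == to_all_caps("in Massen"))  # True
```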

==Sorting==
There are three ways to deal with the umlauts in alphabetic sorting.
1. Treat them like their base characters, as if the umlaut were not present (DIN 5007-1, section 6.1.1.4.1). This is the preferred method for dictionaries, where umlauted words (Füße "feet") should appear near their origin words (Fuß "foot"). In words which are the same except for one having an umlaut and one its base character (e.g. Müll vs. Mull), the word with the base character gets precedence.
2. Decompose them (invisibly) to vowel plus e (DIN 5007-2, section 6.1.1.4.2). This is often preferred for personal and geographical names, wherein the characters are used unsystematically, as in German telephone directories (Müller, A.; Mueller, B.; Müller, C.).
3. Treat them as extra letters, placed either
   - after their base letters (Austrian phone books have ä between az and b, etc.) or
   - at the end of the alphabet (as in Swedish or in extended ASCII).
Microsoft Windows in German versions offers the choice between the first two variants in its internationalization settings.

A combination of methods 1 and 2 also exists, in use in a couple of lexica: the umlaut is sorted with the base character, but an ae, oe, ue in proper names is sorted with the umlaut if it is actually spoken that way (with the umlaut getting immediate precedence). A possible sequence of names would then be Mukovic; Muller; Müller; Mueller; Multmann, in this order.

Eszett is sorted as though it were ss. Occasionally it is treated as s, but this is generally considered incorrect. Words distinguished only by ß vs. ss are rare. The word with ß gets precedence, and Geschoß (story of a building; South German pronunciation) would be sorted before Geschoss (projectile).
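The first two sorting methods and the ß rule can be sketched as Python sort keys. This is a simplified illustration, not an implementation of DIN 5007: the names `dictionary_key`, `phonebook_key` and `_tie_break` are hypothetical, and only the letters ä, ö, ü and ß are handled.

```python
# Method 1: umlauts sort like their base letters; ß sorts like ss.
BASE = str.maketrans({"ä": "a", "ö": "o", "ü": "u", "ß": "ss"})
# Method 2: umlauts are decomposed to vowel plus e; ß sorts like ss.
DIGRAPH = str.maketrans({"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"})


def _tie_break(word: str) -> tuple:
    """Secondary key: base letter before umlaut (Mull, Müll), ß before ss."""
    return tuple(1 if ch in "äöü" else -1 if ch == "ß" else 0 for ch in word)


def dictionary_key(word: str) -> tuple:
    """Sort key in the spirit of method 1 (dictionary order)."""
    lowered = word.lower()
    return (lowered.translate(BASE), _tie_break(lowered))


def phonebook_key(word: str) -> str:
    """Sort key in the spirit of method 2 (names); ties are left to other fields."""
    return word.lower().translate(DIGRAPH)


names = ["Muller", "Müller", "Mukovic"]
print(sorted(names, key=dictionary_key))  # ['Mukovic', 'Muller', 'Müller']
print(sorted(names, key=phonebook_key))   # ['Müller', 'Mukovic', 'Muller']
print(sorted(["Geschoss", "Geschoß"], key=dictionary_key))  # ['Geschoß', 'Geschoss']
```

Under `phonebook_key`, Müller and Mueller compare equal, so their relative order is decided by whatever comes next in the entry, such as the given name in a telephone directory.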

Accents in French loanwords are always ignored in collation.

In rare contexts (e.g. in older indices) sch (phonetic value equal to English sh) and likewise st and ch are treated as single letters, but the vocalic digraphs ai, ei (historically ay, ey), au, äu, eu and the historic ui, oi never are.

===Personal names with special characters===
German names containing umlauts (ä, ö, ü) and/or ß are spelled in the correct way in the non-machine-readable zone of the passport, but with AE, OE, UE and/or SS in the machine-readable zone, e.g. Müller becomes MUELLER, Weiß becomes WEISS, and Gößmann becomes GOESSMANN. The transcription mentioned above is generally used for airline tickets and similar documents, but sometimes (as in US visas) simple vowels are used (MULLER, GOSSMANN). As a result, passport, visa, and airline ticket may display different spellings of the same name. The three possible spelling variants of the same name (e.g. Müller/Mueller/Muller) in different documents sometimes lead to confusion, and the use of two different spellings within the same document may give persons unfamiliar with German orthography the impression that the document is a forgery.

Even before the introduction of the capital ẞ, it was recommended to use the minuscule ß as a capital letter in family names in documents (e.g. HEINZ GROßE, today's spelling: HEINZ GROẞE).

German naming law accepts umlauts and/or ß in family names as a reason for an official name change. Even a spelling change, e.g. from Müller to Mueller or from Weiß to Weiss, is regarded as a name change.

==Features of German spelling==

===Capitalization===
A typical feature of German spelling is the general capitalization of nouns and of most nominalized words. In addition, capital letters are used: at the beginning of sentences (may be used after a colon, when the part of a sentence after the colon can be treated as a sentence); in the formal pronoun Sie 'you' and the determiner Ihr 'your' (optionally in other second-person pronouns in letters); in adjectives at the beginning of proper names (e.g. der Stille Ozean 'the Pacific Ocean'); in adjectives with the suffix '-er' from geographical names (e.g. Berliner); in adjectives with the suffix '-sch' from proper names if written with the apostrophe before the suffix (e.g. Ohm'sches Gesetz 'Ohm's law', also written ohmsches Gesetz).

===Compound words===
Compound words, including nouns, are usually written together, e.g. Haustür (Haus + Tür; 'house door'), Tischlampe (Tisch + Lampe; 'table lamp'), Kaltwasserhahn (Kalt + Wasser + Hahn; 'cold water tap/faucet'). This can lead to long words: the longest word in regular use, Rechtsschutzversicherungsgesellschaften ('legal protection insurance companies'), consists of 39 letters.

====Hyphen in compound words====
Compounds involving letters, abbreviations, or numbers (written in figures, even with added suffixes) are hyphenated: A-Dur 'A major', US-Botschaft 'US embassy', 10-prozentig 'with 10 percent', 10er-Gruppe 'group of ten'. The hyphen is used when adding suffixes to letters: n-te 'nth'. It is used in nominalized compounds such as Entweder-oder 'alternative' (literally 'either-or'); in phrase-word compounds such as Tag-und-Nacht-Gleiche 'equinox', Auf-die-lange-Bank-Schieben 'postponing' (nominalization of auf die lange Bank schieben 'to postpone'); in compounds of words containing a hyphen with other words: A-Dur-Tonleiter 'A major scale'; in coordinated adjectives: deutsch-englisches Wörterbuch 'German-English dictionary'. Compound adjectives meaning colours are written with a hyphen if they mean two colours: rot-braun 'red and brown', but without a hyphen if they mean an intermediate colour: rotbraun 'reddish brown' (from the spelling reform of 1996 to the 2024 revision of the orthographic rules, both variants could be used in both meanings). Optionally the hyphen can be used to emphasize individual components, to clarify the meaning of complicated compounds, to avoid misunderstandings or when three identical letters occur together (in practice, in this case it is mostly used when writing nouns with triple vowels, e.g. See-Elefant 'elephant seal').

The hyphen is used in compounds where the second part or both parts are proper names, e.g. Foto-Hansen 'the photographer Hansen', Müller-Lüdenscheid 'Lüdenscheid, the city of millers', double-barrelled surnames such as Meyer-Schmidt; geographical names such as Baden-Württemberg. Double given names are variously written as Anna-Maria, Anna Maria, Annamaria. Some compound geographical names are written as one word (e.g. Nordkorea 'North Korea') or as two words (e.g. geographical names beginning with Sankt or Bad). The hyphen is not used when compounds with a proper name in the second part are used as common nouns, e.g. Heulsuse 'crybaby'; also in the name of the fountain Gänseliesel. The hyphen is used in words derived from proper names with hyphen, from proper names of more than one word, or from more than one proper name (optional in derivations with the suffix -er from geographical names from more than one word). Optionally the hyphen can be used in compounds where the first part is a proper name. Compounds of the type "geographical name+specification" are written with a hyphen or as two words: München-Ost or München Ost.

===Vowel length===
Even though vowel length is phonemic in German, it is not consistently represented. However, there are different ways of identifying long vowels:

- A vowel in an open syllable (a free vowel) is long, for instance in ge-ben ('to give'), sa-gen ('to say'). The rule is unreliable in given names, cf. Oliver [ˈɔlivɐ].
- It is rare to see a bare i used to indicate a long vowel /iː/. It occurs mainly in loanwords, e.g. Krise 'crisis', but also in some native German words, e.g. wir 'we', gib 'give (imperative)'. Mostly, the long vowel /iː/ is represented in writing by the digraph ie, for instance in Liebe ('love'), hier ('here'). This use is a historical spelling based on the Middle High German diphthong /iə/, which was monophthongized in Early New High German. It has been generalized to words that etymologically never had that diphthong, for instance viel ('much'), Friede ('peace') (Middle High German vil, vride). Occasionally – typically in word-final position – this digraph represents /iː.ə/, as in the plural noun Knie /kniː.ə/ ('knees') (cf. singular Knie /kniː/). In the words Viertel (viertel) /ˈfɪrtəl/ ('quarter'), vierzehn /ˈfɪʁt͡seːn/ ('fourteen'), vierzig /ˈfɪʁt͡sɪç/ ('forty'), ie represents a short vowel, cf. vier /fiːɐ̯/ ('four'). In Fraktur, where capital I and J are identical or near-identical ($\mathfrak{J}$), the combinations Ie and Je are confusable; hence ie is not used at the start of a word, for example Igel ('hedgehog'), Ire ('Irishman').
- A silent h indicates the vowel length in certain cases. That h derives from an old /x/ in some words, for instance sehen ('to see'), zehn ('ten'), but in other words it has no etymological justification, for instance gehen ('to go') or mahlen ('to mill'). Occasionally a digraph can be redundantly followed by h, either due to analogy, such as sieht ('sees', from sehen), or etymology, such as Vieh ('cattle', MHG vihe), rauh ('rough', pre-1996 spelling, now written rau, MHG ruh).
- The letters a, e, o are doubled in a few words that have long vowels, for instance Saat ('seed'), See ('sea'/'lake'), Moor ('moor').
- A doubled consonant after a vowel indicates that the vowel is short, while a single consonant often indicates the vowel is long, e.g. Kamm ('comb') has a short vowel /kam/, while kam ('came') has a long vowel /kaːm/. Two consonants are not doubled: k, which is replaced by ck (until the spelling reform of 1996, however, ck was divided across a line break as k-k), and z, which is replaced by tz. In loanwords, kk (which may correspond with cc in the original spelling) and zz can occur.
- For different consonants and for sounds represented by more than one letter (ch and sch) after a vowel, no clear rule can be given, because they can appear after long vowels, yet are not redoubled if belonging to the same stem, e.g. Mond /moːnt/ 'moon', Hand /hant/ 'hand'. On a stem boundary, reduplication usually takes place, e.g., nimm-t 'takes'; however, in fixed, no longer productive derivatives, this too can be lost, e.g., Geschäft /ɡəˈʃɛft/ 'business' despite schaffen 'to get something done'.
- ß indicates that the preceding vowel is long, e.g. Straße 'street' vs. a short vowel in Masse 'mass' or 'host'/'lot'. In addition to that, texts written before the 1996 spelling reform also use ß at the ends of words and before consonants, e.g. naß 'wet' and mußte 'had to' (after the reform spelled nass and musste), so vowel length in these positions could not be detected by the ß, cf. Maß 'measure' and fußte 'was based' (both unaffected by the reform).

===Double or triple consonants===
Even though German does not have phonemic consonant length, there are many instances of doubled or even tripled consonants in the spelling. A single consonant following a checked vowel is doubled if another vowel follows, for instance immer 'always', lassen 'let'. These consonants are analyzed as ambisyllabic because they constitute not only the syllable onset of the second syllable but also the syllable coda of the first syllable, which must not be empty because the syllable nucleus is a checked vowel.

By analogy, if a word has one form with a doubled consonant, all forms of that word are written with a doubled consonant, even if they do not fulfill the conditions for consonant doubling; for instance, rennen 'to run' → er rennt 'he runs'; Küsse 'kisses' → Kuss 'kiss'.

Doubled consonants can occur in composite words when the first part ends in the same consonant the second part starts with, e.g. in the word Schaffell ('sheepskin', composed of Schaf 'sheep' and Fell 'skin, fur, pelt').

Composite words can also have tripled letters. While this is usually a sign that the consonant is actually spoken long, it does not affect the pronunciation per se: the fff in Sauerstoffflasche ('oxygen bottle', composed of Sauerstoff 'oxygen' and Flasche 'bottle') is exactly as long as the ff in Schaffell. According to the spelling before 1996, the three consonants would be shortened before vowels, but retained before consonants and in hyphenation, so the word Schifffahrt ('navigation, shipping', composed of Schiff 'ship' and Fahrt 'drive, trip, tour') was then written Schiffahrt, whereas Sauerstoffflasche already had a triple fff. With the aforementioned change in spelling, even a new source of triple consonants was introduced: sss, which in pre-1996 spelling could not occur as it was rendered ßs, e.g. Mussspiel ('compulsory round' in certain card games, composed of muss 'must' and Spiel 'game').
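The difference between the pre-1996 and current treatment of tripled consonants can be sketched in Python. The function `join_compound` is a hypothetical helper for illustration only: it assumes a two-part noun compound with no linking element and does not model the ßs case or hyphenation.

```python
VOWELS = set("aeiouäöü")


def join_compound(first: str, second: str, pre_1996: bool = False) -> str:
    """Join two nouns into a compound, lower-casing the second part."""
    tail = second.lower()
    if (
        pre_1996
        and len(first) >= 2
        and len(tail) >= 2
        and first[-1] == first[-2] == tail[0]  # would yield three identical letters
        and first[-1] not in VOWELS            # the old rule concerned consonants
        and tail[1] in VOWELS                  # shortened only before a vowel
    ):
        return first + tail[1:]                # drop one of the three consonants
    return first + tail                        # current spelling keeps all letters


print(join_compound("Schiff", "Fahrt"))                        # Schifffahrt
print(join_compound("Schiff", "Fahrt", pre_1996=True))         # Schiffahrt
print(join_compound("Sauerstoff", "Flasche", pre_1996=True))   # Sauerstoffflasche
```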

===Typical letters===
- ei: This digraph represents the diphthong /aɪ̯/. The spelling goes back to the Middle High German pronunciation of that diphthong, which was [ei̯]. The spelling ai is found in only a very few native words (such as Saite 'string', Waise 'orphan') but is commonly used to romanize /aɪ̯/ in foreign loans from languages such as Chinese.
- eu: This digraph represents the diphthong [ɔʏ̯], which goes back to the Middle High German monophthong [yː] represented by iu. When the sound is created by umlaut of au [aʊ̯] (from MHG [uː]), it is spelled äu.
- ß: This letter alternates with ss. For more information, see above.
- st, sp: At the beginning of a word or syllable, these digraphs are pronounced [ʃt, ʃp]. In the Middle Ages, the sibilant that was inherited from Proto-Germanic /s/ was pronounced as an alveolo-palatal consonant [ɕ] or [ʑ], unlike the voiceless alveolar sibilant /s/ that had developed in the High German consonant shift. In the Late Middle Ages, certain instances of [ɕ] merged with /s/, but others developed into [ʃ]. The change to [ʃ] was represented in certain spellings such as Schnee 'snow', Kirsche 'cherry' (Middle High German snê, kirse). The digraphs st, sp, however, remained unaltered.
- v: The letter v occurs only in a few native words and then, it represents [f]. That goes back to the 12th and 13th century, when prevocalic /f/ was voiced to [v]. The voicing was lost again in the late Middle Ages, but the v still remains in certain words such as in Vogel 'bird' (cf. Scandinavian fugl or English fowl; hence, v is sometimes called Vogel-vau), viel 'much'.
- w: The letter w represents the sound /v/. In the 17th century, the former sound [w] became [v], but the spelling remained the same. An analogous sound change had happened in late-antique Latin.
- z: The letter z represents the sound /t͡s/. The sound, a product of the High German consonant shift, has been written with z since Old High German in the 8th century.

===Foreign words===
For technical terms, the foreign spelling is often retained, such as ph /f/ or y /yː/ in the word Physik (physics) of Greek origin. For some common affixes, however, like -graphie or Photo-, it is allowed to use -grafie or Foto- instead. Both Photographie and Fotografie are correct, but the mixed variants *Fotographie or *Photografie are not.

For other foreign words, both the foreign spelling and a revised German spelling are correct such as
