English terms with diacritical marks
Some English language terms have letters with diacritical marks. Most of the words are loanwords from French, with others coming from Spanish, Portuguese, German, or other languages. Some, however, are originally English, or at least their diacritics are.
Proper nouns are not generally counted as English terms except when accepted into the language as an eponym – such as Geiger–Müller tube, or the English terms roentgen after Wilhelm Röntgen, and biro after László Bíró, in which case any diacritical mark is often lost.
Types of diacritical marks
Though their use is limited, the following diacritical marks may be encountered in English, particularly for marking in poetry:
- the acute accent (née) and grave accent (English poetry marking, changèd), modifying vowels or marking stresses
- the circumflex (entrepôt), borrowed from French
- the diaeresis (Zoë), indicating a second syllable in two consecutive vowels
- the tittle, the dot found on the regular small i and small j, which is removed when another diacritic is required
- the macron (English poetry marking, lēad pronounced 'leed', not 'led'), lengthening vowels, as in Māori; or indicating omitted n or m (in pre-Modern English, both in print and in handwriting).
- the breve (English poetry marking, drŏll pronounced 'drol', not 'drowle'), shortening vowels.
- the umlaut (über), altering Germanic vowels
- the cedilla (soupçon), in French and in Portuguese softening c, indicating 's-' not 'k-' pronunciation
- the tilde (Señor), in Spanish indicating palatalised n (although in Spanish and most source languages, it is not considered a diacritic over the letter n but rather an integral part of the distinct letter ñ)
- the caron (as in Karel Čapek), often also called the háček in English (adapted from "háček", the Czech name [meaning "little hook"]), as Č/č, Š/š, Ř/ř (only in Czech), Ž/ž broadly turns "c" "s" "r" "z" into English "ch" "sh" "rzh" "zh" sounds respectively, and Ď/ď, Ľ/ľ (only in Slovak), Ň/ň and Ť/ť turn "d" "l" "n" and "t" into palatal "dy" "ly" "ny" and "ty" sounds. In most fonts the caron looks like an apostrophe sitting inside the Slovak capital L, as "Ľ", but in fact is only another form of caron.
- the Polish crossed Ł and the nasal ogonek (as in Lech Wałęsa): Ł is a "dark L", nearer an English "W", and ę is a nasal "e", nearer English "en" (the ogonek, Polish [ɔˈɡɔnɛk], means "little tail")
- the Croatian and Serbian crossed Đ (as in Franjo Tuđman or Zoran Đinđić), halfway between D and Dj
- the Maltese crossed Ħ (as in the Ħal- town prefix, Ħal Far Industrial Estate), a hard H
- the Swedish over-ring Å (as in the Åland Islands), the å vowel sound
- the Romanian Ș (as in Chișinău), the voiceless postalveolar fricative
For a more complete list see diacritical marks.
Some sources distinguish "diacritical marks" (marks upon standard letters in the A–Z 26-letter alphabet) from "special characters" (letters not marked but radically modified from the standard 26-letter alphabet) such as Old English and Icelandic eth (Ð, ð) and thorn (uppercase Þ, lowercase þ), and ligatures such as Latin and Anglo-Saxon Æ (minuscule: æ), and German eszett (ß; final -ß, often -ss even in German and always in Swiss-German).
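The distinction between marked letters and "special characters" has a rough analogue in Unicode, which can be explored with Python's standard `unicodedata` module. The following is a minimal sketch, not a linguistic classification: the helper name `split_marks` is made up for this illustration, and it assumes that canonical decomposition (NFD) is an acceptable proxy for "base letter plus diacritic".

```python
import unicodedata

def split_marks(letter):
    """Return (base, [names of combining marks]) for a letter.

    Hypothetical helper for illustration: it relies on Unicode canonical
    decomposition (NFD), which splits a precomposed letter into a base
    letter followed by combining marks where such a decomposition exists.
    """
    decomposed = unicodedata.normalize("NFD", letter)
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    marks = [unicodedata.name(ch) for ch in decomposed if unicodedata.combining(ch)]
    return base, marks

# Letters taken from the examples in the list above.
for letter in ["é", "ç", "ñ", "ü", "ë", "č", "þ", "ð", "æ", "ß", "ł"]:
    print(letter, split_marks(letter))
```

Under this sketch, é separates into e plus a combining acute accent, and ü and ë carry the same combining mark, since Unicode encodes the umlaut and the diaeresis identically. Thorn, eth, æ and ß come back unchanged with no marks. The analogy is imperfect: the crossed Ł, listed above among the diacritical marks, also has no decomposition in Unicode.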
Native English words
In some cases, the diacritic is not borrowed from any foreign language but is purely of English origin. The second of two vowels in a hiatus can be marked with a diaeresis (or "tréma"), as in words such as coöperative, daïs and reëlect, but its use has become less common, and it is sometimes replaced by a hyphen. It is also occasionally used over a single vowel to show that it is pronounced separately (as in Brontë). It is often omitted in printed works because the sign is missing on modern keyboards.
The acute and grave accents are occasionally used in poetry and lyrics: the acute to indicate stress overtly where it might be ambiguous (rébel vs. rebél) or nonstandard for metrical reasons (caléndar); the grave to indicate that an ordinarily silent or elided syllable is pronounced (warnèd, parlìament).
In historical versions of English
The Old English Latin alphabet began to replace the Runic alphabet in the 9th century, due to the influence of Celtic Christian missionaries to the Anglo-Saxon kingdoms. The orthography of Old English, which was entirely handwritten in its own time, was not well standardized; it did not use all the Latin letters, and it included several letters not present in the modern alphabet. When reprinted (on a printing press or computer) in modern times, an overdot is occasionally used with two Latin letters to differentiate sounds for the reader:
- ċ is used for a voiceless palato-alveolar affricate /t͡ʃ/
- ġ for a palatal approximant /j/ (probably a voiced palatal fricative /ʝ/ in the earliest texts)
Some modern printings also apply diacritics to vowels following the rules of Old Norse normalized spelling developed in the 19th century.
In the Late Middle English period, the shape of the English letter þ (thorn), which was derived from the Runic alphabet, evolved in some handwritten and blackletter texts to resemble the Latin letter y. The þ shape survived into the era of printing presses only as far as the press of William Caxton. In later publications, thorn was represented by "y", or by ẏ to distinguish thorn from y. By the end of the Early Modern English period, thorn had been completely replaced in contemporary usage by the digraph "th" (reviving a practice from early Old English), and the overdot was no longer needed outside of printings of very old texts. The overdot is missing from the only surviving usage of a Y-shaped thorn, in the archaic stock phrase ye olde (from "þe olde", pronounced "the old"; "ye olde" is often misanalyzed and pronounced with the modern "y" sound).
Words imported from other languages
Non-English loanwords enter the English language by a process of naturalisation, or specifically anglicisation, which is carried out mostly unconsciously (a similar process occurs in all other languages). During this process there is a tendency for accents and other diacritics that were present in the donor language to be dropped (for example French hôtel and French rôle becoming "hotel" and "role" respectively in English, or French à propos, which lost both the accent and space to become English "apropos").
In many cases, imported words can be found in print in both their accented and unaccented versions. Since modern dictionaries are mostly descriptive and no longer prescribe outdated forms, they increasingly list unaccented forms, though some dictionaries, such as the Oxford English Dictionary, do not list the unaccented variants of particular words (e.g., soupçon).
Words that retain their accents often do so to help indicate pronunciation (e.g. frappé, naïve, soufflé), or to help distinguish them from an unaccented English word (e.g. exposé, résumé, rosé). Technical terms or those associated with specific fields (especially cooking or musical terms) are less likely to lose their accents (such as the French soupçon, façade and entrée).
Some Spanish words with the Spanish letter ñ have been naturalised by substituting English ny (e.g. Spanish cañón is now usually English canyon, Spanish piñón is now usually English pinyon pine). Certain words like piñata, jalapeño and quinceañera are usually kept intact. In many instances the ñ is replaced with the plain letter n.
In words of German origin, the letters with umlauts ä, ö, ü may be written ae, oe, ue. This could be seen in many newspapers during World War II, which printed Fuehrer for Führer. Today, however, umlauts are usually either dropped, with no e following the vowel, or, in publications with a stricter house style (such as The New York Times or The Economist), retained as in German. Zurich is an exception: it is not a case of a "dropped umlaut" but a genuine English exonym, also used in French (from Latin Turicum), and it is written without the umlaut even alongside other German and Swiss names that retain the umlaut in English.
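The two treatments of German umlauts described above, writing ae, oe, ue or simply dropping the mark, can be sketched in a few lines of Python. This is an illustrative sketch only: the function names `expand_umlauts` and `drop_umlauts` are invented here, and the code assumes that every ä, ö or ü in the input is a German umlaut rather than an English diaeresis (it would mishandle a word such as naïve or Zoë if ï and ë were added to the table).

```python
import unicodedata

# Digraph spellings mentioned in the text: ä -> ae, ö -> oe, ü -> ue.
UMLAUT_DIGRAPHS = {
    "ä": "ae", "ö": "oe", "ü": "ue",
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue",
}

def expand_umlauts(word):
    """Older newspaper convention: replace each umlauted vowel with vowel + e."""
    composed = unicodedata.normalize("NFC", word)
    return "".join(UMLAUT_DIGRAPHS.get(ch, ch) for ch in composed)

def drop_umlauts(word):
    """Common modern convention: remove the two dots, keep the bare vowel."""
    decomposed = unicodedata.normalize("NFD", word)
    stripped = "".join(ch for ch in decomposed if ch != "\u0308")
    return unicodedata.normalize("NFC", stripped)

print(expand_umlauts("Führer"))  # Fuehrer
print(drop_umlauts("Führer"))    # Fuhrer
```

Neither function reproduces a case like Zurich, which, as noted above, is an exonym in its own right rather than the result of a mechanical rule.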
Accent-addition and accent-removal
As words are naturalized into English, diacritics are sometimes added to imported words that originally did not have any, often to distinguish them from common English words or to otherwise assist in proper pronunciation. In the cases of maté from Spanish mate (Spanish: ['mɑ tɛ]), animé from Japanese anime, and latté or even lattè from Italian latte, an accent on the final e indicates that the word is pronounced with a long A sound (the diphthong /eɪ/, AY) at the end, rather than the e being silent. Examples of partial removal include resumé (from the French résumé) and haček (from the Czech háček), because of the change in pronunciation of the initial vowels. Complete naturalization, stripping all diacritics, has also occurred in words such as canyon, from the Spanish cañón. For accurate readings, some speech writers differentiate lēad (pronounced like leed) and lĕad (pronounced like led). Adjectives such as learnèd and belovèd are pronounced with two and three syllables respectively, unlike the past participles learned and beloved, which are each pronounced with one fewer syllable.
Regional differences
In Canadian English, words of French origin retain their orthography more often than in other English-speaking countries, for example in the use of é (e with acute) in café, Montréal, née, Québec, and résumé. This is due to the large influence afforded by French being one of Canada's two official languages at the federal government level as well as at the provincial level in New Brunswick and Manitoba, and the majority and sole official language in Québec.
New Zealand English includes words derived from the Māori language, which uses a macron (Māori: tohutō) to indicate vowel length. Until the early 2000s, the technical capacity to display macrons in print and online was limited, and long vowels were indicated with umlauts (Mäori) or doubled vowels (Maaori). Since 2000, macrons have become increasingly common in New Zealand English; both of the main newspaper chains had adopted macrons in their print and online editions by May 2018.
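The two older workarounds for macrons mentioned above (Mäori and Maaori) can be expressed as simple text transformations. The sketch below is only an illustration of those substitutions, with made-up function names, and does not claim to reflect how any publisher actually converted text.

```python
import unicodedata

MACRON = "\u0304"     # combining macron
DIAERESIS = "\u0308"  # combining diaeresis (the "umlaut" dots)

def macron_to_umlaut(text):
    """Replace each macron with a diaeresis, e.g. ā becomes ä."""
    decomposed = unicodedata.normalize("NFD", text)
    return unicodedata.normalize("NFC", decomposed.replace(MACRON, DIAERESIS))

def macron_to_doubled(text):
    """Replace each macron vowel with a doubled vowel, e.g. ā becomes aa."""
    out = []
    for ch in unicodedata.normalize("NFD", text):
        if ch == MACRON and out:
            # Repeat the vowel the macron was attached to, in lower case.
            out.append(out[-1].lower())
        else:
            out.append(ch)
    return unicodedata.normalize("NFC", "".join(out))

print(macron_to_umlaut("Māori"))   # Mäori
print(macron_to_doubled("Māori"))  # Maaori
```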
Names with diacritics
Diacritics are used in the names of some English-speaking people:
- British: Charlotte Brontë, Emily Brontë (and other members of the Brontë family), Noël Coward, Zoë Wanamaker, Zoë Ball, Emeli Sandé, John le Carré
- American: Beyoncé Knowles, Chloë Grace Moretz, Chloë Sevigny, Renée Fleming, Renée Zellweger, Zoë Baird, Donté Stallworth, John C. Frémont, Robert M. Gagné, Roxanne Shanté
- Australian: Renée Geyer, Zoë Badwi
- Hungarian: Gébel family
Typographical limitations
Early metal type printing quickly ran into problems representing not just simple diacritical marks for English and accents for French and German, but also musical notation (for sheet music printing) and the Greek and Hebrew alphabets (for Bible printing). Problems with the representation of diacritical marks nevertheless continued, even in scholarly publishing and dissertations, up to the word processor era. The first generation of word processors also had character set limitations, and confusion due to typesetting convention was exacerbated in the character-coded environment by the limitations of the ASCII character set.
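The ASCII limitation mentioned above is easy to reproduce today. The following Python sketch, with hypothetical helper names, shows that accented words cannot be encoded as ASCII at all, and that one common fallback (decompose, then discard everything outside ASCII) is lossy. It is an assumption of this example that such a fallback is representative; historical systems used many different workarounds.

```python
import unicodedata

def is_ascii_safe(word):
    """Return True if the word can be stored in plain ASCII unchanged."""
    try:
        word.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False

def ascii_fallback(word):
    """Lossy fallback: decompose, then drop every non-ASCII code point."""
    decomposed = unicodedata.normalize("NFKD", word)
    return decomposed.encode("ascii", "ignore").decode("ascii")

print(is_ascii_safe("soupçon"))   # False
print(ascii_fallback("soupçon"))  # soupcon
print(ascii_fallback("Wałęsa"))   # Waesa
```

The last line shows why this approach is only a stopgap: ę decomposes to e plus an ogonek, so the e survives, but ł has no decomposition and disappears entirely.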