Latin script

From Wikipedia, the free encyclopedia
Latin (Roman)
Type: Alphabet
Languages: Latin, Romance languages, Germanic languages, many others
Time period: ~700 BC–present
Parent systems:
Child systems: Numerous; see Alphabets derived from the Latin
Sister systems: Cyrillic, Coptic, Armenian, Runic/Futhark
ISO 15924: Latn, 215
Direction: Left-to-right
Unicode alias: Latin
Unicode range: See Latin characters in Unicode

The Latin script is the basis for the largest number of alphabets of any writing system,[1] and is also the basis of the International Phonetic Alphabet. It was originally used for the Latin alphabet, while the most widespread modern letters are those of the ISO basic Latin alphabet.

Spread

The Latin alphabet spread, along with the Latin language, from the Italian Peninsula to the lands surrounding the Mediterranean Sea with the expansion of the Roman Empire. The eastern half of the Empire, including Greece, Turkey, the Levant, and Egypt, continued to use Greek as a lingua franca, but Latin was widely spoken in the western half, and as the western Romance languages evolved out of Latin, they continued to use and adapt the Latin alphabet.

With the spread of Western Christianity during the Middle Ages, the alphabet was gradually adopted by the peoples of northern Europe who spoke Celtic languages (displacing the Ogham alphabet) or Germanic languages (displacing earlier Runic alphabets) or Baltic languages, as well as by the speakers of several Uralic languages, most notably Hungarian, Finnish and Estonian. The alphabet also came into use for writing the West Slavic languages and several South Slavic languages, as the people who spoke them adopted Roman Catholicism. The speakers of East Slavic languages generally adopted Cyrillic along with Orthodox Christianity. The Serbian language uses both alphabets, with Cyrillic predominating in official (and Latin in everyday) communication.[citation needed]

As late as 1492, the Latin alphabet was limited primarily to the languages spoken in Western, Northern, and Central Europe. The Orthodox Christian Slavs of Eastern and Southeastern Europe mostly used Cyrillic, and the Greek alphabet was in use by Greek-speakers around the eastern Mediterranean. The Arabic alphabet was widespread within Islam, both among Arabs and non-Arab nations like the Iranians, Indonesians, Malays, and Turkic peoples. Most of the rest of Asia used a variety of Brahmic alphabets or the Chinese script.

Latin alphabet world distribution. The dark green areas show the countries where this alphabet is the sole main script. The light green areas show the countries where the alphabet co-exists with other scripts. Note that the Latin alphabet is sometimes extensively used even in areas coloured grey, due to the use of unofficial second languages (e.g. French in Algeria or English in Egypt) and Latin transliterations of the official language (practised to some degree in most countries with a non-Latin alphabet, e.g. pinyin in China).

Over the past 500 years, the Latin alphabet has spread around the world, to the Americas, Oceania, and parts of Asia, Africa, and the Pacific with European colonization, along with the Spanish, Portuguese, English, French, Swedish and Dutch languages. The Latin alphabet is also used for many Austronesian languages, including the languages of the Philippines, and the official Malaysian and Indonesian languages, replacing earlier Arabic and indigenous Brahmic alphabets. Some glyph forms from the Latin alphabet served as the basis for the forms of the symbols in the Cherokee syllabary developed by Sequoyah; however, the sounds of the final syllabary were completely different. L. L. Zamenhof used the Latin alphabet as the basis for the alphabet of Esperanto. The Latin alphabet was chosen for the Ido language due to its global predominance.

In the late nineteenth century, the Romanians adopted the Latin alphabet, primarily because Romanian is a Romance language. The Romanians were predominantly Orthodox Christians, and their Church had promoted Cyrillic prior to that.

Under French rule and Portuguese missionary influence, the Latin alphabet was adapted for writing the Vietnamese language, which had previously used Chinese-like characters.

In 1928, as part of Kemal Atatürk's reforms, Turkey adopted the Latin alphabet for the Turkish language, replacing the Arabic alphabet. Most of the Turkic-speaking peoples of the former USSR, including Tatars, Bashkirs, Azeris, Kazakhs, Kyrgyz and others, used the Latin-based Uniform Turkic alphabet in the 1930s, but in the 1940s all those alphabets were replaced by Cyrillic. After the collapse of the Soviet Union in 1991, several of the newly independent Turkic-speaking republics, namely Azerbaijan, Uzbekistan, and Turkmenistan, as well as Romanian-speaking Moldova, officially adopted the Latin alphabet for Azeri, Uzbek, Turkmen, and Romanian respectively. Kyrgyzstan, Tajikistan, and the breakaway region of Transnistria kept the Cyrillic alphabet, chiefly due to their close ties with Russia. During the 1930s and 1940s, the majority of Kurds throughout the Kurdistan region replaced the Arabic alphabet with two forms of the Latin alphabet for writing the Kurdish language.

Although the only official Kurdish government, located in Iraq, uses the Arabic alphabet for public documents, the Latin alphabet remains widely used throughout the region by the majority of Kurdish speakers.

Extensions

In the course of its use, the Latin alphabet was adapted for use in new languages, sometimes representing phonemes not found in languages that were already written with the Roman characters. To represent these new sounds, extensions were therefore created, be it by adding diacritics to existing letters, by joining multiple letters together to make ligatures, by creating completely new forms, or by assigning a special function to pairs or triplets of letters. These new forms are given a place in the alphabet by defining an alphabetical order or collation sequence, which can vary with the particular language.

Ligatures

A ligature is a fusion of two or more ordinary letters into a new glyph or character. Examples are ⟨Æ/æ⟩ (from ⟨AE⟩, called "ash"), ⟨Œ/œ⟩ (from ⟨OE⟩, sometimes called "oethel"), the abbreviation ⟨&⟩ (from Latin et "and"), and the German symbol ⟨ß⟩ ("sharp S" or "Eszett", from ⟨ſz⟩ or ⟨ſs⟩, that is, ⟨ſ⟩, the archaic medial form of ⟨s⟩, followed by a ⟨z⟩ or ⟨s⟩).

Wholly new letters

Some examples of new letters added to the standard Latin alphabet are the Runic letters wynn ⟨Ƿ/ƿ⟩ and thorn ⟨Þ/þ⟩, and the letter eth ⟨Ð/ð⟩, which were added to the alphabet of Old English. Another Irish letter, the insular g, developed into yogh ⟨Ȝ/ȝ⟩, used in Middle English. Wynn was later replaced with the new letter ⟨w⟩, eth and thorn with ⟨th⟩, and yogh with ⟨gh⟩. Although the four are no longer part of the English or Irish alphabets, eth and thorn are still used in the modern Icelandic and Faroese alphabets.

Some West, Central and Southern African languages use a few additional letters which have a similar sound value to their equivalents in the IPA. For example, Adangme uses the letters ⟨Ɛ/ɛ⟩ and ⟨Ɔ/ɔ⟩, and Ga uses ⟨Ɛ/ɛ⟩, ⟨Ŋ/ŋ⟩ and ⟨Ɔ/ɔ⟩. Hausa uses ⟨Ɓ/ɓ⟩ and ⟨Ɗ/ɗ⟩ for implosives, and ⟨Ƙ/ƙ⟩ for an ejective. Africanists have standardized these into the African reference alphabet.

Digraphs and trigraphs

Main articles: Digraph and Trigraph

A digraph is a pair of letters used to write one sound or a combination of sounds that does not correspond to the written letters in sequence. Examples are ⟨ch⟩, ⟨rh⟩, ⟨sh⟩ in English, or the Dutch ⟨ij⟩ (note that ⟨ij⟩ is capitalized as ⟨IJ⟩ or the ligature ⟨Ĳ⟩ and sometimes as the single letter ⟨Y⟩ despite it being a different letter, but never as ⟨Ij⟩, and that in handwriting it often takes the appearance of a ligature ⟨ĳ⟩ very similar to the letter ⟨ÿ⟩). A trigraph is made up of three letters, like the German ⟨sch⟩, the Breton ⟨c’h⟩ or the Milanese ⟨oeu⟩. In the orthographies of some languages, digraphs and trigraphs are regarded as independent letters of the alphabet in their own right. The capitalization of digraphs and trigraphs is language-dependent: either only the first letter is capitalized, or all component letters are capitalized together (even for words written in titlecase, where letters after the digraph or trigraph are left in lowercase).
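The Dutch ⟨ij⟩ rule described above can be sketched in a few lines of Python. The function name capitalize_dutch is hypothetical and the rule is deliberately simplified: it only handles a word-initial ⟨ij⟩ and ignores every other aspect of Dutch capitalization.

```python
def capitalize_dutch(word: str) -> str:
    """Capitalize the first letter of a word, treating a leading 'ij' as one unit.

    Simplified illustration: only the word-initial digraph is handled.
    """
    if word[:2].lower() == "ij":
        # Both component letters are capitalized together: IJ, never Ij.
        return "IJ" + word[2:]
    return word[:1].upper() + word[1:]


print(capitalize_dutch("ijsselmeer"))  # IJsselmeer
print(capitalize_dutch("amsterdam"))   # Amsterdam
print("ijsselmeer".capitalize())       # Ijsselmeer (the generic rule gets it wrong)
```

A language-unaware routine such as Python's built-in str.capitalize treats ⟨i⟩ and ⟨j⟩ as separate letters, which is why the last line produces the form that Dutch orthography avoids.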

Diacritics

The letter ⟨a⟩ with an acute diacritic.

A diacritic, in some cases also called an accent, is a small symbol which can appear above or below a letter, or in some other position, such as the umlaut sign used in the German characters ⟨ä⟩, ⟨ö⟩, ⟨ü⟩. Its main function is to change the phonetic value of the letter to which it is added, but it may also modify the pronunciation of a whole syllable or word, or distinguish between homographs. As with letters, the value of diacritics is language-dependent.
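In Unicode, many letters with diacritics can be represented either as a single precomposed character or as a base letter followed by a combining mark. The following Python sketch uses the standard unicodedata module to separate the two; the helper name split_diacritics is hypothetical and chosen only for this illustration.

```python
import unicodedata


def split_diacritics(ch: str):
    """Return the base letter(s) and the names of any combining marks."""
    # NFD (canonical decomposition) splits a precomposed letter into
    # its base letter plus combining marks where such a split exists.
    decomposed = unicodedata.normalize("NFD", ch)
    base = "".join(c for c in decomposed if not unicodedata.combining(c))
    marks = [unicodedata.name(c) for c in decomposed if unicodedata.combining(c)]
    return base, marks


print(split_diacritics("ä"))  # ('a', ['COMBINING DIAERESIS'])
print(split_diacritics("á"))  # ('a', ['COMBINING ACUTE ACCENT'])
print(split_diacritics("a"))  # ('a', [])
```

The decomposition says nothing about phonetic value. As noted above, what a given mark means depends on the language in which it is used.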

Collation

Modified letters such as the symbols ⟨å⟩, ⟨ä⟩, and ⟨ö⟩ may be regarded as new individual letters in themselves, and assigned a specific place in the alphabet for collation purposes, separate from that of the letter on which they are based, as is done in Swedish. In other cases, such as with ⟨ä⟩, ⟨ö⟩, ⟨ü⟩ in German, this is not done, letter-diacritic combinations being identified with their base letter. The same applies to digraphs and trigraphs. Different diacritics may be treated differently in collation within a single language. For example, in Spanish the character ⟨ñ⟩ is considered a letter, and sorted between ⟨n⟩ and ⟨o⟩ in dictionaries, but the accented vowels ⟨á⟩, ⟨é⟩, ⟨í⟩, ⟨ó⟩, ⟨ú⟩ are not separated from the unaccented vowels ⟨a⟩, ⟨e⟩, ⟨i⟩, ⟨o⟩, ⟨u⟩.
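The three collation behaviours described above can be approximated with custom sort keys. This is a minimal Python sketch and not a full collation implementation: the function names are hypothetical, the Swedish alphabet string covers only lowercase letters, and real collation rules (such as those in locale libraries) handle many more cases.

```python
import unicodedata

# Swedish: å, ä, ö are separate letters placed after z.
SWEDISH_ALPHABET = "abcdefghijklmnopqrstuvwxyzåäö"
SWEDISH_RANK = {ch: i for i, ch in enumerate(SWEDISH_ALPHABET)}


def swedish_key(word: str):
    word = unicodedata.normalize("NFC", word.lower())
    # Characters outside the alphabet sort after everything else.
    return [SWEDISH_RANK.get(ch, len(SWEDISH_ALPHABET)) for ch in word]


def base_letter_key(word: str) -> str:
    """German-style: letter-diacritic combinations sort as their base letter."""
    decomposed = unicodedata.normalize("NFD", word.lower())
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def spanish_key(word: str):
    """Spanish-style: ñ is a letter between n and o; accented vowels are not."""
    key = []
    for ch in unicodedata.normalize("NFC", word.lower()):
        if ch == "ñ":
            key.append(("n", 1))  # directly after plain n
        else:
            key.append((unicodedata.normalize("NFD", ch)[0], 0))
    return key


print(sorted(["öl", "zebra", "ål", "apa", "ärta"], key=swedish_key))
# ['apa', 'zebra', 'ål', 'ärta', 'öl']
print(sorted(["Zebra", "Ärger", "Ofen", "Apfel"], key=base_letter_key))
# ['Apfel', 'Ärger', 'Ofen', 'Zebra']
print(sorted(["oso", "ñu", "nube", "nácar"], key=spanish_key))
# ['nácar', 'nube', 'ñu', 'oso']
```

Sorting the same lists by raw code point would place every word beginning with a modified letter after "zebra", which matches none of the three conventions.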

Romanization

Words from languages natively written with other scripts, such as Arabic or Chinese, are usually transliterated or transcribed when embedded in Latin text or in multilingual international communication, a process termed Romanization.

Whilst the Romanization of such languages is used mostly at unofficial levels, it has been especially prominent in computer messaging where only the limited 7-bit ASCII code is available on older systems. However, with the introduction of Unicode, Romanization is now becoming less necessary. Note that keyboards used to enter such text may still restrict users to Romanized text, as only ASCII or Latin-alphabet characters may be available.
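The 7-bit ASCII limit mentioned above can be made concrete with a short Python check. The helper name fits_in_ascii is hypothetical; the example uses the Chinese name 北京 and its pinyin romanization, with and without tone marks.

```python
def fits_in_ascii(text: str) -> bool:
    """True if every character has a code point below 128 (7-bit ASCII)."""
    return all(ord(ch) < 128 for ch in text)


print(fits_in_ascii("Beijing"))  # True:  unaccented Latin letters only
print(fits_in_ascii("Běijīng"))  # False: the tone marks fall outside ASCII
print(fits_in_ascii("北京"))      # False: Chinese characters fall outside ASCII
```

Only the romanization stripped of diacritics survives on an ASCII-only system, which is one reason unaccented romanized forms were common in early computer messaging.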

English alphabet

As used in modern English, the Latin alphabet consists of the following characters:

Majuscule Forms (also called uppercase or capital letters)
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Minuscule Forms (also called lowercase or small letters)
a b c d e f g h i j k l m n o p q r s t u v w x y z

In addition, the ligatures ⟨Æ⟩ of ⟨A⟩ with ⟨E⟩ (e.g. "encyclopædia"), and ⟨Œ⟩ of ⟨O⟩ with ⟨E⟩ (e.g. "cœlacanth") may be used, optionally, in words derived from Latin or Greek, and the diaeresis mark is sometimes placed on the letters ⟨o⟩, ⟨i⟩ and ⟨e⟩ (e.g. "coöperate", "naïve" or "preëxisting") to indicate the pronunciation of ⟨oo⟩, ⟨ai⟩ or ⟨ee⟩ as two distinct vowels, rather than a single long one. Hyphenation may also be used, to avoid having to type accented characters: "co-operate" or "pre-existing". Outside of professional papers on specific subjects that traditionally use ligatures in loanwords, however, ligatures and diaereses are seldom used in modern English. Note, however, that some fonts for typesetting English contain commonly used ligatures, such as for ⟨tt⟩, ⟨fi⟩, ⟨fl⟩, ⟨ffi⟩, and ⟨ffl⟩. These are not independent letters, but rather allographs.
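Unicode reflects the distinction drawn above between typographic ligatures and letters such as ⟨æ⟩. As a rough Python sketch (the function name expand_presentation_ligatures is hypothetical), compatibility normalization (NFKC) expands the presentation ligature ⟨ﬁ⟩ back into its component letters but leaves ⟨æ⟩ untouched. Note that NFKC also rewrites other compatibility characters, so it is a blunt tool if only ligatures are of interest.

```python
import unicodedata


def expand_presentation_ligatures(text: str) -> str:
    """Apply NFKC, which replaces compatibility ligatures with plain letters."""
    return unicodedata.normalize("NFKC", text)


print(unicodedata.name("\ufb01"))                     # LATIN SMALL LIGATURE FI
print(expand_presentation_ligatures("\ufb01sh"))      # fish
print(expand_presentation_ligatures("encyclopædia"))  # encyclopædia (unchanged)
```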

Latin alphabet and international standards

By the 1960s it became apparent to the computer and telecommunications industries in the First World that a non-proprietary method of encoding characters was needed. The International Organization for Standardization (ISO) encapsulated the Latin alphabet in its ISO/IEC 646 standard. To achieve widespread acceptance, this encapsulation was based on popular usage. As the United States held a preeminent position in both industries during the 1960s, the standard was based on the already published American Standard Code for Information Interchange, better known as ASCII, which included in the character set the 26 × 2 letters of the English alphabet. Later standards issued by the ISO, for example ISO/IEC 10646 (Unicode Latin), have continued to define the 26 × 2 letters of the English alphabet as the basic Latin alphabet, with extensions to handle other letters in other languages.
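The 26 × 2 letters and their ASCII positions can be inspected directly in Python, whose ord function returns Unicode code points that coincide with ASCII for these characters. The variable names below are chosen for this illustration only.

```python
import string

UPPER = string.ascii_uppercase  # 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
LOWER = string.ascii_lowercase  # 'abcdefghijklmnopqrstuvwxyz'

print(len(UPPER) + len(LOWER))  # 52, i.e. 26 x 2 letters
print(ord("A"), ord("Z"))       # 65 90
print(ord("a"), ord("z"))       # 97 122

# Every basic Latin letter fits in 7 bits, and each lowercase letter
# sits exactly 32 positions after its uppercase counterpart.
print(all(ord(ch) < 128 for ch in UPPER + LOWER))               # True
print(all(ord(l) - ord(u) == 32 for u, l in zip(UPPER, LOWER)))  # True
```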


References

  1. Haarmann 2004, p. 96
