Language desk


July 21

Why does Wikipedia use Latin letters?

I have been thinking about this for the last few hours while reading Wikipedia. It is a fluke of nature that our world has chosen to use these symbols to convey meaning, as they are completely arbitrary. I cannot stop looking at the words and letters; they seem to have their own style or character, but when I think about it in these terms it all becomes so alien and strange. How did it come to be that people and languages the world over write with Latin letters? Has the English Wikipedia considered employing another system of writing or sound? technical restrictions 01 00:00, 21 July 2022 (UTC)

English Wikipedia uses the Roman alphabet for the same reason the Russian Wikipedia uses the Cyrillic alphabet. --<-Baseball Bugs ^{What's up, Doc?} carrots-> 01:43, 21 July 2022 (UTC)[reply]

We have an article on that: History of the alphabet. --Amble (talk) 02:22, 21 July 2022 (UTC)[reply]

Technicalrestrictions01 -- Defamiliarization can sometimes be personally interesting, but the English language has almost always been written with the Latin alphabet (with very marginal intermittent exceptions) since about 800 A.D., so why would this change at Wikipedia? The only other predominant writing system for English was Anglo-Saxon runes before about 800 A.D., but that was only used for short inscriptions (not for extended literary texts)... AnonMoos (talk) 05:36, 21 July 2022 (UTC)[reply]

You want to bring rune upon Wikipedia? Clarityfiend (talk) 07:19, 21 July 2022 (UTC) [reply]

Any writing system, be it the alphabet, το αλφάβητο, the ᚠᚢᚦᚩᚱᚳ, or any of the others that exist worldwide, would be equally arbitrary if it had not been for the long historical connection between language and writing system. For languages that are standardized for the first time, one will have a real choice (which still will be influenced by what is practical and useful for communication with others). --T*U (talk) 08:10, 21 July 2022 (UTC)

ASCII, in its original 7-bit form, was built around the assumption that the 26-letter Latin alphabet was normative; add lower case, numerals and some punctuation and that was all you needed. All additional characters, such as accented letters, digraphs, currency symbols other than $, and all the world of non-Latin characters, have been more or less a problem to be overcome. Code pages, 8-bit extensions of ASCII, and Unicode have made it possible to use anything from £ signs (and I have still had recent problems with communicating £ signs in my professional work!) to Tolkienian alphabets, but sometimes it is still easier to stick to the 26 'Latin' characters. -- Verbarson ^talk_edits 10:57, 21 July 2022 (UTC)
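As a rough illustration of the £ problem described above, here is a minimal Python sketch. The helper name `try_encode` and the particular encodings compared (`cp437` and `latin-1` standing in for "code pages") are choices made for this example, not anything named in the thread. It checks whether the pound sign has a slot in each encoding, and what happens when bytes written under one code page are read under another.

```python
from typing import Optional

POUND = "\u00a3"  # the pound sign, U+00A3

def try_encode(text: str, encoding: str) -> Optional[bytes]:
    """Return the encoded bytes, or None if the encoding cannot represent the text."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        return None

# Compare 7-bit ASCII with an old DOS code page, Latin-1, and UTF-8.
results = {
    name: try_encode(POUND, name)
    for name in ("ascii", "cp437", "latin-1", "utf-8")
}

# Bytes written under one code page but read under another decode
# to a different character (the classic source of garbled currency signs).
misread = POUND.encode("latin-1").decode("cp437")

for name, data in results.items():
    print(name, data)
print("latin-1 bytes read as cp437:", repr(misread))
```

The byte value for £ differs between the two code pages, and ASCII has no value for it at all, which is one plausible way such a character gets lost or swapped in transit.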

The fact that "the 26-letter Latin alphabet... [with] lower case, numerals and some punctuation was all you needed" was well established, as typewriters fitting that description had been in common use for decades. It's only those weird people in foreign countries or who needed to write foreign languages or something who needed accents, foreign alphabets, foreign currency signs, and all that. (That is, the A in ASCII is for American; it wasn't intended to be an international standard). --174.95.81.219 (talk) 08:23, 22 July 2022 (UTC)[reply]

The earlier Baudot code, invented by a Frenchman, also included only the 'Roman' letters; there appears to be no allowance for accented characters. Its restriction to 5 bits/32 characters may have influenced this. It was not intended for use with a 'typewriter' keyboard. (Later modifications are shown for Cyrillic characters.) -- Verbarson ^talk_edits 10:00, 22 July 2022 (UTC)[reply]
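The arithmetic behind the "5 bits/32 characters" remark can be sketched as follows. This is only a counting exercise under a simplifying assumption: it counts one case of the 26 letters and ignores the shift characters that historical teleprinter codes used to reuse the same bit patterns for figures. The function names are made up for this example.

```python
LATIN_LETTERS = 26  # one case of the basic Latin alphabet

def code_points(bits: int) -> int:
    """Number of distinct bit patterns available in a fixed-width code."""
    return 2 ** bits

def spare_after_letters(bits: int) -> int:
    """Patterns left over once each of the 26 letters has one pattern."""
    return code_points(bits) - LATIN_LETTERS

for bits in (5, 7, 8):
    print(bits, "bits:", code_points(bits), "patterns,",
          spare_after_letters(bits), "left after 26 letters")
```

With 5 bits there is very little room left for anything beyond the bare letters, while 7 bits leaves space for lower case, digits, punctuation and control codes, but still not for every accented letter of every language.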

174.95.81.219 -- The framework of ISO/IEC 646 allowed a limited number of diacritic letters to be represented in a nationally-localized 7-bit character set, and ASCII was the U.S. national localized form of ISO/IEC 646... AnonMoos (talk) 21:55, 22 July 2022 (UTC)[reply]

Sure. But ASCII (1) came first, and (2) is what Verbarson was talking about. --174.95.81.219 (talk) 23:39, 22 July 2022 (UTC)[reply]

In the modern era, previously unwritten languages often tend to be written in the Latin alphabet (possibly with certain additional IPA characters), since it is considered more international, in my impression. If there isn't any other major script around the area, it seems uncommon to have a new script made out of thin air. 惑乱 Wakuran (talk) 12:42, 21 July 2022 (UTC)

In many cases, the choice of writing system has been a political and/or cultural statement. When Turkey decided to use the Latin alphabet instead of Arabic script in the 1920s, it was part of a campaign for connecting the new Turkish state to the western world. Similarly, the writing systems of Azerbaijani, Turkmen and Uzbek have been changed to Latin after 1991, while Tajik may also be written with Perso-Arabic script. Also, the choice of Latin and/or Cyrillic for the languages of the countries of former Yugoslavia has clear political connotations. Among smaller languages, there have been attempts over the years to standardize the Pomak language in Latin, Cyrillic and Greek script. No prize for guessing which script was proposed by linguists of which country. --T*U (talk) 11:50, 22 July 2022 (UTC)

Wakuran -- Some previously-unwritten languages in the Soviet Union (or languages previously written very inadequately with the Arabic alphabet) were given Latin-based alphabets in the 1920s, which were then changed to Cyrillic-based alphabets under Stalin. Some of those not mainly spoken in Russia switched back to Latin again after the end of the Soviet Union... AnonMoos (talk) 21:55, 22 July 2022 (UTC)[reply]

True, though see Canadian Aboriginal syllabics for an extremely successful exception. Matt Deres (talk) 12:32, 23 July 2022 (UTC)[reply]

Βι κουντ γιουζ ντα Γρικ άλφαμπετ, μπατ γιου μαϊτ χεβ α χαρντ ταϊμ αντερστάντινγ ιτ. --Λαμπιαμ

Νοτ ιφ γιου αρ Γρικ. --T*U (talk) 14:32, 21 July 2022 (UTC)[reply]

Modern Greek is an amazingly terrible choice for transcribing things, but there are worse things. 好餓包特柴尼思？ —Kusma (talk) 10:43, 22 July 2022 (UTC)[reply]

English Wikipedia naturally uses the standard written script for the English language. Are you wondering why English uses Latin letters? Then, it has already been answered above. 惑乱 Wakuran (talk) 12:42, 21 July 2022 (UTC)[reply]

Do you want to use the Arabic script instead? Persian is another Indo-European language that uses it so it should fit English really well. One YouTuber (subscribers appreciated) called Language Simp said that you’re a gigachad if you learn it without the vowel diacritics. 193.210.175.209 (talk) 14:20, 21 July 2022 (UTC)[reply]

The Arabic script seems quite ill-fitted to a vowel-heavy language such as English, I'd say. It would require heavy modification. 惑乱 Wakuran (talk) 16:23, 21 July 2022 (UTC)[reply]

Honestly, the Latin alphabet isn't much better. English has around 45 distinct phonemes, and 26 letters with no diacritical or accent marks are actually VERY poorly suited to representing them. A well-designed alphabet would have a roughly 1-1 correspondence between symbol and phoneme. English falls far short of that. English orthography is notoriously confusing; there are hundreds of memes out there demonstrating exactly how bad it is. --Jayron 32 17:06, 21 July 2022 (UTC)

I think it was pretty good a few centuries back, though... ;) 惑乱 Wakuran (talk) 21:04, 21 July 2022 (UTC)[reply]

Except it wasn't pretty good a few centuries back. It was picked because the smart people spoke Latin, and they decided to press the Latin script into use in vernacular languages, not because it was "pretty good" but because it was Latin. --Jayron 32 11:56, 22 July 2022 (UTC)

The other Germanic languages have about the same number of phonemes as English, although not exactly the same set, and although most use some diacritical marks, none are heavy users of such marks. With a small number of digraphs and some clever rules about doubling or singling letters you can get a long way. Dutch and Norwegian have a far less confusing orthography than English. So obviesli, wi coud rajt Inglisj with e far simpler otthoggrafi than currentli juuzd, but ther is thi signifikant problem thet diffrent dajjalekts of Inglisj woud prifur diffrent otthoggrafiis. Mebi it's tajm tu offisjalli split it intu multipel lengwidzjez. Baj the wee, Aj hev thi impressjen thet ol unstrest vauwels in Inglisj saund abaut thi seem, so it isn't thet vauwel-hevvi. Inglisj vauwel reduksjen apeers tu bi pretti ekstriim. Or is it maj impressjen es e non-netiv spiker? (Apply Standard Average European pronunciation rules to the above and it should be clear. It may not be entirely consistent; I haven't worked out all the details.) PiusImpavidus (talk) 18:22, 22 July 2022 (UTC)

When I read the above, my mental ear hears Mr. Mark Rutte speaking. --Lambiam 19:55, 22 July 2022 (UTC)[reply]

Obviously. I used mostly Dutch spelling rules for the above, using the regular Dutch spellings for the Dutch phonemes most similar to the English phonemes. Dutch is closely related to English, has a similar number of consonant and vowel phonemes, uses no accent marks in writing (except some loans) and has a fairly simple spelling. If you use Dutch rules to pronounce the above, you get English with a Dutch accent. Which isn't too bad; Mark Rutte has a pronounced accent when speaking English, but is easy to follow. The same applies to Nordic accents. Now, shift the sounds a little bit, diphthongise the stressed vowels slightly, reduce the unstressed ones and you get English with a regular spelling in Latin alphabet without diacritical marks, QED. PiusImpavidus (talk) 07:52, 26 July 2022 (UTC)[reply]

Jayron32 -- There are at least two past approaches to having a separate letter for most or all English phonemes which went beyond the stage of being merely personal theoretical proposals, and achieved some publicity and a limited degree of real-world use: The Shavian alphabet and the Initial Teaching Alphabet. Neither one set the world on fire, and they're more or less curiosities now. (Some would add Unifon as a third.) AnonMoos (talk) 21:55, 22 July 2022 (UTC)[reply]

Note that Mustafa Kemal Atatürk changed the writing system of the Turkish language from Arabic script to Latin in 1928, on the grounds that it would be easier for illiterate people to master. Alansplodge (talk) 11:46, 22 July 2022 (UTC)[reply]

As I have understood it, it was improved orthographically in the process. There were fewer minimal pairs that weren't separated in writing. 惑乱 Wakuran (talk) 12:41, 22 July 2022 (UTC)[reply]

Not just minimal pairs. Ottoman Turkish alphabet § Sound–letter correspondence gives this striking example: the Turkish words written today gevrek (biscuit), kürk (fur), kürek (shovel), körük (bellows) and görek (view), the modern orthography of which reflects their different pronunciations, all used to be written كورك kwrk. (I am not familiar with the word görek; it may have fallen into disuse.) --Lambiam 19:50, 22 July 2022 (UTC)

Speaking of symbols being arbitrary, one writing system that is mostly non-arbitrary is Hangul: a good proportion of the symbols either resemble the shape of the tongue when it makes the sound or are derived from the symbols of related sounds. --Theurgist (talk) 03:13, 22 July 2022 (UTC)[reply]

The usual term to describe it is "Featural alphabet" or Featural writing system. In the 19th century, Visible Speech was a well-known attempt at such a system, but is quite obscure now... AnonMoos (talk)