Wikipedia talk:Manual of Style (Quotation marks and apostrophes)


WikiProject Manual of Style
This redirect falls within the scope of WikiProject Manual of Style, a drive to identify and address contradictions and redundancies, improve language, and coordinate the pages that form the MoS guidelines.

This page is for discussing the existing policy of prohibiting typographically correct quotation marks. The WP:MOS section in question is Wikipedia:Manual of Style#Use straight quotation marks and apostrophes. For some background, see the Wikipedia entries on apostrophe (mark) and quotation mark.

I have copied the discussion from WP:MOS below, and moved it to its own page (this one) to give the debate more focus. The proposal below, by User:Susvolans, and supported by others including myself is to

  • drop the requirement to use straight quotes.

For example, the section in WP:MOS could be simply removed without replacement.

Things to think about:

  • Should Wikipedia express preference of one style of quotation marks or the other? For example, the MOS could favour proper marks, but cheerfully accept straight ones. (I am undecided on this issue.)
  • The most immediate effect of allowing proper quotes would be that the titles of many articles can change. This would improve the typographic quality of Wikipedia a lot (titles are set in large type, so even Windows users would notice the change, which seems to be otherwise hidden to them in the default font and magnification). Because of the way English forms possessives, there are lots and lots of articles that could (and should, IMO) be moved. Mother's day to Mother’s day, etc. I will be overjoyed when that happens, but I also predict that this will be the greatest point of debate as soon as proper quotes are allowed, and we might as well discuss it here and now.

There are some related issues that are more complicated and which we can discuss as well, even though I believe they remove attention from the proposal itself:

  • Should Wikimedia support this, and if so: how? For example, should the editor implement some kind of “smart quotes” akin to many text processing applications? Should there at least be buttons? Or should some yet-to-be-invented combination of characters produce the proper quotes? (I think not, because I am on a Mac and quite used to entering my own quotes. Others may disagree.)

To pre-empt some arguments: For a Wikipedia that supports proper quotes, check the German Wikipedia. As far as I can see, the issue of straight quotes isn’t even mentioned in the German MOS, and the whole project seems to work wonderfully anyway. I assume that people who don’t care or aren’t able to enter the proper quotes just use straight quotes, and other friendly souls with too much time on their hands change them. (German keyboards don’t have curly quotes either, and they run the same Wikimedia software as the English version does.) The situation seems to be the same for French Wikipedia (but I am not sure). Arbor 07:03, 3 Jun 2005 (UTC)

History of the current rule

I have had a look at how this rule came about, and what discussion preceded or followed it. The current rule in the MoS is this, basically:

For uniformity and to avoid complications use straight quotation marks and apostrophes ( ' " ) not curved (smart) ones, grave accents or backticks ( ‘ ’ “ ” ` ).

It was added on 11 Apr 2003 by User:Patrick. Here is the MoS before his addition. There were some formatting issues, and the rule introduced some problems with people who use MS Word (which inserts curlies using a "smart quote" feature), so a number of subsequent edits within the next hours fleshed out the section, which then looked like this, still on 11 Apr 2003:

For uniformity and to avoid complications use straight quotation marks and apostrophes: ' " not curved (smart) ones or the "backtick": ‘ ’ “ ” `

If you are pasting text from Microsoft Word, remember to turn off the smart quotes feature, unmark this feature in AutoEdit and "AutoEdit during typing"!

This is basically the same as it appears today. (Some more complications with "smart quotes" are addressed, and there is a long paragraph about the ʻokina letter.)

Here is the talk page from that day (actually, the next day). As far as I can see, there is no mention of the rule here or anywhere else, let alone a discussion or motivation. From the lack of reaction it seems like a consensus decision at the time that was never questioned. (Maybe somebody else is better at wading through the archives.)

There are quite a few discussions about quotation marks later on, but they are all concerned with the placement of quotes in relation to other punctuation, or with 'okina, as far as I can see. If I'm missing something, please add it here. Arbor 13:49, 15 Jun 2005 (UTC)

MediaWiki 1.5: time to drop straight quotation mark requirement?

When MediaWiki 1.5 comes out, the English language Wikipedia will switch to Unicode, and curly quotes can be put safely into the article source. I can see no good reason to keep WP:MOS#Use straight quotation marks and apostrophes after the switch. Susvolans (pigs can fly) 17:30, 9 May 2005 (UTC)

You mean apart from the fact that we have thousands upon thousands of articles that already uniformly have straight quotation marks and apostrophes? :)
If we can keep it enforced, I think I'd prefer to try to keep them all straight, jguk 18:14, 9 May 2005 (UTC)
Those of us with a taste for professional typography would prefer proper quote marks. The only things that have been preventing more widespread adoption of proper quotes have been that it makes the wiki source hard to read and they can be hard to input. With UTF-8 wikisource and the "insert special characters" box on the edit pages both of these problems are obviated. It will not be hard to implement a robot to fix all the quotes in current articles. Implementation of professional typography (including proper quotes) is one of the two major remaining issues that make Wikipedia unsuitable for professional print publication (the other being a drive to upload more images with print-quality resolution).
Definitely support officially preferring proper quotes, but of course cheerfully tolerating straight quotes. Nohat 22:04, 9 May 2005 (UTC)
The beauty of the wiki is that they're easy to edit. Introducing that would make it more difficult. violet/riga (t) 22:28, 9 May 2005 (UTC)
They would just appear as ordinary punctuation in the edit box. How is that harder to edit? Nobody would be forced to use them. Surely templates with parameters are far more complex than some curly quote marks, yet we have those... Nohat 00:56, 10 May 2005 (UTC)

Smart quotes would be a real pain for those of us who sometimes use text editors to mark up articles, as they often copy in strange ways. Jonathunder 06:07, 2005 May 10 (UTC)

That's going to happen anyway with all the other special unicode characters like dashes and Greek letters, and mathematical symbols. Nohat 07:11, 10 May 2005 (UTC)
Greek letters, mathematical symbols, and other special symbolic whatnot do not occur that often in most articles. I've even edited a few math articles in simple text editors without hitting a problem. Dashes can be a problem sometimes. Smart quotes, if not implemented with a great deal of care and consideration of cross-platform editing tools, could be a bigger problem than dashes. Jonathunder 02:06, 2005 May 12 (UTC)
I don't understand all the technical issues, but I would love to be able to use proper quotation marks. I always use proper typography on my own website and writing. I especially hate when I'm copying a quotation to Wikipedia:Press coverage and I have to change the proper typography to those ugly straight quotation marks. I sincerely hope we can find a way to implement this change as I've been waiting for it since I joined. — Knowledge Seeker 06:34, 10 May 2005 (UTC)

This sounds absolutely stupid to me. None of my keyboards have curly quotes on them, and I'm unlikely to go to the trouble of searching for a way to enter them (an attitude that I suspect will extend to the vast bulk of Wikipedia editors). Noisy | Talk 16:24, May 11, 2005 (UTC)

Tons of GUI programs have an algorithm called "smart quotes" built in to them—you type straight quotes, and the algorithm converts it to curlies if necessary based on rules as to the context of the quotes—it usually changes other punctuation characters and ligatures as well. So you type
"Hello--I'm typing 'straight quotes,' my dear AEvar."
and you get
“Hello—I’m typing ‘straight quotes,’ my dear Ævar.”
I may have missed the implications of using quote characters for italics and bold, but it should be possible to have smart quotes in the renderer, so that the wikitext keeps straight quotes but curlies get displayed in the HTML. So why not agitate for that instead? (Obviously it would have to be a language-specific feature.) Having just typed that example, I'm convinced by those who say this is too hard to type—and I'm on a Mac. It looks like on my Windows machine, I have to hold down Alt and press a sequence of numbers on the keypad—that's just ridiculous. It's definitely not wikiwiki (quick). TreyHarris 16:57, 11 May 2005 (UTC)
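The "smart quotes" behaviour described above can be sketched in a few lines. This is only a rough illustration under simple assumptions: the function name `smarten` and its rules are hypothetical, it ignores the wiki markup use of '' and ''' for italics and bold, and it does not attempt ligatures such as Æ.

```python
import re

def smarten(text: str) -> str:
    """Naive 'smart quotes' pass: a sketch, not a complete typographic engine."""
    # Two hyphens become an em dash (U+2014).
    text = text.replace("--", "\u2014")
    # A double quote at the start of the text, or after whitespace, an opening
    # bracket or a dash, opens a quotation; every other double quote closes one.
    text = re.sub(r'(^|[\s(\[\u2014])"', "\\1\u201c", text)
    text = text.replace('"', "\u201d")
    # Same rule for single quotes; the leftovers also cover apostrophes.
    text = re.sub(r"(^|[\s(\[\u2014])'", "\\1\u2018", text)
    text = text.replace("'", "\u2019")
    return text

print(smarten("\"Hello--I'm typing 'straight quotes,' my dear AEvar.\""))
```

Run on the straight-quoted sentence above, this prints the curly-quoted version, except that "AEvar" keeps its two separate letters. A renderer-side version would apply the same kind of rule when generating HTML and leave the stored wikitext untouched.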

I have to say that a lot of people feel strongly about this issue, since, as most people don't seem to realize, the use of straight quotes is a bastardization of proper punctuation that was introduced with the modern keyboard. That said, I think we should look into using the TeX style of quotes, where the apostrophe is translated into a right quote, and the backtick is translated into a left quote. That way people would just have to get used to writing,

``Hello--I'm typing 'straight quotes,' my dear AEvar.''

to get

“Hello—I’m typing ‘straight quotes,’ my dear Ævar.”

Which is just as "wikiwiki", since it only introduces two more keystrokes per quotation. Also, it is backwards-compatible with the current use of straight double quotes. The only issue would be straight single quotes, which would not translate correctly in their current usage. —Sean κ. 18:05, 11 May 2005 (UTC)

You must mean making the wikitext renderer convert (``) to (“), right? This would be going from markup in the database that is at least acceptable type-writing practice (if not typographically correct), to markup which is completely non-standard and rather ugly, to boot. Why replace a single character with two? And it doesn't address most of the problem: how to type single quotation marks, apostrophes, en dashes, em dashes, and figure dashes.
Unicode correctly solves this entire problem—just type the actual characters you want to enter. Why create an abstraction for plain text?
On a Mac it’s easy enough for anyone to type quotation marks and apostrophes without stretching their brain. I can’t believe there isn’t a single solution or add-on for Windows text fields out there in the world! Michael Z. 2005-05-11 18:52 Z
It's not non-standard if you're used to writing in TeX. But I just realized that it wouldn't work, since two apostrophes, '', already have a meaning in wiki markup. —Sean κ. 06:03, 12 May 2005 (UTC)
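The clash with existing markup is easy to see in a small sketch. The function `tex_quotes` is hypothetical and only mimics the TeX convention discussed here (two backticks open, two apostrophes close); it is not how MediaWiki or TeX actually process text.

```python
def tex_quotes(text: str) -> str:
    """Hypothetical TeX-style pass: `` opens and '' closes a double quotation."""
    text = text.replace("``", "\u201c").replace("''", "\u201d")
    # Remaining single marks: a backtick opens, an apostrophe closes.
    return text.replace("`", "\u2018").replace("'", "\u2019")

# Works as intended on TeX-style input.
print(tex_quotes("``Hello,'' she said."))

# Existing wikitext uses '' for italics, so the title gains two closing quotes.
print(tex_quotes("''Hamlet'' is a play."))
```

The second call turns the italic markers around "Hamlet" into a pair of closing double quotation marks, which is the conflict noted above.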
Oh [[deity of your choice]] no ... please noooo ... Firstly, can you imagine the overhead on the database as people decide they will change every quote in every article (and even if we tell them please don't they will) but also not everyone will be (a) browsing using a unicode-acceptable browser, (b) editing using a 'rich' text editor. I could *only* support this if every keyboard in use worldwide to edit WP had left- and right- single- and double- apostrophe keys for the direct input of these characters. Just as with the input of other special characters, errors happen because of misentry, and the javascript version below each edit box (if turned on) isn't a great deal of use as the characters are (ime) too small to see accurately too. For the avoidance of doubt, therefore, a strongly against this proposal from me. --Vamp:Willow 18:34, 11 May 2005 (UTC)
I have to agree with no here - first, not everyone thinks the curly ones look better; second, direct entry is much more difficult for many users; third, it changes the software and/or the usage in a way that makes things more complicated Trödel|talk 20:00, 11 May 2005 (UTC)
(a) is there really a browser still in use that can't deal with typographic quotation marks? Even the Lynx text-based browser displays these acceptably on an ISO-Latin or pure-ASCII display.
If typographic quotation marks display correctly, but look worse in your browser, then it sounds like an issue with poorly-designed fonts or bad font rendering. This web site's display should be aimed at working acceptably in the average browser, but let's not use the wrong character because some font has an ugly version of the right one. Curly quotes have worked in all mainstream browsers at least since 2001.
(b) I agree that no one should be required to type typographic characters that aren't standard on their keyboard layout. A smart-quotes renderer built into Wikimedia would completely eliminate any differences in typographic quotes and typewriter quotes in articles, but even without one I don't see it as a problem. Michael Z. 2005-05-11 21:26 Z

As I pointed out above, it would be perfectly possible to put the proper quote marks into the special character insert box that appears on the edit page. Then all you would have to do is click on the quote marks that you want. Do the people who oppose curly quotes in the wikitext also oppose accented letters? Nohat 21:08, 11 May 2005 (UTC)

I'm sorry but there is no way I want to click on a button to insert specific symbols when all I currently have to do is press <SHIFT><2>! I can't even begin to think how much that would slow my typing down. By all means use smart-quotes (perhaps preference-configurable), but they shouldn't be required when writing the text. violet/riga (t) 21:49, 11 May 2005 (UTC)
I certainly wouldn't want to require their use. All I want is permission to use them. — Knowledge Seeker 22:08, 11 May 2005 (UTC)
  • Strong support. Besides being "right," "smart quotes" will make the Wiki look better. To steal a phrase, Wikipedia is Not a Typewriter. --FCYTravis 22:44, 11 May 2005 (UTC)
  • Oppose. No thank you. To echo others, this would merely be an annoyance. Being able to store text in Unicode does not mean we should make it harder to edit. Now that Greek characters, etc. will appear in-line I will probably have to compose them in another window and paste them in--irritating enough for those few special cases. I will not be doing that every time I use a quote (or a dash, for that matter). Demi T/C 02:56, 2005 May 13 (UTC)
  • Support with enthusiasm. Especially, I would like to see the death of the blanket prohibition of correct quotation marks, and the implied rule against typographer’s apostrophes. A technology-induced typographical restriction in the MoS needs to go the way of the Dodo when the technology actually improves. Issues of keyboard layout are tied to operating systems and shouldn't really muddle the debate too much; many Mac users have been used to typographic punctuation signs for decades. (By analogy, Windows keyboards don't prevent Wikipedia from using proper dashes, so they shouldn't prevent proper quotation marks either.) On the other hand, I am not so sure about having the Wikimedia software support or enforce “smart quotes”. There are a gazillion issues here (for example, multilingual support) that go far beyond the relatively modest proposal of removing the no typographer’s quotes rule from the MoS. In any case, the most visible change to Wikipedia will likely not be the changed quotation marks, but the changed article titles. These are set in larger type, and often include possessive apostrophes. Indeed, these titles are probably the greatest annoyance to the typographically trained eye on Wikipedia, and I will perform a happy dance when all those straight typewriter apostrophes get replaced by the proper symbol (see apostrophes). I am confident that a bot can quickly add the necessary redirects from Mother's Day to Mother’s Day. However, this issue would likely benefit from a longer discussion in a more visible place. I am not sure how and where to start such a debate. How should we proceed? Arbor 11:35, 2 Jun 2005 (UTC)
Wading through all the above responses I find precious little commenting on the actual proposal. Most of the opposing voices are against the idea of prohibiting straight quotation marks, which nobody advocates, or against extending Wikimedia software with an engine for “smart quotes”. However, the proposal is just to allow proper typographical marks, which is currently forbidden. So, the bold thing would be to take this debate as evidence of widespread lack of opposition against allowing typographical quotation marks, and to consequently remove the paragraph from MOS. (But I’m a wimp so I don’t dare… I do already feel like quite the vigilante for using proper quotes in this paragraph.) We need a more focussed debate. I also note that the German Wikipedia uses nice curly quotes and apostrophes (and guillemets and fancy dashes and whatnot) and seems to be doing just fine, without Wikimedia help or widespread re-engineering of German keyboards. Arbor 19:46, 2 Jun 2005 (UTC)
You are quite right. — Chameleon 20:14, 2 Jun 2005 (UTC)
  • I don't see how this can be enforceable either way, but personally I would object to smart quotes since it's too easy to do them wrong - both for newbies and for many computer programs. Radiant_* 14:06, Jun 3, 2005 (UTC)
Radiant, are you just opposed to smart quotes (i.e., more-or-less-automatically inserted quotes), or are you objecting to the removal of the ban on typographer’s quotes? Arbor 14:10, 3 Jun 2005 (UTC)
  • Support The US International keyboard (which everyone should use) makes smart apostrophes ez-pz ‘’ Smart quotes probably are too, but I don't regularly use them. This is another instance where griping about keyboards and browsers from Americans looks like whining to international users who've had to put up with alternative input methods since typewriters were invented. SchmuckyTheCat 15:08, 3 Jun 2005 (UTC)
  • Oppose. There are some browsers and editors that by design or flawed default configuration don't handle editing of curly quotes correctly. (I had a problem recently with Opera until I fiddled with its assumptions about page encoding.) Users may find themselves munging pages without realizing it—Opera was inserting a question mark into the text wherever a curly quote appeared—or have to replace all curly quotes with straight quotes in any passage of text they try to edit. This isn't a Good Thing, and until we're confident that the vast, vast majority of users won't have these problems I can't support a change in policy. Note that I strongly oppose the introduction of typographer's quotes into article titles. That would result in a slew of duplicated articles from people who look under the "wrong" title, and make wikilinking that much more of a chore.--TenOfAllTrades (talk/contrib) 18:07, 3 Jun 2005 (UTC)
    • It seems I've been a victim of the problem I describe. The first time I added these remarks, I didn't realize that my browser at work still handled the typographical quotes incorrectly (I fixed it on my home machine a while ago) since it's an issue that seldom comes up here (due to the current Manual of Style provisions against the curly quotes.) Consequently, I've now been accused of violating WP:POINT and my remarks were reverted. This addition appears to render correctly now; please drop me a note if there are further issues. --TenOfAllTrades (talk/contrib) 18:07, 3 Jun 2005 (UTC)
Thank you for your comments. Most of the issues you are mentioning have to do with Unicode support, to which the English Wikipedia will switch. If you really are concerned about these issues then it is high time you made your voice heard about dashes as well, for example, as well as countless other symbols (like diacritical marks) that will be rendered as question marks in what you refer to as "some browsers". We can safely assume that all Wikipedia editors will change their external editors (if they employ such) to use Unicode for these reasons already, so I assume question marks will be inserted by vandals only. However, to be honest I think the migration to Unicode is pretty much a done deal. Thank you also for addressing my ruminations about article titles. I don’t see any issues that wouldn’t be resolved by a (cheap) redirection. I can find and link to Saint-Saëns without the diaeresis already today, I will hopefully continue to be able to do so if the hyphen is replaced by an en-dash (assuming that’s correct – who knows?) and Mother's day will be equally easy to handle. Let me repeat that all these issues seem to be handled each and every day by each and every user and editor of the German and French Wikipedias. They are doing just fine. Hypothetical technical problems that might be induced by fictitious or outdated browsing and editing software (that would choke on Wikipedia CSS anyway) really aren’t very convincing. Arbor 17:22, 3 Jun 2005 (UTC)
  • Support a policy which prefers traditional quote marks and apostrophes but doesn't force anybody to use them if they think it's too hard. Despite what chocolateboy says below, adding "smart quotes" support to the Wiki code will be a lot harder than expected and I wouldn't bother. There are too many unusual cases. The only real problem I see is with some external editors which may convert on cut-and-paste operations. I think it would be much easier to get the Wiki code to detect these than to implement smart quote conversion. Note that on the English Wiktionary we now prefer traditional quotes and apostrophes except in article titles. We use redirects and pipelinks as necessary. — Hippietrail 15:27, 6 December 2005 (UTC)
  • Oppose. The wikitext should use the typewriter quotes that are used by h2g2, Encyclopædia Britannica Online, and every website in Alexa's top 10. This is the web convention. The adoption of print conventions can then easily be handled by a preference or userscript. chocolateboy 13:37, 6 December 2005 (UTC)

Renderer for quotation marks, etc.

The wiki text renderer should definitely be extended to handle existing wiki text, and convert it to curly quotes, apostrophes, and dashes. This isn’t a trivial problem—apostrophes can mess up the apparent nesting of single quotation marks. Apostrophes can also appear in strange places, and I don’t think even Microsoft has figured out how to make ‘smart quotes’ smart enough. And there are also cases where typewriter quotes shouldn’t be converted.


  • cut ’n’ paste [apostrophes for omitted letters]
  • summer of ’05 [omitted numbers; some ‘smart’ quotes renderers put an opening single quotation mark here]
  • 6′-8″ tall [primes, or even typewriter quotes, should be used for feet and inches]
  • Latitude 49° 53′ N. [ditto for lat./long.]

Michael Z. 2005-05-11 19:05 Z
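The cases listed above can be checked against a deliberately naive rule. The sketch below assumes a hypothetical converter that treats a single quote after whitespace as an opening mark and every other single quote as a closing mark; real word processors use more elaborate heuristics, so this only shows why the cases are awkward.

```python
import re

def naive_single_quotes(text: str) -> str:
    # Opening mark at start-of-text or after whitespace, closing mark otherwise.
    text = re.sub(r"(^|\s)'", "\\1\u2018", text)
    return text.replace("'", "\u2019")

samples = ["cut 'n' paste", "summer of '05", "6'-8\" tall"]
for sample in samples:
    print(naive_single_quotes(sample))
```

The first two samples come out with an opening mark (‘n’ and ‘05) where an apostrophe belongs, and the third gets an apostrophe where a prime or a plain typewriter mark is wanted. The double quote in the third sample is left alone only because this sketch never touches double quotes.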

I don't think even Microsoft has figured out how to make smart quotes smart enough. Indeed. Just google "smart quotes" to see how much mumbling and gnashing of teeth they can cause. Jonathunder 01:57, 2005 May 12 (UTC)
I've never had a problem with the WordPerfect smart quotes. Another reason not to use Word? Mel Etitis (Μελ Ετητης) 22:40, 13 May 2005 (UTC)
I don't see "wrong" smart quotes as a particular issue, as someone who cares about the issue, or a bot, can put the right ones in, leaving the engine to do the bulk of the work. Susvolans (pigs can fly) 12:23, 24 May 2005 (UTC)
I'm surprised no one has mentioned the problem which plagues many websites: Some Rich Text editors (including the market-leading office package from you-know-who) implement their smart quotes by inserting characters such as &#146; and &#147; from the Windows character set. Along with all codes between &#0128; and &#0160;, these are prohibited codes in UTF-8, and strict browsers such as Opera show them as unknown symbols. So the wiki software would need to automatically translate these to the actual Unicode glyphs. Note also that the decimal notation has the widest support. Dramatic 04:53, 5 Jun 2005 (UTC)
As far as I am aware, the switch to Unicode in the wiki will cure this issue. Susvolans (pigs can fly) 07:00, 6 Jun 2005 (UTC)
A belated reply, but it certainly won't correct it. I've been using UTF-8 for all web pages I write for 2-3 years, and all copy sourced from Word, Wordpad and MS Publisher needs to be corrected. The codes they insert are just as illegal in UTF-8 as they were in other encodings, but it becomes more obvious since some browsers are less tolerant of illegal codes when UTF-8 is declared than in other encodings. It would be necessary to either capture and change illegal codes at input or to run a conversion bot. dramatic 12:11, 14 August 2005 (UTC)
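A conversion of the kind suggested here might look like the following sketch. It assumes the problem codes arrive as decimal numeric references in the range 128 to 159 and that the intended characters are the ones Windows code page 1252 assigns to those byte values; the function name `fix_cp1252_refs` is made up for illustration and is not part of MediaWiki.

```python
import re

def fix_cp1252_refs(html: str) -> str:
    """Replace numeric references 128-159 with the character CP1252 assigns."""
    def repl(match):
        code = int(match.group(1))
        if 128 <= code <= 159:
            try:
                return bytes([code]).decode("cp1252")
            except UnicodeDecodeError:
                # A few byte values in this range are unassigned in CP1252.
                return match.group(0)
        return match.group(0)  # leave every other reference untouched
    return re.sub(r"&#0*(\d+);", repl, html)

print(fix_cp1252_refs("&#147;It&#146;s fine,&#148; she said."))
```

The same function could run once at input time or inside a clean-up bot; references outside the range, such as &#233;, pass through unchanged.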
A few assorted objections:
  1. I use NEdit for creating large articles. It doesn't support anything beyond 8-bit ANSI or equivalent. Many other basic text editors have the same limitation.
  2. I use a US-layout keyboard under Linux/X11. I'm sure there's a compose-key combination that will produce curly quotes, but danged if I know what it is. Most people have similar problems.
  3. Smart quote parsing in MediaWiki is a bad idea -- when using smart quotes in Word, I've found about a half-dozen situations where Word chooses the wrong mark.
  4. There are still a few web browsers out there -- such as most for Win98 -- that don't support curly quotes properly. Usually, this results in such quotes being replaced by question marks.
  5. Since most editors don't know how to type curly quotes, heavily-edited articles will have a mix of straight quotes, curly quotes, and the stray question mark.
Carnildo 07:31, 3 Jun 2005 (UTC)
There is a detailed list of browser issues with Unicode wikis in German at de:Wikipedia:Umstellung auf Unicode#Getestete_Browser. Susvolans (pigs can fly) 12:31, 3 Jun 2005 (UTC)

Will the damn boxes I see instead of these "correct" apostrophes go away once we upgrade to MW 1.5? If not, I'm very much so against a change in this MoS item. Using Konqueror 3.3.2 on Mandriva Linux 2005. mav 04:21, 4 Jun 2005 (UTC)

Maveric, let me try to explain this. English Wikipedia is switching to Unicode soon. That means all the funny symbols (ß, ü, ï, Greek letters, the Arabic letters in the lower left corner for the interwiki links, dashes) will be encoded in UTF-8, which means that in a non-compliant application they become boxes or question marks or nonsense when you view or edit the code. The underlying encoding will change, and that has consequences for how Wikipedia can be viewed. (To be honest, there is a detail here about UTF-8 versus iso-latin, so there is a better chance for ë to survive than for the Hebrew letters in the interwiki links, but in any case there will be lots of question marks or boxes.) This is not what we are discussing here. Instead, we are saying that since en-Wikipedia is changing to UTF-8, there really isn't a good reason to continue to prohibit typographer's quotes. Either you have a browser/editor that allows you to read/edit UTF-8 Wikipedia (including the dashes and interwiki links and diaereses and ellipses and Greek symbols and quotation marks and apostrophes) or you don't. It's silly to allow Russian interwiki links in the lower left corner (which will be mangled by each and every editor who uses 7-bit software), but not to allow proper quotation marks. Arbor 07:35, 4 Jun 2005 (UTC)
So let me get this straight. Since we are going to UTF anyway we are going to make the situation *worse* for non-compliant browsers than it need be just to be technically correct? That is technological arrogance. --mav 15:49, 6 Jun 2005 (UTC)
I don't think so. For viewing web pages:
  • All text that's currently encoded using HTML entities will stay the same, and your browser is likely to continue to render it the same as it does now, whether it knows about Unicode (UTF-8) or not.
  • As people start to type raw UTF-8 characters instead of HTML entities, most browsers should continue to render them the same way, too. It's possible that some old browsers can render Unicode entities but not Unicode text, but I don't know of any (Even Explorer 5/Mac doesn't have a problem with this). If such a browser exists and you are using it, see if there's a way to upgrade (does Firefox run on your computer platform?).
  • If you are using software that doesn't support Unicode text at all, then all those non–ISO-8859-1 characters are already broken, and you won't perceive any difference.
For editing, the story is a bit different. MSIE 5/Mac will destroy raw UTF characters when editing a page, and it is still used by well under one percent of web users (most counters and stats packages don't even report its usage). Text editors that don't handle UTF are in the stone ages, in my opinion. We've had the technology to easily reproduce an opening quotation mark for several thousand years and it's about time that computing caught up. Michael Z. 2005-07-6 17:50 Z
In some cases, software that botches most UTF-8 text will treat typographic quotes and dashes as Windows CP1252 characters. I think Wikipedia will handle these with no problems. Michael Z. 2005-07-6 18:01 Z
See above - CP1252 contains curly quote characters at code points which are illegal in UTF-8 and in all other web encodings (ISO 8859 range, etc). So such fallback is part of the problem. dramatic 12:11, 14 August 2005 (UTC)
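The encoding mismatches discussed in this thread can be reproduced with Python's standard codecs. This is only an illustration of the byte-level behaviour; it says nothing about how any particular browser or editor of the time was implemented.

```python
quote = "\u2019"  # right single quotation mark (curly apostrophe)

# ISO 8859-1 has no code point for the curly quote, so an editor limited to
# that encoding has to substitute something, typically a question mark.
latin1_bytes = quote.encode("iso-8859-1", errors="replace")

# UTF-8 stores the character as three bytes; CP1252 stores it as one byte.
utf8_bytes = quote.encode("utf-8")
cp1252_bytes = quote.encode("cp1252")

# Software that reads the UTF-8 bytes as CP1252 shows three unrelated characters.
mojibake = utf8_bytes.decode("cp1252")

# The lone CP1252 byte is not a valid UTF-8 sequence on its own.
try:
    cp1252_bytes.decode("utf-8")
    cp1252_is_valid_utf8 = True
except UnicodeDecodeError:
    cp1252_is_valid_utf8 = False

print(latin1_bytes, utf8_bytes, cp1252_bytes, mojibake, cp1252_is_valid_utf8)
```

The first value is the question mark reported for Opera above, and the last shows why text saved as CP1252 cannot simply be relabelled as UTF-8 without conversion.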

Test your Unicode concerns

I remain pretty disappointed about the quality of the voices that oppose removing the ban on proper quotes. With a few notable exceptions, all the opposition seems to be against changing the coding system, which is not what we are discussing here.

(There are also some arguments made which seem to oppose a ban against straight quotes. Again, that is not part of the proposal. Some of us care about this. We want to allow proper quotes, just as we allow proper dashes, proper diacritical marks, proper spelling, proper grammar. If an editor doesn’t care about some of these things, that’s fine. Wiki works.)

Quotes are no bigger a problem than all the other UTF-8 encoded symbols that will appear on each and every page, especially on the multilingual interwiki links. If these things really will be a problem, then let’s bring that to the attention of the relevant people. I just don’t think so, and have little patience with arguments that use hypothetical editing environments that may cause problems.

To ground this debate (which I find misguided) I have set up a page on the MediaWiki 1.5 test suite with UTF-8 symbols and with proper quotes. [1] Point your browser to it, try to edit it. This is what Wikipedia will behave like real soon now, straight quotes or no. Tell us what you think.

And let me repeat my original plea: I would like to hear voices that argue why we (uniquely among Wikipedias, as far as I can tell) should keep the straight quotes only policy. The only arguments I’ve seen so far are about typographical consistency, anticipating an ugly mess of different quotation styles. (This isn’t normally an argument we use, see the discussion about dashes, but by all means let’s debate it.)

The current paragraph in the MoS is grounded, at least partially, in technical arguments that will be void when we move to 1.5, so the paragraph at least needs to be rewritten. Arbor 09:19, 6 Jun 2005 (UTC)

Yes this will create a bigger problem because they are *everywhere.* You can not compare quotes and apostrophes to the other things that we allow now but that break in non-compliant browsers. That is just silly. Nobody but a very small minority of the most pedantic of people care about so-called proper quotes. People who do not know about proper quotes or who do not care *will* notice many problems if they do not have a compliant browser. So little is gained and much is lost by allowing proper quotes. --mav 15:56, 6 Jun 2005 (UTC)
Mav's words reflect my own thoughts and I oppose changing the policy. Simplicity is good. — Stevie is the man! Talk | Work 18:57, Jun 13, 2005 (UTC)
Well, at the risk of sounding snarky, that comment would have been a lot more convincing had you not chosen to end it with an em-dash. A hyphen would have had the same effect, and been closer to the Wikipedia Is A Typewriter philosophy. Arbor 19:24, 13 Jun 2005 (UTC)
I reserve the right to have inconsistent beliefs. My opposition stands. — Stevie is the man! Talk | Work 19:49, Jun 13, 2005 (UTC)
Further, my position is to not make editing "any more" complicated than it is today, not to take away existing complexities. Besides, &mdash;'s add context in that they are indeed different from hyphens in meaning. Replacing straight quotes with curly quotes doesn't add any new context whatsoever, except to look "better," according to some people. — Stevie is the man! Talk | Work 19:55, Jun 13, 2005 (UTC)
No problem, I am fine with your inconsistency, and I hope you extend that sentiment to those who would prefer to allow curly quotes. (Currently, the straight quotes policy is wonderfully consistent, but I would prefer to get rid of it.) Secondly, maybe I am stubborn, but I would still like somebody to try to argue in what way we are making it more complicated by removing a rule. Out of thousands of Unicode characters, the current policy asks us to forbid exactly four. Further, the policy necessitates a passage in the Manual of Style that explains to people how to turn smart quotes off in their external editor, and Knowledge Seeker above relates how the current policy makes his life harder. So I can produce real evidence for the fact that straight quotes make life harder. On the other hand, I still haven't seen an argument that allowing curly quotes should somehow complicate matters. If you don't care, then leave them be. What could be simpler? Arbor 22:00, 13 Jun 2005 (UTC)
It's just much simpler to just enter a straight quote than the curly ones. (One key press for a single quote; shift + key press for a double quote.) Also, I am unbothered with modifying text I bring in from other sources that include the curly quote... in fact, the vast majority of content I've added has been from my own head, so quotes translation hasn't been much of an issue. Last, if the policy is lifted, then it in effect becomes an "open wound that cannot be closed" between those of us who want the simpler straight quotes and those who want the curly quotes. With a policy, a decision has been made that we can all point to. At any rate, I think the key difference between quotes and other potential uses of unicode characters is quite straightforward: quotes are far more commonly used than just about anything else (I mean, how often does one run across the use of non-quote-related unicode characters in articles? not very often in my experience), and this is coupled with the unusual aspect of the straight/curly simple/less simple dualism. — Stevie is the man! Talk | Work 23:01, Jun 13, 2005 (UTC)
As to entering quotes, that very much depends on your keyboard. I am on a Mac, and my curly quotes are just as easy as straight quotes. ALT-[ gives me (and millions of others) an opening “, ALT-] gives me (and millions of others) an opening ‘. Use SHIFT to get the corresponding closing quotes. It's been that way for decades. And if that's too difficult: just use straight quotes. No added complexity. And (to repeat myself): if you use an external editor (I don't) you might need to actively prevent it from using curlies. To answer your question about the ubiquity of Unicode characters: How often? Pretty much on each and every page, in the lower left corner. (Japanese interwiki links.) Arbor 07:59, 14 Jun 2005 (UTC)
Another thought: if the use of another Unicode character "got out of control," and the use of something simpler was available, I would expect a policy to be developed around that as well. Note that I've argued against &mdash;, &ndash;, etc. before in that double-hyphens should be automatically converted to em-dashes and that hyphens should be used in place of en-dashes, so we could remove &mdash;'s and &ndash;'s from the source. I want consistency actually, but I also want the grammatical context to be accurate. And thus, for now I support including &mdash;'s in source. But the ideal (in my view) is to keep the source as simple as possible. Note: I have no issue with straight quotes being converted to curly quotes upon some sort of official publication of the Wikipedia—perhaps this sounds like a good compromise position to go after? — Stevie is the man! Talk | Work 23:16, Jun 13, 2005 (UTC)
Well, I would love that, of course. I would much prefer to have a proper-quotes-only policy. But I can't see how we can ever reach consensus about it. So I suggest we handle this like we handled all other similar issues: by not having a normative policy about it. (See, for example, Wikipedia talk: Manual of Style (dashes) for the exact same discussion that we are having here.) You can see in this debate that there are people who would love curlies, and others who are very much opposed to them. So the only compromise position I can see is to allow both. By the way, I like your argument about simplifying the source, but all the concerns about character entities will go away when we switch to UTF-8, when “&mdash;” will be encoded as just “—”. Arbor 07:59, 14 Jun 2005 (UTC)
Links (web browser) for OSX breaks things in a number of interesting ways: [2]
  1. Quotes are converted from curly quotes to straight quotes
  2. Cyrillic is transliterated into ASCII
  3. Accented Latin-1 is decomposed into a letter and an accent mark
  4. Hebrew and Japanese are "transliterated" using some unknown method
  5. Chinese, Korean, "HI" (whatever that is), and "TH" (Klingon?) are converted to asterisks.
Carnildo 19:08, 6 Jun 2005 (UTC)
Mozilla 1.7.3 for OSX does not appear to break things with a simple edit-and-save, but it does not display "HI" or "TH" correctly. I suspect with a little work I could get it to break those.
Safari (web browser) for OSX 10.3 works properly.
Carnildo 19:16, 6 Jun 2005 (UTC)
Carnildo, thank you for your diligence. Links seems to be behaving according to spec; it does exactly what you would expect on Unicode. Not broken at all except for the Far East scripts. (You are discussing font issues, not coding systems now, by the way.) Suffice to say that quotes work as expected. Before you continue your research, please note the reference that was already posted. [3] It's the German Wikipedia's analysis of the browser support for Unicode, before they switched. Unicode support seems pretty universal. And again: we aren't discussing whether the switch to Unicode is good or feasible. Almost every browser in use, and certainly among those in use to view Wikipedia with all its fancy CSS, is going to display it just fine. Especially the curly quotes, which are trivial font-wise. Millions of Wikipedia articles already exist in Unicode, and are read by people all over the planet every day. Trust me, it's going to work even in the Anglosphere. That is why I boldly suspect that the technical issues mentioned on this page for viewing curly quotes are fictitious. (But in order to take it seriously I made a test page with such quotes; everybody can point their standard browser to that page.) If people's browsers for some reason are stuck in ISO Latin-1 or some Windows codepage (I don't know why you would force that, but some power users may have done so) then all these users will switch encoding to "Unicode" or "default/automatic" (thereby leaving the content negotiation protocol to figure this out) on Day One of viewing Wikipedia 1.5, because most of their pages will be garbled. (As would most of the Internet already, a good part of which uses Unicode.) Arbor 20:01, 6 Jun 2005 (UTC)

We can move to curly quotes and other Unicode-Characters if we block Internet Explorer from editing.

I run a wiki myself. It uses many languages. I found that the WinTelLusers and brain-dead me-too AOLers kept corrupting everything:

“And postin’ ‘¡Me too!’ like some brain-dead AOL-er”
“I should do the world a favor and cap you like Old Yeller”
“You’re just about as useless as jpegs to Helen Keller”

Weird Al Yankovic

My first reaction was to ban all WinTels, but the WinTelLusers bitched and moaned. I did ban all brain-dead AOLers (¿Do we truly want idiots so stupid as to pay three times the going rate for dialup just because they received a disk in the mail to edit?), but had to keep the WinTelLusers. I just banned all versions of Internet Explorer from editing. I point people trying to use Internet Explorer to Mozilla FireFox. I also made it a bannable offense to corrupt data. Basically, we can move to UTF-8 if we block all edits from Internet Explorer.

If we shall adopt curly quotes, ¿should we block edits from Internet Explorer for preventing corruption?:


  1. — Ŭalabio 05:20, 2005 Jun 18 (UTC)



— Ŭalabio 05:20, 2005 Jun 18 (UTC)

This is absurd

Walabio, please set up your own poll about this, on another page. It has nothing to do with what I would like to discuss here, and is also pretty offensive to a large community of WP editors. This page is not to discuss the transition to Unicode. Most importantly, there is no reason to assume that Internet Explorer should have a harder time with curly quotes or other Unicode characters than other browsers. There are hundreds of thousands of Wikipedia articles in Unicode (French, German, Japanese), and they work just fine. In these countries people didn't just throw their Wintel boxes in the trash when they started to use Unicode. All you are doing with your poll is to perpetuate the meme that there should be problems involved in adopting curlies. That is unfounded. You are spreading FUD. Arbor 07:04, 18 Jun 2005 (UTC)

Actually, MSIE/Win is pretty brain-dead about choosing a font that actually contains a particular character. We had to use SAMPA to represent pronunciation on Wikipedia for the longest time. Since template:IPA was invented to work around MSIE's deficiency for many users, the SAMPA disappeared instantly and no-one has complained. Some early Cyrillic alphabet characters still won't display on that browser, without a lot of reconfiguration work on the part of the user. And I think Unicode Plane 1 characters require registry editing to display at all.
But I agree with your sentiment: this proposal belongs on the same page as Wikipedia:Standardize spellings. Michael Z. 2005-07-6 20:40 Z
Yes, Windows works fine with all the UTF-8 encoded accented characters used to write European languages, and with an appropriate IME it does fine with non-Latin scripts. BUT, the punctuation characters are different, because most Windows apps predate UTF-8 and for reasons of backward compatibility with documents, they fall back to the Windows-1252 code page, where they occupy codes in the range which is illegal in all modern encodings (because they become control codes if converted to 7-bit). So the only way this is going to work is if all curly quotes are implemented as numeric entities. The editor can be tweaked to convert illegal codes pasted from MS Word etc. to the correct entities. However, the entities may well confuse people who are editing the article. dramatic 7 July 2005 10:05 (UTC)
Explain. You seem to be pointing to a potentially interesting scenario. So far I have discounted all such appeal to hypothetical situations as fictitious. Can you give me a concrete example of an editing situation where the curlies would fail but other UTF-8 would work? I tend to be sceptical because there are millions of Wikipedia pages that use proper quotation marks (namely the German, French, etc. versions) and I have encountered no such reports, even though Windows-based setups are just as common there as in the anglosphere. Arbor 7 July 2005 10:19 (UTC)
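
For reference, the code-page point raised above can be checked directly. The following is a minimal Python sketch, not a description of how MediaWiki or any browser actually handles input; the helper names `describe` and `repair_c1` are hypothetical. It shows where the four curly quotes sit in Windows-1252 (single bytes in the 0x80 to 0x9F range, which ISO 8859-1 and Unicode reserve for C1 control characters), how they are encoded in UTF-8, and what the numeric entities mentioned above would look like.

```python
# The four curly quotation marks under discussion: ‘ ’ “ ”
CURLY = "\u2018\u2019\u201c\u201d"


def describe(ch: str) -> dict:
    """Return the encodings of one character, as hex strings, for comparison."""
    return {
        "codepoint": f"U+{ord(ch):04X}",
        "cp1252": ch.encode("cp1252").hex(),   # one byte in 0x80..0x9F
        "utf8": ch.encode("utf-8").hex(),      # three bytes
        "entity": f"&#{ord(ch)};",             # decimal numeric entity
    }


def repair_c1(text: str) -> str:
    """Reinterpret stray C1 control characters (U+0080..U+009F) as the
    Windows-1252 characters they most likely started out as.

    Assumption: the text was Windows-1252 that got mislabelled as
    ISO 8859-1 somewhere along the way.
    """
    out = []
    for ch in text:
        if 0x80 <= ord(ch) <= 0x9F:
            try:
                out.append(bytes([ord(ch)]).decode("cp1252"))
            except UnicodeDecodeError:
                out.append(ch)  # a few byte values are undefined in cp1252
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    for ch in CURLY:
        print(describe(ch))
    print(repair_c1("\x93quoted\x94"))
```

Whether a given browser or editor of the time actually produced such mislabelled bytes is exactly the open question in this thread; the sketch only illustrates the mechanism being described.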


Someone just write a user.js file that replaces "([^\"]+?)" with “$1” and people can use it if they want :-) - Omegatron 14:22, August 8, 2005 (UTC)

See User:Chocolateboy/smart_quotes.user.js. For the inverse, see Dumb Quotes :-)
chocolateboy 16:55, 6 December 2005 (UTC)
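
The substitution Omegatron suggests can be sketched in a few lines. This is an illustrative Python version of the same regular expression, not the contents of the user script linked above; the function name `curl_double_quotes` is hypothetical.

```python
import re

# A straight double quote, the shortest run of non-quote characters,
# then another straight double quote.
PAIR = re.compile(r'"([^"]+?)"')


def curl_double_quotes(text: str) -> str:
    """Replace each matched pair of straight double quotes with curly ones."""
    return PAIR.sub(lambda m: "\u201c" + m.group(1) + "\u201d", text)


if __name__ == "__main__":
    print(curl_double_quotes('He said "hi" and then "bye".'))
    # An unpaired quote (for example an inch mark) is left alone.
    print(curl_double_quotes('a 6" nail'))
```

A purely textual substitution like this has no notion of markup: applied to raw wikitext it would also rewrite quoted attribute values such as `<ref name="x">`, so a real script would need to skip tags and templates.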


Wikipedia content can reasonably be expected to be valuable for decades to come (cf. FOLDOC and 1911 Encyclopædia Britannica). MediaWiki, Windows, Linux, IE, and Firefox are all relatively fleeting technical concerns. There is no reason that the content here won’t be useful in 2105, but if people are still using Microsoft Windows, I’ll be rolling in my grave. As such, I would argue for unapologetic whole-hog correctness for the content. I.e., I vote for Wikipedia to officially encourage copyeditors to change androgynous ASCII punctuation glyphs to their uncompromising Unicode counterparts. In terms of man-hours, Wikipedia content is already four or more orders of magnitude more valuable than the MediaWiki software and the Wikipedia hardware. Thus, if legacy content consumers can’t handle the truth in the short term, MediaWiki software can be hacked to present them a dumbed-down view. —Fleminra 23:19, August 8, 2005 (UTC)

Um...what? If content is king, you seem more concerned with typographic style over substance. I like the lowest-common-denominator approach to the text itself. A designer and typography enthusiast, I still don't like text to use anything but plain ol' straight quotes until it is published in print with specific typographic requirements. Designers and editors have to tell authors all the time not to use special characters in their word processors because we might be taking the text across platforms, into various publishing programs, and it can cause problems. Simple character encoding, using ASCII glyphs, is still a great bet. I figure the rendering software can be told to display quotes as curly quotes if that's what someone wants. DavidH 20:35, 21 December 2005 (UTC)
Substance and typographical style are orthogonal, so naturally I don't have a preference for one over the other. Let me try reductio ad absurdum with your reasoning: Some computer terminals can only handle upper case. Therefore, let's encode our master copy using only upper case (least common denominator), until it is published on a lowercase-capable medium. Of course in this case, obviously it would be better to store the data using (correct) mixed-case letters, and let the server present an ad hoc upper-case view for the legacy terminal's consumption. To me, the essence of the problem is that mixed-case text contains more information than single-case text; likewise, text containing so-called typographical quotes contains more information than text containing straight quotes. In neither case does there exist a robust process to automatically derive the information-rich version from the information-poor version ¹ -- but the inverse processes do exist. So, if conversion of straight quotes to typographical quotes is necessarily a manual process, and it strictly adds value, why should willing volunteers be disallowed from doing so? Regards, Fleminra 23:55, 21 December 2005 (UTC).
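
The asymmetry described above can be made concrete with a short Python sketch (the names `STRAIGHTEN` and `straighten` are hypothetical, and this is not how MediaWiki renders anything). Going from typographical quotes to straight quotes is a fixed character-for-character mapping; the reverse is not, because distinct inputs collapse to the same output.

```python
# Fixed mapping from the four curly quotes to their ASCII stand-ins.
STRAIGHTEN = str.maketrans({
    "\u2018": "'",   # left single quotation mark
    "\u2019": "'",   # right single quotation mark / apostrophe
    "\u201c": '"',   # left double quotation mark
    "\u201d": '"',   # right double quotation mark
})


def straighten(text: str) -> str:
    """Produce the 'dumbed-down' straight-quote view of a text."""
    return text.translate(STRAIGHTEN)


if __name__ == "__main__":
    print(straighten("\u201cMother\u2019s day\u201d"))
    # Two different typographical inputs yield the same straight output,
    # so the original cannot be recovered from the straight form alone.
    print(straighten("\u2018tis") == straighten("\u2019tis"))
```

The second comparison is the information loss in miniature: an opening single quote and an apostrophe both become `'`, and nothing in the straight text says which one was meant.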