Begin archive

The example about the pronunciation of the word "one" in different dialects looks like it's missing something, unless/until you realize that the thing at the end that looks like an em dash is actually a Chinese character. It would be clearer if it were changed to use some word with a more distinctive Chinese representation.

NPOV and the problem of nomenclature

It seems there is disagreement about (1) what to call the different dialects/languages that are referred to as "Chinese" (中文), and consequently (2) whether or not this article should be organized like one of the language pages or one of the language family pages.

Languages (语言), as commonly accepted by linguists, are separated by mutual spoken intelligibility. From the point of view of linguists, Mandarin, Cantonese, etc. are separate languages in the same group. For example, the largest and most respected language inventory, the Ethnologue, classifies "Chinese" as a language family, with all the so-called dialects as separate languages in that family. See [1]

Actually, it's not clear that they do that at all. It seems to me that Ethnologue tries as much as possible to just duck the question. They have separate entries for different types of Chinese, but they also have different entries for different variations of Yiddish. Roadrunner 06:55, 22 Apr 2005 (UTC)
Also, you might want to read the introduction to Ethnologue where they address this issue. Roadrunner 12:45, 22 Apr 2005 (UTC)

Linguists reject the notion that the nature of written language is a defining factor in what is and isn't a separate language. To linguists, Serbo-Croatian is one language, not Serbian and Croatian as separate languages because they use separate writing systems. Conversely, Mandarin and Cantonese are considered to be separate languages, even though they have a mostly mutually intelligible writing system.

Citations, please? My source (The Cambridge Encyclopedia of Language) and Nathan's the Languages of China seem to suggest the opposite. Roadrunner 06:55, 22 Apr 2005 (UTC)

I believe the argument that the Chinese languages can be classified under a consistent linguistic system as dialects of one language has very little merit.

Nevertheless, the fact remains that there is a strong tradition of classifying it that way. Not only is the common conception among Chinese people that all the varieties are part of one language, but there is a tradition in at least non-academic colloquial English usage to describe them as dialects.

The question comes down to how the varieties of Chinese should be classified on Wikipedia. While certainly both perspectives deserve to be described in context, only one can be the canonical name of the article.

The editors of the Ethnologue have a tradition of scientific objectivity and neutrality in their treatment of language classification, which has resulted in their work being highly respected by linguists. I would argue that if the Wikipedia is to deviate from the classification scheme used by the Ethnologue, not only should we have a strong rationale for doing so, we should be upfront about it.

The writers of Ethnologue themselves point out that there is no objective, rational way of classifying languages in the introduction.

It seems inappropriate for popular conceptions that are not accepted by scholars, no matter how common they might be, to be a driving force in article content. I therefore advocate that this page should be titled "Chinese languages" and should be written from the perspective that the different varieties of Chinese are different languages, while admitting to the fact that this classification system is different from the one in common use.

[P0M]: I agree. Thank you for finding the authoritative citations. I'm not sure whether the accepted term for Latin derivatives is "romance language" or "romance languages". It strikes me that most U.S. universities have a "Romance Language" department. Lucky for them they don't have to respect standards of English syntax. ;-) How about naming the page "Family of Chinese Languages"? Anyway, just because a misconception got started decades and decades ago does not mean that we should perpetuate it.

[P0M]: A certain individual from the S.W. of the U.S. went to the far north in the company of a Canadian. They saw a distant figure, apparently dressed in a fur coat and leggings. The Canadian said, "Tisaninnuitinit!" To which the U.S. person said, "Huh?" The Canadian said, "Tisaninnuitinit!!!" To me, that is a dialect difference.

[Peak:] The idea that there is a single Chinese language seems to have strong emotional appeal to many people, and I suspect that if the article were renamed to "Chinese languages" or "Family of Chinese languages" (or anything with a plural), it would become a hot issue very quickly. The outcome could well be a new article at Chinese language, even if Chinese language were made into a #REDIRECT.

[Peak:] One of the root difficulties here may be that the English word "language" can be translated into written Chinese in several ways, and the idea of there being multiple "Zhong wen" is simply nonsense.

[Peak:] One alternative would be to change the article to [Chinese language family]; perhaps a better one would be to have separate articles on:

  • [Chinese written language] (which is currently a #REDIRECT to [Chinese language])
  • [Chinese spoken languages]

In any case, I think it is far more important to ensure the accuracy of the contents of the article(s). How are we doing in that regard? I think that by using the word "variants" the article does a reasonable job, and I would rather keep "variants" than getting involved in edit wars over words like "dialects". Peak 06:27, 5 Feb 2004 (UTC)

I'm not entirely sure why this talk page has a different style for indicating who said what than the rest of Wikipedia, but that's not important. I won't mind if someone changes my remarks to conform to the style of this page, but I won't. I'm pleased that everyone seems to agree that the status quo of the pages on Chinese is not satisfactory.

After my initial post yesterday, I re-read the article and have changed my feelings somewhat. I think the use of "language variations" is a good way to avoid making a stand on languages vs. dialects. However, I think that the parts of the article that imply that there is a single "Chinese language" should be changed to reflect the English definition of the word "language", one based on mutual intelligibility. In fact, multiple articles, as you suggest, is probably the best way to go about it. Here is my proposal:

The final question involves Wikipedia:WikiProject languages. There is a template at Wikipedia:WikProject language template. Right now, we have just been applying that template to languages that are listed in the Ethnologue, as most of the statistics for the table can be gleaned from the entries in the Ethnologue. However, the Ethnologue has listings for each of the "dialects" of Chinese as separate languages, so the question is: which pages should follow the WikiProject languages template? --Nohat 21:26, 2004 Feb 5 (UTC)

This page is already linked to too many pages to be a disambiguation page. I oppose making it so. If there is a single "Chinese language", then it is the written language - discussion of the written language can go here. I think moving [Chinese dialect] to [Chinese languages] might confuse that article with the one. Either have it at [Chinese spoken language] or [Chinese spoken languages].--Jiang 21:58, 5 Feb 2004 (UTC)

[P0M] Hi, Jiang? or was it Nohat???. You can blame me for the different way of indicating who said what. It got started on another talk page where people started chopping up postings to insert their comments into the middle. Pretty soon it became impossible to determine who had said what. So I started putting my initials at the beginning of each paragraph I wrote. (Then if somebody chops into the middle of it I can fairly easily go back in and tag the bits and pieces.)
[P0M] I would suggest that we follow the way that the Chinese linguists identify and segregate things except that it seems that there are at least two competing schemes. One problem with something like "Chinese Languages" is that it sounds identical in meaning to "Languages of China," and the latter can reasonably be interpreted to mean any language spoken by any minority group that has established a presence within the political entity called China. The "Chinese Language Family" seems to avoid the firestorm that "The Family of Chinese Languages" would evoke, while suggesting both the underlying unity and the actual diversity.
There's a subtle distinction between "Chinese languages" and "Languages of China". Perhaps the latter could be moved. The former is separable from the geopolitical entity. --Jiang
[Nohat] I moved Chinese dialects to Chinese languages, but Jiang moved it to Chinese spoken language. That's fine for now, as the coverage there is fairly neutral on the language/dialect issue; however I don't think the name is ideal because it doesn't correspond to the standard used for other language families ([[XXX languages]]), and it uses the singular "language".
Obviously, I don't have a background in linguistics like you, so forgive my ignorance. But is this not a conventional language family? Articles on the other language families treat the languages as a whole. This article is only talking about the spoken language. People would be expecting to find info about written Chinese in "Chinese languages".--Jiang
[Nohat] Furthermore, this page remains troublesome. It is still written from the POV that all of Chinese is somehow a single language. I think that if people come to a page called "Chinese language", then the first thing they should see is a discussion of how the very idea of a single "Chinese language" is a contentious issue. Jiang says "If there is a single "Chinese language", then it is the written language", which is fine as a logical implication, but I disagree with the premise. I don't think there is such a thing as a single "Chinese language" (whereas I think Jiang, and probably many (other?) Chinese people disagree), and the page titled Chinese language should handle this disagreement in a neutral way. This is why I proposed before that this page be a short page, containing just a discussion of the disagreement about whether or not there is such a thing as a single Chinese language, with links to other pages, which I guess could be called Chinese written language and Chinese spoken language, although I don't think those names are ideal.
I don't see why this page shouldn't discuss the controversy and the written language. There's no need to split unless there isn't enough room.--Jiang

There is a good word which refers to the Chinese language as a family of languages, and it is Sinitic. If the page is to discuss only spoken languages of the Sinitic family of Chinese languages, then Sinitic seems to be a good choice IMHO.

This article definitely needs a disambiguation page. Cantonese is NOT a dialect of Chinese--it is a separate language altogether, albeit in the spoken sense only. Dialects are defined as variations on a language wherein the speakers are mutually intelligible to each other. Being from Beijing, and speaking the Beijing dialect (Beijing is not standard Chinese either--they tend to add an "r" sound onto the end of words, sort of like how Bostonians drop "r" sounds) I can safely say for most speakers of Mandarin Chinese that we cannot understand speakers in the Guangdong province at all. It basically sounds like Vietnamese to us. For this reason, I believe that Cantonese and Mandarin should be on separate pages.

I also like the idea of pages for specific dialects and sound samples of dialects. I've always wanted to do that to show the diversity in language of China, but because I live in the United States that has been difficult. And just to clear up some other things, speakers a little bit south of the Yangtze river can usually understand pretty well what people in the north are saying. However, once you get below the Zhejiang province or the Guizhou province things begin to get difficult. For instance, a speaker of the Sichuan dialect (very individual and easy to recognize) can usually understand a speaker of the Beijing dialect as easily as a New Yorker might understand a New Orleans native or as easily as a Londoner can understand someone from Birmingham etcetera. Just to clear some things up.

We already have separate pages: Mandarin (linguistics) and Cantonese (linguistics)
This page exists to clarify the relationship among the different Chinese languages and to discuss the common written language. --Jiang 03:20, 29 Apr 2004 (UTC)
Jimmy Jin is correct about Cantonese not being a "dialect" of Chinese. There are probably several places where "dialect" has been misused in this way. I'll try to watch out for these places and make them consistent. P0M 15:16, 29 Apr 2004 (UTC)
I disagree. "Dialect" is not defined in this way among Chinese linguists and laymen. In general, many Chinese people define "language" in ethnic and cultural terms rather than purely linguistic terms, and consider Chinese to be a single language regardless of intelligibility. Also, Western linguists do not use "language" or "dialect" consistently either: Danish and Swedish are called different "languages" although they are mutually intelligible. And there are other examples: Croatian, Macedonian, Belarusian etc. that are loaded with political and ethnic overtones. Take a look at dialect -- it goes into that in detail.
This article was written initially to be as NPOV as possible, and gives equal time to both conflicting views, as well as explaining their origins, motivations, and implications. I think it would be more helpful if we used neutral terms, like "subdivisions", and kept "language" and "dialect" only in contexts where they are explicitly explained as subjective terms.
As for the organization of articles: as Jiang said, this is basically a page to clarify what Chinese is (or rather, what people think it is). There are also separate articles for Mandarin, Cantonese, etc.
What does Indonesia have to do with Mandarin? (The majority of Chinese people in Indonesia don't speak Mandarin, as I observed.) W3bu53r

I don't understand your objection, perhaps because I don't know where you "observed" that most Chinese people in Indonesia don't speak Mandarin. All the article says is that Indonesia is one place where Mandarin is "spoken." Maybe the article could be improved by discovering what percentage of the population consists of Mandarin speakers in the various nations of the world and then finding a cut-off point below which names wouldn't make it to our list. Outside of China, Malaysia has to have a relatively large concentration of people who can at least speak Mandarin as a second language. Thailand also has a fairly high percentage of people who can speak Mandarin even though most people who speak Mandarin there seem to have Chaozhou hua as their first language. But I don't know what the statistics are. In pure numbers, not percentage of population, Vancouver, NYC, etc. have to be pretty high on a list of places with high concentrations but there may be higher absolute numbers in Indonesia than in, e.g., Canada. P0M 14:44, 15 Sep 2004 (UTC)
Well, I'm sure the percentage would be very small. If you recall that back in the 1960s Suharto banned almost everything related to Chinese culture in Indonesia, you'll know what I mean. Indonesia may possess a bigger Chinese population than Canada, but the percentage of Mandarin speakers among Chinese Indonesians is relatively small. W3bu53r

The line at the top

I've added a line at the top. It looks kinda out of place right now but let me explain why I've done this.

It's because today, I was wondering, how navigable exactly are the Wikipedia pages on Chinese language, Mandarin (linguistics), Standard Mandarin, Pinyin, etc? So I assumed that I was a clueless visitor who is, say, trying to find out how to pronounce "Chongqing", "Guangzhou", "Xiamen", and, basically, any Chinese name. Now being the clueless visitor that I am, I typed "Chinese language" into the search bar, and came here, only to be blasted in the face with a wealth of abstruse information that I don't need, and worse, doesn't get me to where I want to go. Then I assumed that I was incredibly insightful (or lucky), and somehow managed to click on Mandarin (linguistics); then I scrolled down to the "phonology" section, and followed the link to standard Mandarin; and then from there there was a link to pinyin which would explain how to pronounce "Chongqing", but I would have probably missed it anyways, the hapless wanderer that I am, because I wouldn't know what "pinyin" is, or that "Chongqing" is written in "pinyin", or that pinyin would tell me how to pronounce things.

Typing "Chinese pronunciation" into the search bar and then clicking "search" instead of "go" didn't help much either, in case you're wondering.

One sentence in the current version of this article does not make good sense: "Chinese (under Westerners) make a sharp distinction between Written language (wen/文) and Spoken language (yu/語)." It sounds like the sentence means "Chinese people under the rule of Westerners make a sharp distinction..." I'm pretty sure that meaning wasn't intended. My guess is that the sentence is intended to convey the idea that "under the influence of Western concepts of language, Chinese scholars make a sharp distinction..."

I find it hard to believe that the generations of Chinese who wrote and read wen2 yan2 wen2 were unaware that there are significant differences between the ways meanings are represented in those two modes of communication. I remember one of the teachers at the Inter-University Center (on the campus of the National Taiwan University) remarked with obvious awe that "Zhou1 Long2 just reads wen2 yan2 wen2 without translating it into bai2 hua4!" It took Hu2 Shi4 and others of his time a great deal of time and energy to convince others that it was intellectually proper to write in vernacular Chinese. (His vanguard article on this subject was written in wen2 yan2 wen2.)

Hu2 Shi4 is one of the few people who have actually succeeded in writing true vernacular Chinese. Most authors have opted for (1) writing a simple form of wen2 yan2 wen2 (e.g., Qian2 Mu4), (2) writing a combination of literary and vernacular Chinese (ban4 bai2 ban4 wen2) (e.g., Yin1 Hai3 Guang1), or (3) (in the case of most current authors) writing a form of vernacular Chinese that uses far fewer characters than would be needed to "write Chinese as it is spoken", which was Hu Shi's original objective. In the last decade or two, teachers of CSL classes have recognized the need to teach non-Chinese students to read this sparse written form and have called it "shu1 mian4 yu3." The 1920s dictionary prepared by the Ministry of Education has an entry for "shu1 mian4" ("refers to using written means to convey [meaning]"), but no entry for "shu1 mian4 yu3."

I question whether there was any need for a "Western influence" to convince Chinese educators that their elementary- and secondary-school students need to be prepared to read and write "shu1 mian4 yu3." A quick look at the vocabulary taught in elementary school textbooks produced at least as early as the 60s will show that the Ministry of Education in Taiwan (and surely on the mainland as well) was intent on teaching students the vocabulary that they did not already use in everyday speech -- with an eye to their being able to read the Chinese equivalents of the New York Times and college textbooks. P0M 11:57, 6 Jan 2005 (UTC)

This is definitely a typo, and the intended word was almost certainly "unlike" rather than "under". (After all, the entire paragraph before talks about the differences between Europe and China.) -- ran (talk) 14:02, Jan 6, 2005 (UTC)

Written Chinese

Why does the Written Chinese section basically repeat what is written in the Chinese written language entry? It already has a link to it near the heading. I suggest that the duplicate content be removed and be replaced with a shorter summary, otherwise we may end up having divergent content on the two entries. If users want more information, they can click on the link. What do people think?

It doesn't look that long to me. There's room for expansion at Chinese written language. The lead section here is horrendous though. --Jiang 11:59, 6 Feb 2005 (UTC)
Hmm... I guess it is not too bad, but I think it might be better to remove the list near the beginning and just have the section start with "The relationship between..." I don't think the list adds much to the section itself. I agree that the lead at Chinese written language is pretty bad. I might make some improvements upon it. --Umofomia 12:32, 6 Feb 2005 (UTC)

French and Spanish as examples: inadequate

This diversity in spoken forms and commonality in written form has created a linguistic context that is very different from that of Europe. For example, in Europe, the language of a nation-state was usually standardized to be similar to that of the capital, making it easy, for example, to classify a language as French or Spanish. This had the effect of sharpening linguistic differences. A farmer on one side of the border would start to model his speech after Paris while a farmer on the other side would model his speech after Madrid. Moreover, the written language would be modelled after the dialect of the capital, and the use of local speech or mixtures of local speech would be considered substandard and erroneous. In China, this standardization did not occur.

I consider the example of the French-Spanish border inadequate.

  • Firstly, because it was not until way after "languages commonly deemed as different from Latin" emerged that France and Spain shared any border whatsoever. This happened at the middle/end of the Middle Ages and then, on the southern side the Kingdom of Navarre and Aragon bordered other independent Lordships/semi-states/whatever in the North (actually, Aragon, or to be more precise, the Catalan area of the Kingdom of Aragon, encompassed both sides of the Pyrenees, and I guess the same could be said of Navarre). Then, the languages spoken were Basque on the Western end, Occitan in the North and Catalan and Aragonese in the South of what nowadays would be the border.
  • Secondly, because the farmers in the area did maintain their own languages (that is, neither French nor Spanish) until quite recently (way after France and Spain began to exist with their current borders). In fact, although Aragonese did follow the aforementioned trend and became blurred as Spanish and Occitan was nearly wiped out (post-French Revolution), Basques and Catalans continue to (stubbornly) speak their own languages.

I must admit, though, that the paragraph at hand did clarify the idea to me (a European). And the fact that local speeches and standards were seen as wrong is undeniable: schools prohibiting patois in France, etc. There's only the small fact that I think the example is wrong because languages other than Spanish and French existed already, not merely dialects, thus what happened was not a rupture of a dialectal continuum but the imposition of "foreign" languages altogether. However, I concede that, as a Catalan, I might be biased when dealing with the subject at hand. I would suggest changing the "frontier" between the Romance dialects becoming languages to that of Spain and Portugal, where (as far as I know) no other language was "crushed" in between (except for its northern part, where Galician stands). The frontier between Italy and France would be equally wrong because of Franco-Provençal and Occitan. Maybe some other borders concerning other family groups would also apply, I don't know (Dutch-German, Bulgarian-Serbian and, nowadays, Serbian-Croatian...). If no objections are made, I'll change it in a few days. -- 00:02, 15 Jun 2005 (UTC)(Jahecaigut)

I think the paragraph quoted from the article may be too idealized, but it serves to point out how the existence of nation-states in Europe affected the definitions and the standards that apply to languages. I'm told that Swedish and Norwegian are not greatly different even though they are affirmed by the national governments involved to be different languages. The central government in France is, I believe, pretty active in defending a standardized form of that language. The natural course of languages in Europe would have been for a continuum to form everywhere that Latin had been the standard language. That would have left out Basque, Hungarian, German, etc., etc. even in places where they existed side by side with the Romance language(s). The Romance languages would have diverged, but without the imposition of "centripetal" forces at the political centers of the nation states to create sharper linguistic boundaries than would otherwise have existed.

It would be interesting to know what happened to languages during the Three Kingdoms period. Usually China has not been divided into separate national entities, so the pressure to conform to the standards of competing nations has not been present. (Guess why Winston Churchill could speak American English as well as his mother but used English English in everyday life.) What is a bit hard to understand is why, in the past, getting Chinese in the provinces to speak the language of the national capital was not easy. P0M 00:51, 15 Jun 2005 (UTC)

  • To answer the question above: today's Chinese dialect boundaries actually correspond VERY WELL with political boundaries during the Five Dynasties and Ten Kingdoms era over a thousand years ago. Wuyue Kingdom 吴越国's boundaries are almost exactly those of today's Wu dialects; Wu Kingdom 吴国 and Southern Tang Kingdom 南唐国 are today's Jianghuai Mandarin and Gan dialects; Later Jin Dynasty 后晋代 is today's Jin dialect; Min Kingdom 闽国 is today's Min dialects; Later Tang Dynasty 后唐代 is today's Northern Mandarin dialects; Chu Kingdom 楚国 is today's Xiang dialect; Southern Han Kingdom 南汉国 is today's Cantonese.

See map of the Five Dynasties and Ten Kingdoms:


Dialect groups today:


Madrid/Paris European languages analogy

The article states:

For example, in Europe, the language of a nation-state was usually standardized to be similar to that of the capital, making it easy, for example, to classify a language as French or Spanish. This had the effect of sharpening linguistic differences. A farmer on one side of the border would start to model his speech after Paris while a farmer on the other side would model his speech after Madrid.

I understand what we're trying to say, but I think this analogy demonstrates better that its author had a simplistic understanding of European linguistic variation than its intended meaning. It wouldn't hold well in the best of cases, but the choice of Paris/Madrid is particularly problematic.

First of all, many people on the French/Spanish border speak Basque, which is completely unrelated to either French or Spanish (or indeed, any European language). Leaving the Basque speakers alone for a moment, roughly half of Spain speaks Castilian (the language of Madrid), with the other half speaking Catalan (the language of Barcelona). These two are certainly not the same language, as anyone who speaks one but not the other will cheerfully tell you.

Furthermore, up until very recently, pretty much all of southern France spoke Occitan, la langue d'Oc, while Parisians and other northerners spoke la langue d'Oil, which is what is typically called French. Occitan is still widely spoken, but linguists agree that the language is dying, not because of political standardization but because of media.

In Pas-de-Calais, and much of the north, old people still speak Chti, which while clearly a dialect of Oil is pretty much unintelligible to non-locals. Then there's Breton, spoken in Bretagne, which is actually similar enough to Welsh that Welsh speakers can understand it. Needless to say, French people cannot...

The truth is, linguistic homogeneity in Europe (and indeed, anywhere) is very much a recent phenomenon, tied to urbanization, standardized education, and most recently, radio and television. It has nothing whatever to do with where the capital is: any German speaker will tell you that Germans from Berlin have an extremely strange accent, and will point to Hannover/Bremen if asked for an example of where standard German is spoken -- and even there, Plattdeutsch (Low German) is spoken by older locals.

My point here is that this attractive analogy vis-a-vis Europe is wrong. While 20-somethings with urbanized grandparents may not believe it, as recently as the second world war pretty much every town in Europe had, at the very least, its own dialect. Much like China today.

China, like its European counterparts, is experiencing the slow death of its local languages, as standard Beijing Putonghua is introduced in schools and on TV. And even standard Beijing putonghua, not surprisingly, isn't standard -- Beijinghua can be so heavily retroflexed that non-locals fail to understand it, and that's just phonologically. Grammatically, it's different too. But even in the capital, young people are speaking more and more standard Mandarin, and less and less of the patois spoken by their parents. This is the cumulative effect of standardized education and the exclusive use of standard Mandarin on TV and on the radio.

I currently reside in Shanghai, and while I do not speak Shanghainese, I am witness to its dwindling importance. The influx of waidiren has forced the locals to increase their use of Mandarin, to the point where, despite the fact that I cannot speak the language, I can hear the difference between the way young people speak (excessively Mandarin-ified) and the way old people speak (more distinctive nasals, less isolating tonality, and different vocabulary). While the Shanghainese, who are very proud, do their best to preserve this aspect of their culture, most will admit that children born in the city often do not speak the language well, if at all. Consider: they are instructed in Mandarin at school, they hear Mandarin on TV and on the radio -- Shanghainese is China's answer to Occitan.

Yue and Min (or at least, Minnan) will likely fare better, as Hong Kong and Taiwan, respectively, seem to value their continued use. Interestingly, Wu (of which Shanghainese is a dialect) is the second most spoken Chinese language, after Mandarin -- but its lack of representation in overseas Chinese communities sees it, at least in my estimation, eventually succumbing to the convenience of a Mandarin speaking world.

Getting around to the point, I think we should change this paragraph. Thoughts?

Thanks for all of this information! ;) And yes, we should change the paragraph in this case. -- ran (talk) 20:09, Feb 20, 2005 (UTC)

The sentence "China ... maintained a common written language throughout its entire history, despite the fact that its actual diversity in spoken language has always been comparable to Europe" is confusing. Is it talking about Chinese or about the languages of China? Chinese maintained a common written language; but China did not (consider, e.g. Manchu). Chinese is not as diverse as the languages of Europe (consider French, Hungarian, and Russian); but the languages of China are. --Macrakis 02:59, 13 May 2005 (UTC)

Link to wikipedia?

I can't see a link to the Chinese language wikipedia. Most languages have one; maybe m.e. 10:34, 16 Feb 2005 (UTC)?

mountains are not the main reason

Mountains are not the main reason for greater linguistic diversity in South China. Rather, it is because that is where the language family originated. It developed for thousands of years there before spreading to North China. A language family tends to have greater diversity in an area where it has been longer, and less diversity where it has spread more recently. --Erauch 05:16, 20 Feb 2005 (UTC)

Chinese languages originated in North China. The language of the oracle bones, Zhou Dynasty poetry, Confucius, etc. is clearly an ancestor of modern Chinese languages. Chinese languages then spread southwards, displacing whatever aboriginal languages were spoken in southern China (which may have been Tibeto-Burman, Austronesian, Tai-Kadai, Mon-Khmer, etc. or all of these) and forming the southern Chinese languages / dialects.
It is generally presumed that North China has less linguistic diversity because of frequent population movements, caused by war and turmoil and facilitated by a relatively flat topography. If none of these had happened, then the North Chinese dialects would certainly be much more diverse than the South Chinese dialects. -- ran (talk) 20:14, Feb 20, 2005 (UTC)

Where are you getting your information from? It is agreed upon in every Chinese history book I've seen that Chinese civilization, which includes the writing and language, started in northern China. It was not until the Tang Dynasty that the south was fully sinicized. This is actually why southerners typically refer to themselves as Tang people (唐人) rather than Han (漢人). If you have sources for your claim, I would be interested in seeing them.
BTW, I think you may also be confusing diversity with divergence. The southern dialects, while more diverse, are actually more conservative with respect to the language features that are retained from Middle Chinese. Northern China, while less diverse, has diverged much more linguistically from Middle Chinese. -- Umofomia 00:05, Feb 21, 2005 (UTC)
"For instance, Wuzhou is a city that lies about 120 miles upstream from Guangzhou, the capital of the Guangdong province in the south."

Wuzhou is far away in Guangxi Province (it's very close to the border between Guangdong and Guangxi though). It couldn't be just 120 miles away from Guangzhou. --Eternal 15:34, 19 May 2005 (UTC)

This map generator claims it is 201 km (as the crow flies), which is about 120 mi. (But doesn't Wikipedia use metric measurements normally?) --Macrakis 16:46, 19 May 2005 (UTC)
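As a quick check on the conversion quoted above, here is a minimal Python sketch. The function name is purely illustrative, and the 201 km figure is simply the straight-line distance reported in the comment; nothing here verifies the map generator itself.

```python
# One international mile is defined as exactly 1.609344 km.
KM_PER_MILE = 1.609344


def km_to_miles(km: float) -> float:
    """Convert a distance in kilometres to statute miles."""
    return km / KM_PER_MILE


# Straight-line Wuzhou-Guangzhou distance quoted in the comment above.
distance_km = 201
print(f"{distance_km} km is roughly {km_to_miles(distance_km):.1f} mi")
```

By this arithmetic 201 km comes out a little under 125 mi, so "about 120 miles" is a loose but not unreasonable rounding for a crow-flies distance; the distance along the river would of course be longer.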

Single written language?

It seems more accurate to say that the speakers of the various Chinese languages use a written form of Mandarin as a written language, rather than to say there is a single "Chinese Written Language". The written language uses Mandarin syntax, grammar and word order, and if you try to read it as Cantonese (as people do in Hong Kong), it is perhaps intelligible but invalid, in the same way as if I were to take German text and replace all the words one-by-one with their English equivalents. And since the 19th century, there has been a true written Cantonese language, with Cantonese rules, used in poetry and literature. --Erauch 20:26, 25 Feb 2005 (UTC)

Mainland China link

I removed a link to mainland China because it occurred immediately after a link to the PRC. Mandarin is an official language across all of the PRC, including the two SARs. It is superfluous to say " PRC in MC ".

Instantnood reverted this edit, apparently preferring the superfluous link to his pet term. Please validate your opinion here on the talk page before you revert again. SchmuckyTheCat 02:52, 14 Mar 2005 (UTC)

According to the basic laws of the two special administrative regions, Chinese is one of the official languages. The basic laws do not say whether Mandarin is an official language. Both Mandarin and Cantonese are de facto official. — Instantnood 07:01, Mar 14, 2005 (UTC)
  • I have been using your own comments in Talk:Hong Kong as the basis for this edit. SchmuckyTheCat 15:48, 14 Mar 2005 (UTC)
Mandarin is only de facto in Hong Kong and Macao. The basic laws state Chinese is official, without stating which spoken variant. — Instantnood 10:46, Mar 15, 2005 (UTC)

Xiang and Pinghua

Relations with Hanja and Kanji

I read from some discussions that all Han spoken variants split off from the proto spoken Chinese (Middle Chinese?) with the exception of the Min group. The pronunciation of some Hanjas (Sino-Korean vocabularies) and Kanjis sound very alike to at least one of the Han spoken variants. When did the pronunciation of Hanjas and Kanjis split off from the proto spoken Chinese? — Instantnood 11:33, Mar 19, 2005 (UTC)

The history of the borrowing of han4 zi4 by the Japanese is a little complicated. Characters were incorporated into Japanese at different times, and the pronunciations associated with them depended on the time and on the place in China that they were borrowed from. Sometimes a kind of parallel evolution makes Japanese pronunciations sound like pronunciations in regional Chinese languages that were not associated with the original transmission. Korean transmission may well be more complicated, since it could involve frequent overland borrowings and also borrowings by way of Japan. P0M 17:03, 19 Mar 2005 (UTC)
I know there could be several pronunciations for the same Kanji, but this is not the case of Hanjas. Each Hanja normally has only one pronunciation. When did the pronunciation of Hanjas split off from proto spoken Chinese? — Instantnood 20:27, Mar 20, 2005 (UTC)
I guess that by the 'hanja' splitting off you mean the pronunciation of characters in Korea becoming different from that of 'proto spoken Chinese'. That's a hard question to answer, but judging from the way Korean retains certain initials from the Qieyun, the split could not be later than that. In any case, spoken Chinese would have differed from place to place across the early Chinese-speaking areas, as regionalects, but this wouldn't have been much commented on. Whether people travelling between Korea and China learned their hanja pronunciation from the Chinese capital isn't truly known, as they could just as well have learnt a variety of spoken Chinese closer to the Korean border. As I understand it, there were plenty of diplomatic missions from Korea to China, and it would not surprise me to hear that the pronunciation of Chinese characters was changed over time to accommodate newer pronunciations. This is the case in Japanese, where there are sets of readings for certain purposes (Buddhist chants being one). Dylanwhs 08:25, 3 Apr 2005 (UTC)
Thanks. But unlike Kanjis, most Hanjas do not have more than one pronunciation. — Instantnood 14:18, Apr 4, 2005 (UTC)

We need pronunciation files. Good quality ones. My Mandarin pronunciation is good enough for, say, pronouncing the tones, but I'm sure we can find native speakers with good recording facilities. How about we come up with a simple sample sentence, and then do our best to find native speakers of four or five dialects to pronounce them? Three is an absolute minimum, I think. Peter Isotalo 13:37, Mar 28, 2005 (UTC)

Ok, we have two recordings of the following sentence "我家有两成人和一个小孩". One in Standard Mandarin and one in Hangzhou dialect (Wu). Thanks to Znode for the samples.
Hangzhou dialect (audio)
Standard Mandarin (audio)
Znode says he has a slight problem with pronouncing "l"s, so these should be seen as tests for now. We can always rework the sentence to bypass the problem. But to me they sound very good. If someone could offer a Cantonese rendition, it'd be great. Peter Isotalo 22:49, Mar 31, 2005 (UTC)
I had a go at a Beijing rendition of the same sentence, but it should only be used if everyone's OK with a non-native pronunciation or if we don't find a native speaker of Beijing dialect.
Standard Mandarin (audio)
Peter Isotalo 23:19, Mar 31, 2005 (UTC)
It's actually "我家有两个人和一个小孩", but that is awesome Chinese for an on-own-interest non-native speaker. I am awed :). "趁" means "taking the advantage/chance of...", as in "趁人不注意, 他..." ("when people are not noticing, he..."). So it's a "2" sound, not a "4". -- Znode 01:12, 2005 Apr 1 (UTC)
For the other dialects, should they say the sentence using the same exact characters or rather say the sentence how one would normally say it? For instance, Cantonese does not typically use 和 and 小孩 in speech. Should a Cantonese recording still say 和 and 小孩 or should it substitute the colloquial words instead? --Umofomia 01:49, 1 Apr 2005 (UTC)
Thank you for the compliments, Znode. And thank you very much for the input on my other pronunciations at my user page. Did I do a 4 instead of a 2 on chéng in this one, though? (And I corrected my sentence.)
Colloquial seems better to me, Umofomia, but I think we should leave that up to the judgement of the person who will pronounce it. Seems wiser to let a native decide on that one. Peter Isotalo 03:30, Apr 1, 2005 (UTC)
I can probably make a Cantonese recording then, however it will have to wait until I get over my cold. I'll post it here after I do it. BTW, what are some good programs for making Ogg recordings (on a Windows machine)? --Umofomia 05:14, 3 Apr 2005 (UTC)
If we want to show off the beauty of the Chinese language, we must surely include a recording of Gan! This is a female native speaker of Wannianese, saying "我家le有两个大人一个小大人". Neither of us knows how "le" is written, but it's a very common nominal suffix there.
Wannianese (audio)
Is the sound quality acceptable?
Are you sure you've got the "le" in the right place? I suspect the speaker is saying, "Wo x jia you...", and the x is either a stand-in for "men" or for "de". (The third syllable is stressed, "luh". There is a character, pronounced lu4 in Mandarin, that means "hut," a humble way of referring to one's own house, I suppose. By my guess the sentence, read in Mandarin, would be: "Wo3 de/men lu4 you3..." Northern Mandarin speakers are generally comfortable with "wo3men jia1," but it seems to me that most other Mandarin speakers prefer "Wo3men de jia1." And in Taiwanese the "wo3men" appears as a single-syllable "ngwan".) If you are right, however, the sentence could be the equivalent of something like "Wo3 jia1men0 you3..." Your native speaker should be able to help sort things out by finding a context where "men" would be appropriate and where "de" would be appropriate and going from there. P0M
The sound quality seems o.k. to me, but the volume is too low for my poor old IBM portable. I'll see whether I can do something with it on my Mac unless you can just put it back in your software and crank up the sound level. Try to make it as loud as you can without having it getting overmodulated, i.e., without having the sound start to break up at the higher volume parts of the passage. P0M 17:21, 3 Apr 2005 (UTC)
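The advice above (as loud as possible without overmodulation) corresponds to what audio tools commonly call peak normalization. Below is a minimal Python sketch of the idea, assuming the recording has already been decoded into floating-point samples between -1.0 and 1.0. The function name and the 0.9 target are illustrative assumptions, not settings taken from any particular program.

```python
def normalize_peak(samples, target_peak=0.9):
    """Scale samples so that the loudest one reaches target_peak.

    samples: iterable of floats in [-1.0, 1.0] (assumed already decoded).
    target_peak: hypothetical headroom setting; staying below 1.0 avoids clipping.
    """
    samples = list(samples)
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return samples  # pure silence: there is nothing to scale
    gain = target_peak / peak
    return [s * gain for s in samples]


# A deliberately quiet toy signal standing in for a low-volume recording.
quiet = [0.0, 0.05, -0.1, 0.08]
louder = normalize_peak(quiet)
```

Because a single gain factor is applied to every sample, the relative loudness of the different parts of the sentence is unchanged; a recording that is loud in the middle and quiet at the ends would still need to be re-recorded or compressed separately.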

It sounds quite promising, but you need to get the speaker much closer to the mic and to have her speak quite clearly. Make a few attempts and listen to it yourselves a few times (preferably through headphones).

Umofomia, try Audacity. A very handy (and very free) recording tool. Peter Isotalo 18:02, Apr 3, 2005 (UTC)

Thanks for the feedback. I'm pretty sure about the analysis. She is my wife, and I've been listening to this dialect for a long time. The "le" thing shows up all the time, always at the end of nouns — "shoe" is something like "hale", "eggplant" is something like "loksule" (I believe this is cognate to the Shanghainese word).
Except that I did make a mistake! It's "屋le" not "家le". "家" is pronounced more like "ga". Sorry about that. P0M, does that affect your analysis?
Actually, all I have is some guesswork. In the current version the "le" sounds a little less emphasized to me, but maybe that is subjectivism at work. From what you say I would now suspect that it is a diminutive and/or nominalizing noun ending like the zi in hai2zi. In Taiwanese they use another sound and write it with another character, the zai3 that appears in niu2 zai3 ku4 (cowboy jeans). But the dictionary reading of that character for Fujian, according to the old Giles dictionary, which is the only dictionary I know to have all this dialect stuff, gives ju as the normal reading of that character. That suggests that there isn't a "real" character for that sound, and they wanted to differentiate from zi3 used in Mandarin, so they borrowed another available character. Unfortunately, Giles doesn't give a way to look up "le" as a pronunciation in some dialect. So until somebody can think of a "lu" or "nu" diminutive ending we are stuck. Since Taiwanese already uses "zai3" for this purpose, meanwhile disregarding the original pronunciation, that might be a good interim choice. P0M 20:40, 3 Apr 2005 (UTC)
(BTW, I've taken the liberty of replacing the "(?)" that I originally used with "le", in both my previous comment and in P0M's comment, thinking this might be more informative to the reader.)
I will try to fiddle with the sound software, but I'm not very familiar. The speaker's mouth was about 20 cm from the microphone, and for her, she was speaking quite clearly. I'll try to get her to do it again, but my wife is almost the quietest person I've ever met.
Pekinensis 18:36, 3 Apr 2005 (UTC)
The mic needs to be pretty much right next to her mouth, preferably resting on her chin (that's how I record my samples). To minimize the effect of aspiration and fricatives, I just put a sock over the mic (the thicker the better). Peter Isotalo 18:55, Apr 3, 2005 (UTC)
Okay, I'll try again under those conditions when she comes home, but I might not get a sample if she questions why I have time for this when I didn't have time to go buy vegetables today. For the moment, I have amplified it a bit, and eliminated the click. I tried to use the noise-reduction function, but the results were poor.
Pekinensis 19:01, 3 Apr 2005 (UTC)

Okay, I made a Cantonese sample now. The sentence spoken in it is: 我屋企有兩個大人同埋一個細路。 (Originally: 我家庭有兩個大人同埋一個仔女。)

Cantonese (audio)

I found that some of the tones in the Mandarin sample were a bit off, so I attempted my own rendition. Note that I'm not a native Mandarin speaker though. The sentence spoken in it is: 我家裏有兩個成人和一個小孩ㄦ。

Mandarin (audio)

I'm still suffering from my cold, but I think the samples are still okay. I can rerecord them later if you think they need changes. Any comments are appreciated. --Umofomia 20:50, 3 Apr 2005 (UTC)

Or try:
Mandarin (audio) 金 (Kim) 04:34, 5 Apr 2005 (UTC)
Hey Umofomia isn't the colloquial form 我屋企有兩個大人同埋一個細路(仔)? — Instantnood 09:24, Apr 4, 2005 (UTC)
I had considered making a completely colloquial rendition, but I also wanted to balance it with trying to make the sentence as close as possible to the original so readers could see the cognates (such as 家裏 vs. 家庭). I also decided not to use 細路 because that seemed too colloquial (since 路 in this context isn't really the same word as the word for "road," but only borrowed for its sound). What do you think? Should I use the even more colloquial version? Or should we perhaps reconsider the sentence we are using as an example so that it better illustrates the similarity among the different varieties? --Umofomia 18:19, 4 Apr 2005 (UTC)
When I elicited my sample, I made a point of asking in English, to minimize the influence of Mandarin. — Pekinensis 18:23, 4 Apr 2005 (UTC)
Perhaps we should do it in two ways. One section for comparing the pronunciations of the same vocabularies, and the other section for comparing the differences in how people say the same thing in different spoken variants. — Instantnood 10:02, Apr 5, 2005 (UTC)
I thought about this a bit and decided to go with the fully colloquial rendition. The link above now has 我屋企有兩個大人同埋一個細路。 --Umofomia 06:04, 6 Apr 2005 (UTC)
I can provide a sample of Hakka (from Hong Kong) if anyone's interested. I've tried two ogg encoders, one "oggenc.exe" is command line driven, the other WinVorbis runs under windows, and WinVorbis seems to make smaller .ogg files after converting the same wave file. Dylanwhs 20:39, 4 Apr 2005 (UTC)
Hong Kong Hakka dialect (audio)
/Na11 vuk3 kha33 zju33 liON31 tsak3 thai53 Nin22 thuN11 mai11 zit3 tsak3 sE53 Nin11 tsai31/ Dylanwhs 14:36, 5 Apr 2005 (UTC)
Cool! Do you know the character equivalents of your Hakka sentence? I could recognize 屋企 and 同埋, which I didn't realize were used in Hakka too. --Umofomia 06:19, 6 Apr 2005 (UTC)
Ah I was slightly off... I did some cross-referencing with a few sources and found that vuk3 kha33 is actually 屋下, not 屋企. I was able to figure out the characters for most of the sentence by looking through these sources: 我屋下有兩隻大人同埋一隻細人仔。 Not 100% sure though of course. --Umofomia 05:52, 7 Apr 2005 (UTC)
Actually, that's quite close. The characters for vuk3 kha33 are found in some Hakka reference sources I've seen. The character for "I" (Mandarin wo3) is not 我 (which is pronounced /NO33/ [ŋɔ33]), but one composed of 亻 and 厓; on the internet, however, characters like 崖 have been substituted instead. However, our example here is of the possessive pronoun "my", and in Hakka it is Nai11 ke53, but it can be shortened to /Na33/, which is what I've used here. The corresponding character has been identified in some literature about Hakka as the classical pronoun 吾.

The IPA rendering of the sentence example I gave in the ogg file is : [ŋa11 ʋuk3 kʰa33 zju33 liɔŋ31 ʦak3 tʰai53 ŋin11 tʰuŋ11mai11 zit3 ʦak3 sɛ53 ŋin11ʦai31]. The traditional characters would thus be 吾屋下有兩隻大人同埋一隻細人仔, and the simplified, 吾屋下有两只大人同埋一只细人仔.

Like the Cantonese example, there were so many words in Hakka for child (depending on age, too) that could have been used. If it had been a child as in baby, then you'd've found that each dialect group would have had its own particular regionalism for it. Dylanwhs 16:45, 7 Apr 2005 (UTC)
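The ASCII transcription and the IPA rendering of the Hakka sentence above differ only by a handful of symbol substitutions. The following Python sketch makes that correspondence explicit. The replacement table is an assumption inferred by comparing the two transcriptions in this thread; it is not a complete SAMPA or X-SAMPA table, and the function name is hypothetical.

```python
# Replacement pairs inferred from the two Hakka transcriptions above.
# Assumption: this covers only the symbols that occur in this one sentence.
# Multi-letter sequences come first so that "th" is handled before any
# single-letter rule could touch it.
ASCII_TO_IPA = [
    ("kh", "kʰ"),  # aspirated k
    ("th", "tʰ"),  # aspirated t
    ("ts", "ʦ"),   # affricate written as a ligature in the IPA line
    ("N", "ŋ"),    # velar nasal
    ("O", "ɔ"),    # open-mid back rounded vowel
    ("E", "ɛ"),    # open-mid front unrounded vowel
    ("v", "ʋ"),    # labiodental approximant
]


def ascii_to_ipa(text: str) -> str:
    """Rewrite the ASCII transcription using the pairs above; tone digits are untouched."""
    for ascii_seq, ipa in ASCII_TO_IPA:
        text = text.replace(ascii_seq, ipa)
    return text


print(ascii_to_ipa("Na11 vuk3 kha33 zju33 liON31 tsak3 thai53"))
```

A table-driven conversion like this would also make it easier to spot places where two transcriptions of the same recording disagree on a tone value rather than on a symbol.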

Apart from the recordings, should we present the sentence in a common set of phonetic symbols, say IPA or SAMPA, as well? Comparison would be easier. — Instantnood 10:31, Apr 4, 2005 (UTC)

IPA transcriptions are a good idea, as long as we can get the transcriptions right. We need people who know how to transcribe Gan, Wu and Hakka properly or we'll have to solve it in some other fashion.
However, let's not get carried away with the pronunciation files. We don't need more than five different examples, and only one version of each example. The article is already at 34 k and I think it's best if we don't let it swell to more than 40. Even that is pushing it. We need a nice concise summary of grammar and phonology (with pronunciation files), trim the links and then it's off to FAC.
Dylan, a sample of Hakka would be great. Please submit a test recording. Peter Isotalo 10:44, Apr 5, 2005 (UTC)
See above. Dylanwhs 14:36, 5 Apr 2005 (UTC)


Ok, let's try to do a summary of what we've got. Here are all the recordings so far.


I will try soon. — Pekinensis 23:00, 6 Apr 2005 (UTC)
From what I could make out, the sample in the Wannian dialect sounds like [au51 wuɔ31 lə33 jiu33 tiɔŋ53 tʰai33 ni:n35 ji51 kə33 si:11 tʰai33 ni:n35] Feel free to correct the rendering, especially the tone contours. Dylanwhs 17:13, 7 Apr 2005 (UTC)
That seems pretty good to me, except I would have started off with [ŋa wulə], and transcribed "一" as [i] and "小" as [ɕi]. I'm not sure about the tone contours at all. What do you think? Pekinensis 17:51, 7 Apr 2005 (UTC)
One really needs a better recording. The beginning is clipped off, I couldn't make out the ŋa part, and there is a bit in the middle which is loud, then it quietens significantly before becoming louder again for the end of the sentence. 19:35, 7 Apr 2005 (UTC)
Marital strife will result if I press for another sample right now, but I may have an opportunity by the weekend. Pekinensis 19:59, 7 Apr 2005 (UTC)





I will rerecord the Cantonese sample at a slower pace later today. --Umofomia 16:38, 6 Apr 2005 (UTC)
I uploaded a slightly slower version, but it's only imperceptibly slower... I couldn't really make it much slower than it is now without it sounding really unnatural. --Umofomia 06:04, 7 Apr 2005 (UTC)

We should not get carried away with too many examples. I suggest we pick five of these and present them in the Phonology or Spoken language section. Here's my draft for a table to put them in. Please add any suggestions and IPA to the table.

Five seems like an arbitrary number. If we are picking numbers, seven seems like a very attractive one. Surely we can find a sample of Taiwanese. I don't know any Xiang speakers, but we can always recruit on Chinese wikipedia. — Pekinensis 23:00, 6 Apr 2005 (UTC)
Are you still looking for a sample of Taiwanese? I'm not near a microphone right now, but I can try to record one soon. In the mean time, here's the information I can gather for the sentence (I think it's mostly right, but feel free to correct): IPA, all surface tones: gun55 tau55 u2121 e33 tua21 laŋ35 kap4 tçi32 le33 gin55 a53, Trad: 阮兜有兩個大人及(佮?)一(蜀?)個囡仔, Simp: 阮兜有两个大人及(佮?)一(蜀?)个囡仔 --ian 14:37, 20 September 2005 (UTC)
| Dialect/Regional variety | IPA transcription with subscript tone contours | Traditional characters (?) | Simplified characters (?) | Pronunciation file |
|---|---|---|---|---|
| Gan (Wannian dialect) | IPA | 我屋le有兩個大人一個小大人。 | 我屋le有两个大人一个小大人。 | listen (audio) |
| Hakka | ŋa11 ʋuk3 kʰa33 zju33 liɔŋ31 ʦak3 tʰai53 ŋin11 tʰuŋ11 mai11 zit3 ʦak3 sɛ53 ŋin11 ʦai31 | 吾屋下有兩隻大人同埋一隻細人仔。 | 吾屋下有两只大人同埋一只细人仔。 | listen (audio) |
| Mandarin (Beijing dialect) | wuɔ tɕali jou̯ liɑŋke tʂʰɤŋ ɻən xə ike ɕiaʊ hɑɻ | 我家裡有兩個成人和一個小孩兒。 | 我家里有两个成人和一个小孩儿。 | listen (audio) |
| Wu (Hangzhou dialect) | IPA | 漢字 | 汉字 | listen (audio) |
| Yue (Cantonese) | ŋɔː23 uːk5 kʰei35 jɐu23 lœːŋ23 kɔː33 tɑːi22 jɐn21 tʰuːŋ21 mɑːi21 jɐt5 kɔː33 sɐi33 lɔːu22 | 我屋企有兩個大人同埋一個細路。 | 我屋企有两个大人同埋一个细路。 | listen (audio) |

Recording methods: Some of the recordings have a great deal of noise. It is best to use an external microphone because otherwise you will pick up some noise from your hard drive. The system I use most successfully is a 500 or 600 ohm microphone. (The impedance of the microphone is extremely important. If the impedance is too high or too low, then there will be an impedance mismatch, a common source of problems when connecting radios or microphones to recording equipment.) P0M 19:57, 7 Apr 2005 (UTC)

Are the Beijing dialect and Standard Mandarin the same in the chart? The Beijing dialect article does mention there are some differences. — Instantnood 14:17, Apr 8, 2005 (UTC)

Here is another Mandarin, native speaker, female. If it sounds o.k. I can ask the speaker to re-do it with the correct wording. Sorry, I hurried a bit. (Audio: listen) P0M 01:47, 9 Apr 2005 (UTC)

The recording was of good quality, but I can't help noticing that she left out the "里", "和" or "儿". I've only studied Chinese on a very basic level, but I've gotten the impression that especially the -儿 ending is a very important part of colloquial speech in Beijing. From what Znode has told me, it sounds like she was trying to pronounce it in Standard Mandarin rather than her own dialect. Is this the case here? Has the speaker been instructed to use colloquial forms?
I would very much appreciate another recording. Try to coach the speaker into speaking clearly and fairly slow, but without making it sound formal. The more colloquial the better. After all, we're not writing a pronunciation guide. :-) Peter Isotalo 14:23, Apr 13, 2005 (UTC)

I discussed the sentence in class today. We have 3 native speakers. I gave them the English and asked them to write out the colloquial version. We had a consensus version: Wo jia you liangge da ren yige xiao har. We had one Beijing colloquial version: Wo jia you liar3 da (ren?) yige xiao har. (I forgot to write it down, sorry.) We had one Tianjin version: Wo jia liang da yi xiao. One of the ladies made the recording already provided here. I told them people thought the recording was too fast. They laughed and suggested we have the other young lady make a recording. (She speaks much more rapidly.) Unless you are dealing with a person who has schooled himself/herself to speak slowly without changing the general sound contours of the sentence, you take a very strong risk of getting a very artificial-sounding delivery. I can digitally slow down a recording, but then you may get a basso profundo sound. I will try to get another recording on Friday. Maybe I can coach her to slow down a little, but what you are asking is a little like asking a drawling Georgia girl to speak as rapidly as someone fast from Maine. And note that the only student who used "li3" and "he2" was not a native speaker. P0M 00:17, 14 Apr 2005 (UTC)

The natives decide, for sure. Go for the consensus version at the rate they feel is appropriate. Peter Isotalo 08:54, Apr 14, 2005 (UTC)

I guess the Chinese characters for the Wu recording should be 我家裏有兩個成人和一個小兒 (simplified: 我家里有两个成人和一个小儿) --Joseph Y.C. Choi 21:49, 23 May 2005 (UTC)


I think you're pretty ambitious with the IPA there (Who am I talking to? This section does not appear to be signed.), but it's certainly a noble goal.

I don't know what to do about the "le" in the Gan sample. I really don't know the right character, and I'm not comfortable with making one up. Maybe I'll try writing someone at the department of linguistics at Nanchang University.

We really need a native Beijing speaker.

Pekinensis 23:00, 6 Apr 2005 (UTC)

Sorry, I forgot to sign. The IPA is tricky, but I don't think we can do without it. Just giving ppl the characters is not enough.
And we do need native speakers for all recordings. Can anyone contact someone at Chinese Wikipedia? 50% female speakers would be preferable too. Peter Isotalo 11:36, Apr 7, 2005 (UTC)
See above. If you want it done another way I can try again on Monday. P0M 01:48, 9 Apr 2005 (UTC)

Rank Qualification

In the article it states, next to the rank by number of speakers, "(if considered a single language)". It's my understanding that even if one were to split it up and take only the largest unit (Mandarin) it would still have more than double the number of speakers of the second most commonly spoken language (Hindi), at least according to the statistics here on Wikipedia. Do we need this qualification? siafu 05:23, 30 Mar 2005 (UTC)

Good point, siafu. There's been talk of removing the ranking from the language template altogether, and I think this is a good reason why. English is most likely the number one language, since it has so many second language speakers, and not even all of the Chinese dialects combined can beat that. There is too much uncertainty even among the top 10 languages. I think you should mention this at Template talk:Language and support the suggestion of removing the ranking altogether. It might be better if we stick to just reporting the estimated figures and let people figure out the rankings for themselves. Peter Isotalo 21:10, Mar 30, 2005 (UTC)
How many English-speakers are there actually, including all second-language speakers? — Instantnood 10:35, Apr 4, 2005 (UTC)

"Language", "languages" or "(linguistics)"

If we can argue about whether each spoken variant division is a language or a dialect, and we ended up using "(linguistics)" in titles, then the title of this article is also arguable. If the divisions are languages then this article would be titled "languages". Any opinion? — Instantnood 10:35, Apr 4, 2005 (UTC)

This has as far as I know been discussed at Wikipedia:WikiProject Languages and it's the naming convention they have agreed upon that is being followed here. If you wish to discuss it, you should do it at their talkpage. Peter Isotalo 20:12, Apr 4, 2005 (UTC)
Thanks. Started a discussion at Wikipedia talk:WikiProject Languages#Chinese language(s). — Instantnood 10:15, Apr 5, 2005 (UTC)

Question classification as "subject object verb" language

The article says: "Chinese is a Subject Object Verb language and..." There is no citation to back this statement up, and the article on SOV languages doesn't back it up either.

The Taxonomy of Chinese doesn't have to look like the taxonomy of European languages. Basically, what you find is a tree that starts like this:


I. Topic-Comment Sentences
II. Subject-Predicate Sentences

Many topic-comment sentences look like they were "derived" from Subject-Predicate sentences by moving the object to the head position in the sentence. Put another way, it is often possible to rearrange a Subject-Predicate sentence into a topic-comment sentence by moving the object to the beginning of the sentence. But the topic doesn't have to be a "would-be" object. For instance, "Ma1ma gei3 wo3de qian2, wo3 yi3 jing1 mai3le tang2 le." ("As for the money mother gave me, I have already bought candy.")

All subject-predicate sentences have a subject followed by a verb, not a subject followed immediately by an object. The level of the taxonomy chart below II would be populated by sentence forms such as:
II.A. Subject Copula Noun-complement. (Wang2 Bi4 shi4 ren2.)
II.B. Subject Intransitive Verb. (Wang2 Bi4 pao3.)
II.C. Subject Transitive Verb Object. (Wang2 Bi4 xie3 shu1.)
etc.

Note that II.C. is not: xxxWang2 Bi4 shu1 xie3xxx

The only way you can come close to SOV order is to use a special device that is actually an instance of the more general "substitution rule" PREDICATE = PREDICATEa (to) PREDICATEb (E.g., "Ta1 lai2 fu4 qian2." -- "He comes [to] pay money.") That is the so-called ba3 structure.

"Ta1 ba3 tang2 chi1 diao4 le." -- "He takes-in-hand the candy [to] eat it up."

II.D. Subject TV Indirect Object Direct Object is the Mandarin way to handle, e.g., "He gives me money", but Cantonese uses "Subject TV Direct Object Indirect Object", e.g., "He gives money to me." And there is still no SOV order.

Japanese is SOV, but Chinese is not SOV. P0M 06:39, 5 Apr 2005 (UTC)
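The word-order argument above can be made concrete with a small Python sketch. It assumes the sentences have been tagged by hand with the roles S, V and O; the role labels and the function name are hypothetical, and the tagging of ba3 is precisely the point in dispute, so both readings are shown rather than either being asserted as correct.

```python
def word_order(tagged_words):
    """Return the sequence of S, V and O roles in a hand-tagged clause.

    tagged_words: list of (word, role) pairs; any role other than
    "S", "V" or "O" (particles, markers) is ignored.
    """
    return "".join(role for _, role in tagged_words if role in ("S", "V", "O"))


# II.C from the taxonomy above: Wang2 Bi4 xie3 shu1.
plain = [("Wang2 Bi4", "S"), ("xie3", "V"), ("shu1", "O")]

# Ta1 ba3 tang2 chi1 diao4 le, with ba3 read as a first verb ("takes in hand").
ba_as_verb = [("Ta1", "S"), ("ba3", "V"), ("tang2", "O"),
              ("chi1 diao4", "V"), ("le", "PARTICLE")]

# The same sentence with ba3 read as a mere object marker.
ba_as_marker = [("Ta1", "S"), ("ba3", "MARKER"), ("tang2", "O"),
                ("chi1 diao4", "V"), ("le", "PARTICLE")]

for name, clause in [("plain", plain), ("ba3 as verb", ba_as_verb),
                     ("ba3 as marker", ba_as_marker)]:
    print(name, word_order(clause))
```

On this toy tagging the plain transitive sentence is SVO, and the ba3 sentence looks like SOV only if ba3 is denied verb status; treated as the first verb of a serial construction, the object still follows a verb.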

You're right... I have no idea how that could have been missed... must have been a typo. Chinese is Subject Verb Object, not Subject Object Verb. BTW, if you're interested, there's also a recent conversation in Talk:Cantonese (linguistics)#Verb_Object_Subject about how the SVO order can sometimes appear to be VOS, even though it's really SVO. I think it can be generalized to most of Chinese as well, not just Cantonese. --Umofomia 06:48, 5 Apr 2005 (UTC)
Interesting... I looked at the history of the Subject Verb Object article and saw this edit by Curps with the comment "omit Chinese: not clearcut SVO vs. SOV (see for instance Li & Thompson)". However, I did some Google searching and found a paper that cites Li & Thompson and says Mandarin is SVO, which contradicts this claim. The closest thing to the claim I could find was this, in which it says:
The verb ba was grammaticalized to a preposition to produce a construction of the form S prep O V (Li and Thompson 1974).
However the ba preposition is generally regarded as a coverb due to the serial verb construction feature of Chinese, so the sentence actually still retains its SVO word order. --Umofomia 07:13, 5 Apr 2005 (UTC)
I posted a question on Curps' talk page about this to see if he can explain. --Umofomia 07:26, 5 Apr 2005 (UTC)
Curps replied with the following on my talk page:
I have the Li and Thompson book (which is about Mandarin specifically). What they say is that Mandarin has features that are typical of SVO languages and also features that are typical of SOV languages, and may be (unlike other Chinese dialects) in the process of very gradually changing from an SVO to an SOV language. They do note that complex sentences are almost always SVO.
Anyway, if you wish to characterize Chinese as SVO I guess I have no objections. -- Curps 08:39, 5 Apr 2005 (UTC)
--Umofomia 23:54, 5 Apr 2005 (UTC)
In the linguistic discussions of Chinese that I have attended, it has always been characterized as SVO. While I think that this is a difficult characterization to make, it makes a bit more sense, IMHO, to stick with SVO over SOV (or VOS or SVC or...). siafu 00:19, 6 Apr 2005 (UTC)

Even if you go back to early times, the feature of Chinese that is most prominent on the most basic level of word organization is the frequency with which the topic-comment structure is used. Some textbooks seem to assert that this amounts to an OSV sentence order. I think that is a simplistic way of looking at things, but at least it is a commonly seen de facto order of words. Occasionally someone may say something like, "Ta1 zi4 xie3 de2 hen3 kuai4," rather than "Ta1 xie3 zi4 xie3 de2 hen3 kuai4." But to do that you have to give vocal prominence to the "zi4". Whether "zi4" in that sentence is a kind of topic, or whether the sentence is just one with some missing elements that the listener is expected to understand, it is a fairly rare thing to hear spoken, much less to see in print. Y.R. Chao (Zhao Yuan-ren) goes into this kind of question in fairly great detail in his Chinese Grammar. Anyway, I think the important thing is not to confuse the general reader and cause him or her to expect sentences like, "Gou3 rou4 chi-." It would be much more important to point out the unusual prominence of topic-comment sentences than to engage in speculation about what the language is on the way to becoming. P0M 00:43, 6 Apr 2005 (UTC)

Some argue that topic fronting is still SVO and only appears to be OSV due to left dislocation; in fact, that's what is currently written in Topic (linguistics). --Umofomia 06:14, 6 Apr 2005 (UTC)

There is a big problem with that idea, since the topic of a sentence can be followed by a comment that consists of an independent clause (i.e., a subject, verb, and object). P0M 06:24, 7 Apr 2005 (UTC)

Going easy on article size?

The article seems to be growing quite rapidly, and we still have a lot left to write about. All of it is very good material and you're all doing a very good job of improving the article, but is it possible to shorten, for example, the Loan word section a bit? All those examples, especially all the Japanese ones, just seem too comprehensive. I think we need to strike a proper balance between comprehensive and overwhelming.

What do the rest of you think about size? Peter Isotalo 11:47, Apr 7, 2005 (UTC)

If the article being overwhelming is a problem, we can split the article sections off if necessary, and replace the large sections with smaller overviews here on this page. siafu 18:39, 7 Apr 2005 (UTC)
Since this article is still being extensively worked on, I think for now we should keep it the way it is. Once we have all of the main ideas we want to convey in the article, we can start to split off content into subarticles. I already have plans to split off the morphology section I wrote, but will delay that for now until we get a better idea of what is and what is not relevant. --Umofomia 19:28, 7 Apr 2005 (UTC)



Due to this self-perception of a single Chinese language by the majority of its speakers, some linguists respect this terminology, and use the word "language" for Chinese and "dialect" for Cantonese, but most follow the intelligibility requirement and consider Chinese to be a group of related languages, since these languages are not at all mutually intelligible, and show ranges of variation comparable to those among the Romance languages.

It's far from clear to me that "most linguists" classify Chinese as a family of languages. Most linguists realize how complex the language/dialect distinction is and don't try to shoehorn languages into separate categories.,%20chapter%201.ppt

This link argues that Chinese consists of multiple languages....

Also, it's useful to bring up what Ethnologue has to say ....


A recent change has added the text:

[[Mandarin]] has been the dominent dialect spoken in the East Asian Chinese diaspora.

Actually, I don't think that is true if by "East Asian" the writer means S.E. Asia. The dominant version of Chinese spoken in Thailand, for instance, is (or at least was in 1966) Chao2 Zhou1 regional speech. And in the same year in Malaysia, almost no Chinese spoke Mandarin as their first language. It was, however, the official language of instruction in the schools that were organized around the Chinese language rather than Malay. (At least that is what I recall hearing at the time.)

In general, the "dominant dialect" depends on currents of emigration from China. Different parts of the world became preferred destinations for people emigrating from different parts of China. As far as I know, before recent decades at least the guan1 hua4 qu1 were not primary areas of emigration. But it would be nice to have some well-researched demographical studies to go on. P0M 01:20, 3 May 2005 (UTC)


The loanwords section is mostly about Chinese vocabulary of foreign origin. Should loanwords of Chinese origin in other languages be mentioned as well? — Instantnood 02:30, May 3, 2005 (UTC)

Sure, but a different section would be best. -- ran (talk) 02:38, May 3, 2005 (UTC)

Mutual intelligibility as a standard

Here's a question an anon asked on the Chinese Wikipedia that I think deserves answering: if we're to apply mutual intelligibility consistently as a standard, then wouldn't we have a lot more languages than we purport to have right now?

For example, wouldn't we split Toishanese out of Cantonese? The two southern Wu branches out of Wu? The fringe dialects of Hakka out of Hakka, and some difficult southern dialects of Mandarin out of Mandarin?

The anon is basically questioning the idea of using intelligibility as a standard, and implying (I believe) that it is unworkable to treat Chinese as anything other than a single language.

-- ran (talk) 17:27, May 31, 2005 (UTC)

There are several ways of dealing with this kind of problem. Sometimes people think things are unintelligible but then they get used to regular changes between, e.g., Beijing hua and Yunnan hua, and decide they aren't so unintelligible after all. Sometimes two people are talking and one can understand the other but not vice-versa. That may indicate a better yu3 gan3 on the part of one of the speakers, it may indicate more experience on the part of one of the speakers, it may indicate hearing difficulty on the part of one of them, etc. So the "rule of thumb" standard of intelligibility is somewhat difficult to apply in some cases.

On the other hand, to imply that there is no difference among the tongues spoken won't work either; the problem is how to conceptualize and categorize the tongues in the best way. After that understanding is reached, one can use rule of thumb stuff to explain things in casual discourse.

What we have, always, is language that changes from speaker to speaker and generation to generation. With increases in time and distance, languages become more and more remote from each other. Gradually, a tree-shaped structure takes form. It is all one tree, and yet every leaf on it is different. What is most useful is to see the trunk of the tree, the major limbs that branch off from it, the branches that grow from each limb, the twigs that grow from each branch. Once you've got that structure in mind you can try to find agreeable terms to use to name these different features. P0M 00:23, 15 Jun 2005 (UTC)

Quotation marks in Chinese

Hi, over at the quotation mark article, it says that Chinese uses four pairs of quote marks. Could someone expand on when different pairs are used? Are there usage differences among Mainland, TW, HK, & Macao? —Tokek 23:26, 25 Jun 2005 (UTC)

Hmmm. Do you mean they use four different kinds of quotation marks? Let's see. They use normal English single and double quotation marks. (That's more often seen in simplified Chinese.) Then they use single- and double-thickness 90 degree angle figures. That type of quotation mark is more often seen in traditional Chinese texts. As far as I know there is no "rule" involved. I'm not sure about HK, Macao, Malaysia... My guess is that places doing printing in both English and Chinese, and places printing Chinese text from left to right would use the English style quotation marks more often. You really can't make them work very well when writing vertically. They're just not designed for standing anywhere but above the line of print. P0M 00:57, 26 Jun 2005 (UTC)

“…” and ‘…’ are not regulated punctuation marks of Traditional Chinese; however, since they are easier to input with a keyboard, people often use them instead of 「…」 and 『…』 on informal occasions. -- G.S.K.Lee 02:16, 26 Jun 2005 (UTC)