Wikipedia talk:WikiProject Linguistics

Welcome to the talk page for WikiProject Linguistics. This is the hub of the Wikipedian linguist community; like the coffee machine in the office, this page is where people get together, share news, and discuss what they are doing. Feel free to ask questions, make suggestions, and keep everyone updated on your progress. New talk goes at the bottom, and remember to sign and date your comments by typing four tildes (~~~~). Thanks!

Missing articles or sections?

At WP:MOSCAPS#All caps, we have:

Certain words may be written with all capitals or small capitals. Examples include: ... In linguistics and philology, interlinear glossing of grammatical morphemes (as opposed to lexical morphemes), and transcription of logograms (as opposed to phonograms)

However, grammatical morpheme and lexical morpheme are redlinks, and Morpheme doesn't cover these terms. There is Morphology (linguistics)#Lexical morphology, but the whole "section" is one sentence (though much of the first half of the article is about lexemes, so the section heading may be misplaced or superfluous). Morphology has changed a lot ("nanosyntax"?) since my university days, so I'm not sure how to repair either Morpheme and Morphology (linguistics), or MOS:CAPS to make sense in this regard. Is there simply a better way to explain linguistic use of ALLCAPS in interlinear glosses? Is "grammatical morphology" an imprecise term for Morphology (linguistics)#Morpheme-based morphology? Or is this "grammatical" versus "lexical" split a differently-worded take on Morphology (linguistics)#Inflection vs. word formation? My inability to "just fix it" is a little embarrassing given my minor in linguistics, but it's been a long time and I never was really much into the morphology side to begin with. PS: Should all-caps and small-caps styles be considered interchangeable for this purpose? PPS: Can anyone construct a concise example, or point to an exemplary one already in an article? — SMcCandlish ☏ ¢ >^ʌⱷ҅_ᴥⱷ^ʌ< 10:44, 24 November 2017 (UTC)

@SMcCandlish: I think you're overthinking it. It's just saying that when glossing you don't capitalize the gloss corresponding to a morpheme with a meaning like "dog", but you capitalize standard glossing abbreviations for morphemes that have a more grammatical role, including inflectional ones like PL, derivational ones like CAUS, and independent words like AUX or DET or whatever. See e.g., the IJAL style sheet [1], the Leipzig glossing rules [2], Bauer's "The Linguistics Student Handbook": "In glosses, the translations of lexical items are presented in lower case roman type, while glosses of grammatical information are presented in small capitals." or Macaulay's Surviving Linguistics: "Put glosses of grammatical morphemes into a font which contrasts some way with the font used for glosses which translate lexical morphemes. In the examples above, I've used small capitals for the grammatical morphemes. Others just capitalize the first letter of the gloss, or capitalize the entire word."

These seem to be the most relevant entries in the Oxford Dictionary of Linguistics if you want defs:

grammatical meaning: Any aspect of meaning described as part of the syntax and morphology of a language as distinct from its lexicon. Thus especially the meanings of constructions and inflections, or of words when described similarly. Such words include, in particular, ones belonging to closed rather than open classes, or those seen as marking a syntactic unit. Thus he has a grammatical meaning in opposition to other members of a closed class of personal pronouns; if as the marker e.g. of an indirect question in I asked if they were coming. A ‘grammatical word’ or ‘grammatical morpheme’ is accordingly a unit described, with whatever justification, in this mode. E.g., in the walls, both the and the plural inflection (-s) are distinguished as grammatical units from the lexical unit wall.

lexical meaning: Any aspect of meaning that is explained as part of a lexical entry for an individual unit: e.g. that of ‘to run’ in He ran away as opposed to that of ‘to walk’ in He walked away. Hence specifically in application to a lexical word or lexical morpheme as opposed to one which is assigned grammatical meaning: thus, in the same examples, of the meanings of the verbs and of the adverb away as opposed to those of the past tense or of he.

It looks like the most relevant Wikipedia articles would be functional morpheme and content morpheme... Umimmak (talk) 11:28, 24 November 2017 (UTC)
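As a rough illustration of the convention described above (lexical glosses in lower case, grammatical labels capitalized), here is a minimal Python sketch. The label set and the function name are hypothetical and chosen only for this example; plain text cannot show small capitals, so full capitals stand in for them, and only hyphen-separated morphemes are handled.

```python
# Hypothetical, deliberately tiny set of grammatical glossing labels.
# A real list (e.g. from a style sheet) would be much longer.
GRAMMATICAL_LABELS = {"pl", "caus", "aux", "det"}


def format_gloss(gloss: str) -> str:
    """Capitalize grammatical labels and lower-case lexical glosses.

    Words are separated by spaces, morphemes within a word by hyphens.
    """
    formatted_words = []
    for word in gloss.split():
        morphemes = [
            m.upper() if m.lower() in GRAMMATICAL_LABELS else m.lower()
            for m in word.split("-")
        ]
        formatted_words.append("-".join(morphemes))
    return " ".join(formatted_words)


print(format_gloss("det dog-pl"))  # DET dog-PL
```

The lookup against a fixed label set is an assumption made for brevity; it does not settle which items count as grammatical in a given analysis.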

A number of words, phrases, and morphemes mentioned by Umimmak were not set off from the text in any way; e.g., "he" in

Thus he has a grammatical meaning in opposition to...

I have italicized them. --Thnidu (talk) 04:22, 15 January 2018 (UTC)

@SMcCandlish: apparently I don't know how pings work. Take two. Umimmak (talk) 00:34, 25 November 2017 (UTC)

Got it. I'll go over that and revise the MOS line item to make clearer sense (if someone doesn't beat me to it). — SMcCandlish ☏ ¢ >^ʌⱷ҅_ᴥⱷ^ʌ< 00:58, 25 November 2017 (UTC)

I've completely overhauled the wording there to make sense to "mere mortals" [3]. Thanks for the help. — SMcCandlish ☏ ¢ >^ʌⱷ҅_ᴥⱷ^ʌ< 08:24, 27 November 2017 (UTC)

One last clarification on this part: There's an old instruction in there that "Transcription of logograms (as opposed to phonograms) can also be done with small caps or all caps." What applicability could this have here? I don't see this used in Wikipedia anywhere; all the direct representations of logograms are given "as they are" (樂) with the appropriate {{lang|zh}} or whatever markup (and many logogrammatic languages have no upper/lower case system, at least not in Unicode); Romanized transcriptions are given in italics (yuè); and English glosses [canonically] in single quotes ('music'). In actual practice, much of all three forms of markup is missing or wrong (e.g. double quotes on English glosses, and so forth). This was true at Logogram, which I just overhauled (other than things like yuè are not marked up as {{lang|zh-[something here]|yuè}}; I don't know the particulars of such stuff for Chinese).

Anyway, the mystery reference to logograms in the MoS wording has been commented out for now. — SMcCandlish ☏ ¢ >^ʌⱷ҅_ᴥⱷ^ʌ< 08:24, 27 November 2017 (UTC)

@SMcCandlish: Maybe it's about cuneiform? https://en.wikipedia.org/wiki/Cuneiform_script#Transliteration https://en.wikipedia.org/wiki/Sumerian_language#Sample_text Umimmak (talk) 08:30, 27 November 2017 (UTC)

@Umimmak: That sounds plausible, i.e. that it's an extension of the HIC IACET style for Classical Latin to other ancient languages, including those in other scripts. It seems a bit superfluous if so. However, something's going on at the second of those articles, with some stuff in this style and some not, and it's not clear [to me] what difference this is intended to signify (but it may be important to get this right): "30–31: SAḪAR.DU₆.TAKA₄-bi eden-na ki ba-ni-us₂-us₂". Whatever it is, this would surely be less annoyingly shouty as "30–31: SAḪAR.DU₆.TAKA₄-bi eden-na ki ba-ni-us₂-us₂". — SMcCandlish ☏ ¢ >^ʌⱷ҅_ᴥⱷ^ʌ< 09:08, 27 November 2017 (UTC)

Found a hint at Dingir: "By Assyriological convention, capitals identify a cuneiform sign used as a word, while the phonemic value of a sign in a given context is given in lower case." But there's no source for this. — SMcCandlish ☏ ¢ >^ʌⱷ҅_ᴥⱷ^ʌ< 09:10, 27 November 2017 (UTC)
Source: "Never put logograms in capitals: only uninterpreted sign names, and complex signs are in upper case [4]", which is not quite the same statement. And this appears to be a set of instructions for a special form of encoding, not for writing natural-language linguistic prose that includes some cuneiform. — SMcCandlish ☏ ¢ >^ʌⱷ҅_ᴥⱷ^ʌ< 09:16, 27 November 2017 (UTC)
Another, saying something related but different: 'If the letters that make up the transliteration are written in upper case, e.g., “PA” ..., then the transliteration merely refers to or represents the cuneiform sign without making any claim about how the sign is pronounced. Letters in lower case, e.g., “pa” ..., presuppose a phonetic interpretation on the part of the modern text editor.' [5]

Blatantly conflicting convention: "Akkadian words are given in italics, with logograms set in small capitals" [6], and "Transliterations: ... texts are set with Sumerian logograms in small capitals and Akkadian words in italics; unknown readings are given in large capitals." [7]
A third system, encountered in several works: "[D]ifferent formats are used to distinguish between Hittite words, Sumerograms, and Akkadograms ... [E]verything Hittite is lower case .... Sumerograms are given in roman capitals (in this book in small capitals: EN) .... Akkadograms are also capitalized but italicized ...."[8].
So, this is messy. I'm suspecting that similar conventions exist for other specialized areas of study; this stuff can probably just be an example in a footnote, to a line item that, in some wording, says something to the effect of "In particular linguistic subfields, like Assyriology^[fn1], there are special conventions for the use of all caps and sometimes small caps. When the convention is not distinguishing between all and small caps, normalize to small caps to be easier on readers' eyes. Regardless, use a consistent style throughout an article." Does that seem like a reasonable approach? — SMcCandlish ☏ ¢ >^ʌⱷ҅_ᴥⱷ^ʌ< 09:38, 27 November 2017 (UTC)

@SMcCandlish: Two relevant pages from Fortson [9]. Umimmak (talk) 09:21, 27 November 2017 (UTC) Addendum: If I am right and the MOS was in reference to writing Sumerograms, perhaps you should ask Wikipedia:WikiProject Ancient Near East as well. Umimmak (talk) 09:37, 27 November 2017 (UTC)

I'll do that, though I think this is not ultimately going to be entirely about that stuff, but just a general "don't use full-size ALL CAPS without reason, and use a consistent system intra-article" statement. — SMcCandlish ☏ ¢ >^ʌⱷ҅_ᴥⱷ^ʌ< 09:43, 27 November 2017 (UTC)

The crossposting has been done. — SMcCandlish ☏ ¢ >^ʌⱷ҅_ᴥⱷ^ʌ< 09:59, 27 November 2017 (UTC)

Draft:Comparison and Comparison (grammar)

I have started Draft:Comparison, and in the course of expanding it found that Comparison (grammar) is in poor shape as far as sourcing goes. Any help improving these would be appreciated. Cheers! bd2412 T 19:13, 22 December 2017 (UTC)

There is also an article at Comparative, which might need improvement, or perhaps merging into one of the articles BD2412 is working on. (There used to be (c. 2016) still another article at Superlative, but I merged it to Comparison (grammar).) Cnilep (talk) 23:46, 15 February 2018 (UTC)

Contradiction in Help:IPA/Inuktitut

Duplicate: Help talk:IPA#Contradiction in Help:IPA/Inuktitut

Chaos in romanization of Kazakh

New York Times article [10]. Short version: Kazakh is being romanized by 2025, away from Cyrillic. The plan has been to use diacritics, as in Turkish. The dictatorial president of Kazakhstan is trying to force this to instead be done with an F-load of apostrophes, which would interfere with things like search engines and generally make the language unreadable. There seems to be roughly 70% opposition to the idea, but he's powerful and may get his way unless he kicks the bucket in the interim. A short video lays out the issue (and you don't need to know the language to follow it) [11].

We should probably cover this at the article Kazakh language, and possibly also in summary at Kazakhstan, Kazakhs, and Nursultan Nazarbayev. It may also have implications for how we render Kazakh names in Latin transliteration on Wikipedia. — SMcCandlish ☏ ¢ >^ʌⱷ҅_ᴥⱷ^ʌ< 19:50, 16 January 2018 (UTC)

@SMcCandlish: There's already a fair bit of coverage, with refs, in Kazakh language#Writing system, including the one below. We could probably use several of them here.

Also, President Nazarbayev’s office opposed the linguists' diacritic proposal:

In August, the linguists proposed using an alphabet that largely followed the Turkish model.

The president’s office, however, declared this a nonstarter because Turkish-style markers do not feature on a standard keyboard.^[1]

Though getting and publicizing the necessary plug-ins and key combinations to manage the diacritical marks would be easier than getting search engines to respect Nazarbayev’s catapostrophic [sorry, couldn't resist] proposal (by a factor of about ∞:1), we really ought to mention this. --Thnidu (talk) 22:18, 16 January 2018 (UTC)

References

^ Higgins, Andrew (2018). "Kazakhstan Cheers New Alphabet, Except for All Those Apostrophes". The New York Times. ISSN 0362-4331. Retrieved 2018-01-16.

Sounds good, as to what to cover. As for what they'd need to do over in Kazakhstan, it's probably use Turkish keyboards or get some made that are close to them. I don't buy the "not found on standard keyboards" thing because they don't use "standard keyboards" in a Western sense, but mostly Cyrillic ones, and Kazakh is written in a variant of Cyrillic that, like many others, was specifically designed to conflict with neighboring variants to prevent pan-Turkic literature comprehensibility for political reasons. I.e., they already have a bear of a keyboard problem. — SMcCandlish ☏ ¢ >^ʌⱷ҅_ᴥⱷ^ʌ< 22:32, 16 January 2018 (UTC)

Bot for WP:WPENGLISH

I've made a request at WP:Bot requests#Tag talk pages of articles about English with Template:WikiProject English language to have the articles within the project scope bot-tagged, since doing it by hand or even with AWB might be an enormous amount of effort. I'm not sure if BOTREQ requires a showing of support before action is taken to implement a bot, but I get the sense that this might be the case. — SMcCandlish ☏ ¢ 😼 09:55, 23 January 2018 (UTC)

Template:Interlinear

We've now got a template for formatting interlinear glosses: {{interlinear}}. At this stage, it will be really helpful to receive some feedback on its overall structure, like the parameters used or the various default behaviours (for example with respect to the presence of free translations, or the formatting of glossing abbreviations). All these things will be difficult to change once the template becomes more widely used. Your input is welcome at Template talk:Interlinear. Bug reports or feature requests will be appreciated as well. – Uanfala (talk) 17:53, 28 January 2018 (UTC)

Beijing dialect and Beijing Mandarin

Input at Talk:Beijing dialect#Comparison of Beijing Mandarin and Beijing dialect would be appreciated. – Joe (talk) 17:21, 31 January 2018 (UTC)

Help with IPA

A user added an alternative pronunciation of a separate syllable next to the IPA in the opening sentence of Israel (changed from "Israel (/ˈɪzreɪəl/)" to "Israel (/ˈɪzriəl, -reɪ-/)"). I'm not sure this is how it works. The discussion is at Talk:Israel#Pronunciation. --Triggerhippie4 (talk) 11:45, 7 February 2018 (UTC)

Links to DAB pages

I am a WikiGnome, and perhaps waste too much of my life fixing links to DAB pages. I have collected links to several linguistics-related articles which contain {{disambiguation needed}} tags and which I dare not try to fix. Can any of you experts help resolve these problems? Search for "disam" in the articles listed below. If you solve a problem, take off the tag and post {{done}} here.

Done Q
Done Sama–Bajaw languages
Done Intensifier
~~ISO 639:k~~ ISO 639:kuq
Done Common European Framework of Reference for Languages
Done Yiwom language
Done Danke Schoen (I suspect that in this article, the answer might be to take off the link to Low German languages altogether. The article seems to be referring to those German dialects spoken in USA, not specifically to e.g. Plattdeutsch or Pälzisch.)

There may be another dozen or so links like these, which I will find during my rounds, and which I hope you experts can fix. Yrs, Narky Blert (talk) 22:45, 15 February 2018 (UTC)

My brother discovers a new grammar rule

Okay, so perhaps this is not the appropriate place to ask this, but I am asking anyway. My brother recently "discovered" a rule in English. I am wondering if this is a commonly known thing. It goes like this:

"The rule applies to pairs of two-syllable words that are spelled the same (homographic), are pronounced differently, where one word is a noun and the other is a verb.

The rule is "The noun accents on the first syllable, the verb accents on the second syllable."

Examples: reject, record, rebel, repeat, rerun, replay, redo, refuse, project, object, defect, produce, console, convert, contract, (undo ?)

Counter examples: I can find none."

Just for the record my brother is a mathematician and economist, but since our mother passed away he has had to pick up the mantle of family grammarian. Thanks, Einar aka Carptrash (talk) 05:59, 2 March 2018 (UTC)

I don't know how commonly known this thing is to English speakers (after all, grammars of natural languages are immensely complex and native speakers get by perfectly well without the need to be aware of it all), but it's certainly common knowledge among learners of English past the intermediate stage. As far as I'm aware, this rule affects mostly vocabulary of Latin/French origin (though there are exceptions like "uplift"), and usually it's the verb that is historically earlier. There are also pairs of disyllabic words without a change in stress ("access"), and pairs of longer words with a similar stress shift ("àttribute" vs. "to attrìbute"). Anyway, English is a really well studied language, so it's unlikely that someone can discover a hitherto unknown rule unless they're looking at an extremely obscure variety of English (like the jargon employed by workers in the collieries of southern Alabama), or it's a case of an extremely subtle phenomenon that arises in particularly complicated syntactic or semantic contexts. Still, even if the rules are widely known among linguists, it doesn't mean that they have been easy to figure out. For a native speaker of a language, it's generally pretty hard to become aware of even a tiny fraction of the rules that they implicitly use in every utterance. Discovering these rules is challenging, but fun, and I'm sure there's plenty more in store for your brother. – Uanfala (talk) 00:53, 3 March 2018 (UTC)

Thank you Uanfala, but please do not encourage my brother. Life is hard enough as it is. Carptrash (talk) 17:52, 3 March 2018 (UTC)

Wikipedia already has an article about this if you're curious: Initial-stress-derived noun. Umimmak (talk) 01:00, 3 March 2018 (UTC)

One of the wonders of wikipedia, @Umimmak: is that if I go to the right place, in this case HERE, I can learn, or someone will point me, to just about anything. I will pass this link on. And if there is anything you'd like to know about architectural sculpture . . . . . . . .............. Carptrash (talk) 17:56, 3 March 2018 (UTC)
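For readers who want the rule from this thread in a concrete form, here is a minimal Python sketch. It is only an illustration: the word set is a subset of the examples listed above, the function name is made up for this example, and anything outside the set returns None because, as noted in the replies, some pairs (such as "access") do not shift stress at all.

```python
# A subset of the two-syllable noun/verb pairs given as examples in the thread.
STRESS_SHIFTING_PAIRS = {
    "reject", "record", "rebel", "refuse", "project", "object",
    "defect", "produce", "console", "convert", "contract",
}


def stressed_syllable(word, part_of_speech):
    """Return 1 or 2 (the stressed syllable) under the rule, or None.

    None means the word is not in the illustrative list, so the rule
    as stated here makes no prediction about it.
    """
    if part_of_speech not in ("noun", "verb"):
        raise ValueError("part_of_speech must be 'noun' or 'verb'")
    if word.lower() not in STRESS_SHIFTING_PAIRS:
        return None
    # The noun stresses the first syllable, the verb the second.
    return 1 if part_of_speech == "noun" else 2


print(stressed_syllable("record", "noun"))  # 1
print(stressed_syllable("record", "verb"))  # 2
```

Using an explicit word list rather than a spelling-based test is a design choice of this sketch: the pattern is lexical, so it cannot be read off the spelling alone.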

Dakhini

Can anyone look at the recent edits to the article Dakhini? There is a "The Legend" section with poetic descriptions, population of "Kafir" speakers and other changes that seem problematic to me. utcursch | talk 16:44, 6 March 2018 (UTC)

I've removed most of them. I've left the "Legacy" section untouched though: some of it is sourced and the rest of it might as well turn out to be alright, but I think it needs a closer look from someone more familiar with the topic. – Uanfala (talk) 19:54, 6 March 2018 (UTC)

LoveVanPersie's disruptions

There's a thread on Administrators' Noticeboard concerning LoveVanPersie's disruptions. He's posted over 50 incorrect transcriptions in the last 4 months. Please join the discussion if you have anything to contribute. Thank you. Mr KEBAB (talk) 02:13, 16 March 2018 (UTC)

Proposed change to "Affect (linguistics)" page

I think there is a mistake on the "Affect (linguistics)" page, in the section where it discusses Korean.

Specifically, where it says:

맛있잖아 Masi-ittjianha (lit. "It's not delicious," but connotes "It's delicious, no?")

There are two problems. First, the "–잖아" (–jana) ending is used to indicate something the speaker thinks the listener is (or should be) aware of already^[1]^[2], not as a tag question (as the original writer seems to have intended). Second, the adjective "맛있다" (masitda) means the food is delicious, not that it is not. (I think the original writer meant to use "맛없다" (mateopda) which would mean "not delicious"). So the meaning of what the original writer wrote is actually "It is delicious, you know." To say, "It's delicious, no?" (a tag question seeking confirmation), it should be "맛있지?" (masitji) because "–지" is the ending used in Korean for that purpose.^[3]^[4]

Additionally, the Romanization the original writer used is confusing. I've put corrections below:

맛있어요 "Masi-issoyo" should be "masisseoyo"

맛있군요 "Masi-ittgunyo!" should be "masitgunyo!"

맛있잖아 "Masi-ittjianha" should be "masitjana" (but when corrected to "맛있지?" as above, it would be "masitji?")

맛이 없다 "Masi-eopda" should be "mas-i eopda"^[5]

24.124.60.249 (talk) 04:39, 25 March 2018 (UTC) 24.124.60.249 (talk) 04:46, 25 March 2018 (UTC) Em

^ "Lesson 90: The meaning of ~잖아(요)". How to Study Korean. Retrieved 25 March 2018.
^ Korean: A Comprehensive Grammar. Routledge. 2011. p. 377. ISBN 978-0-415-60385-0.
^ "Lesson 93: ~지 and ~죠". How to Study Korean. Retrieved 25 March 2018.
^ Korean: A Comprehensive Grammar. Routledge. 2011. p. 379. ISBN 978-0-415-60385-0.
^ "Revised Romanization of Korean: Transcription rules". Wikipedia.

I have copied this comment to Talk:Affect (linguistics). Editors interested in editing the page are likely to see it there. Cnilep (talk) 07:48, 26 March 2018 (UTC)

Thank you! 24.124.60.249 (talk) 04:46, 25 March 2018 (UTC) Em
