Talk:Corpus linguistics


Wiki Education Foundation-supported course assignment

This article was the subject of a Wiki Education Foundation-supported course assignment, between 22 January 2020 and 9 May 2020. Further details are available on the course page. Student editor(s): Nic Willow. Peer reviewers: Jkakajajk.

Above undated message substituted from Template:Dashboard.wikiedu.org assignment by PrimeBOT (talk) 18:30, 16 January 2022 (UTC)

deleted text

I deleted the sentence "Intuition is notoriously unreliable when it comes to making judgments about language." because, although it is true, it did not fit with the surrounding text. In fact, I don't think it's relevant to this particular article at all, even though it is certainly a valid linguistic fact.

lexis

One important finding of corpus linguistics is the interdependence of syntax and lexis, often referred to as lexico-grammar: words tend to occur in specific syntactic patterns, and these patterns are shared by words which share aspects of meaning as well.

Can someone tell me what is meant by the word "lexis"? Is this what American linguists would call "lexical semantics"? If so, then the sentence with the word "lexis" is claiming that the meanings of words and the syntactic forms in which those words occur are interdependent. This may or may not be true, but could someone explain how this is supported by corpus linguistics? Do we have names of particular researchers, or particular studies?

I'm an American with some knowledge of linguistics, and I've never heard the term. (I have heard the term "lexical semantics"; there is a class on it at my university.) Since I also don't understand what "lexis" means, or how corpus linguistics supports the interdependence of anything, and since your question has gone a long time with no reply, I'm deleting the passage. --Ryguasu 02:20 Dec 6, 2002 (UTC)
The word "lexis" refers to the lexicon. Linguists differentiate between syntax (rules) and the lexicon (words).
In addition, the remark about intuition may have been there to reinforce the reason why people use corpus linguistics: it's a way of accounting for linguistic phenomena that relies on empirically derived facts, not on intuition. I'll see what I can do about a re-edit. --Allolex

link

Though I wasn't able to find a good place to include this link, http://www.spaceless.com/concord/, I think it's an important tool for archivists...

Chomsky and Corpus Linguistics

The article states: "The approach runs counter to Noam Chomsky's view that real language is riddled with performance-related errors, thus requiring careful analysis of small speech samples obtained in a highly controlled laboratory setting." When did Chomsky say this and where? Do the two approaches contradict each other or do they complement each other? --Hutschi 10:47, 7 Jul 2004 (UTC)

Further, I would like to know what "performance" means to linguists.

Performance & Chomsky

"Performance" is contrasted with "competence". Chomsky believed/believes that the language "module" of the brain could be described in terms of a predictable machinery, like a computer. The access a healthy human has to this language module in the brain is the competence. "Performance" is what you get when the output of the language module has to go through all the intermediaries before the proper sounds actually get into the air. So the brain, having generated something to say, passes the linguistic utterance outside the language box, where it can be corrupted by memory loss, tongue/mouth/motor function imperfection, etc. So what is observable, an imperfect utterance, is not representative of the linguistic competence of the speaker. This is what Chomsky wished to capture. --Temposs 07:51, 6 April 2007 (UTC)

In 1963, Chomsky rejected corpus linguistics in a way that some scholars still find insulting, and so they in turn reject Chomskian ideas. The above quote, in particular, is indicative of just how badly Chomsky got it wrong. Contrast John Sinclair and his motto, "trust the text". There's a lively debate: see the International Journal of Corpus Linguistics, upcoming issue (2010), the "bootcamp" debate.
I'm not sure who takes Chomsky seriously any more; it seems to me that things have moved on. Complementary approaches are better seen in Stefan Th. Gries, who works on psycholinguistics/cognitive linguistics founded on corpus linguistics as both theory and methodology.
This article should mention the "neo-Firth" movement (Wolfgang Teubert and Bill Louw) of "corpus linguistics as a theory" (coming through John Sinclair) vs. "corpus linguistics as a methodology" (where it's used as a set of techniques, tools and data instead of as a thing in itself). linas (talk) 14:58, 23 April 2010 (UTC)

deleted text

Deleted the sentence: "The core of a corpus is the derivation of a set of Part-of-speech tags, representing a formal overview of the various types of words and word-relationships in a given language." as it is not really true. Most corpora are not annotated, and deriving tags is only a minor research interest.

Some sites on Corpus

Here are some sites on corpora, though I am not sure whether they are useful.

http://www.tu-chemnitz.de/phil/english/chairs/linguist/independent/kursmaterialien/language_computers/whatis.htm
http://www.cambridge.org/elt/corpus/what_is_a_corpus.htm
http://www.tlumaczenia-angielski.info/linguistics/corpus.htm 

Verycuriousboy (talk) 14:41, 9 June 2009 (UTC)

External links modified

Hello fellow Wikipedians,

I have just modified 6 external links on Corpus linguistics. Please take a moment to review my edit. If you have any questions, or need the bot to ignore the links, or the page altogether, please visit this simple FaQ for additional information. I made the following changes:

Cheers. --InternetArchiveBot (Report bug) 09:20, 13 August 2017 (UTC)