Talk:Word-sense disambiguation



'We shall deal with the deep approaches first'? Is this a transcription of a university professor's lecture, or what? Aardark 17:04, 11 June 2006 (UTC)

Homographs vs. other WSD tasks

The "bass" example is a homograph, while sense disambiguation often deals with finer distinctions (like the people sitting at a table versus the actual table).

A song could be about fish; the first example isn't explicit enough to claim that all people recognize it as the musical type of bass.

I really wish there were a disambiguation:disambiguation section. —Preceding unsigned comment added by (talk) 03:32, 16 November 2007 (UTC)

Bad example

It's not that hard to disambiguate "The dog barks at the tree", if you are using a tagger in the disambiguation process. 'bark' is a verb and as such can only be of the sense 'dog noise'. The examples given in the article don't adequately describe the challenges of WSD. -- (talk) 04:41, 10 March 2008 (UTC)
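
The tagger argument above can be illustrated with a minimal sketch. It assumes a hypothetical toy sense inventory keyed by (lemma, part of speech); the names `SENSES` and `candidate_senses` are made up for this illustration and do not come from the article or from any library. Once a tagger has labeled "barks" as a verb, only one candidate sense is left, whereas the noun reading would still be ambiguous.

```python
# Hypothetical toy sense inventory: (lemma, part of speech) -> candidate senses.
# The entries are illustrative assumptions, not data from a real lexicon.
SENSES = {
    ("bark", "VERB"): ["dog noise"],
    ("bark", "NOUN"): ["dog noise", "tree covering"],
}


def candidate_senses(lemma, pos=None):
    """Return the candidate senses for a lemma, optionally restricted by POS tag."""
    if pos is not None:
        return list(SENSES.get((lemma, pos), []))
    # Without a tag, pool the senses of every part of speech (order kept, no duplicates).
    pooled = []
    for (entry_lemma, _), senses in SENSES.items():
        if entry_lemma == lemma:
            for sense in senses:
                if sense not in pooled:
                    pooled.append(sense)
    return pooled


if __name__ == "__main__":
    print(candidate_senses("bark", "VERB"))  # the tagger has already done the work
    print(candidate_senses("bark"))          # untagged: still ambiguous
```

In this toy setting the tag alone resolves the verb, which is the commenter's point: an example that is settled by part of speech does not show the harder cases where several senses share the same tag.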

Mihalcea's algos

There are other algos, and some of the most interesting and seemingly most successful have been proposed by Rada Mihalcea. The article should mention, if not review, these. I'm also thinking that the current algo description is over-long, wandering off into excess detail for what should be a more general article? linas (talk) 23:38, 9 May 2008 (UTC)


Is this an English word or a neologism? —Preceding unsigned comment added by (talk) 19:34, 14 September 2008 (UTC)

Disambiguation ("remove uncertainty of meaning from (an ambiguous sentence, phrase, or other linguistic unit)." in Oxford American Dictionary) is a real word. DEddy (talk) 20:07, 14 September 2008 (UTC)

It could still be a neologism. I wonder if it existed before W'pedia. It is pretty crazy. What on earth does it mean? Is there an *ambiguation? May one *ambiguate? Sure, there's ambiguity but what is this? Above we have 'It's not that hard to disambiguate' but does anyone ever say "disambiguate" out loud? Have you ever heard anyone say it with a straight face?
If you travel in the "right" circles it is used out loud, with a straight face & correctly. It rather depends who you rub elbows with. Some folks gravitate to big words, while others don't. It most certainly existed looooong before Wikipedia. DEddy (talk) 01:24, 23 March 2010 (UTC)
If it were a neologism, dictionaries would say so. English is highly irregular. For example, someone can be 'inept'. Can they be 'ept'? Does the illness 'distemper' imply a medical meaning for 'temper'? 'Disadvantaged' is used far more often than 'advantaged' (although that may be because of the disparity between the rich and poor). I have heard others say 'disambiguation' and 'disambiguate' many times, particularly in computer science classes. I myself say 'disambiguate' whenever it is the best word to use; I believe my face is always straight when I say it. Our use of words depends on many factors, such as the extent of our education (including whether we paid attention in class and did our homework or not), the speech patterns of the people around us during childhood as well as adulthood (adultery?). David Spector 01:01, 23 March 2010 (UTC)

Requested move

The following is a closed discussion of the proposal. Please do not modify it. Subsequent comments should be made in a new section on the talk page. No further edits should be made to this section.

The result of the proposal was not moved per WP:COMMONNAME argument below. --rgpk (comment) 18:05, 8 February 2011 (UTC)

Word sense disambiguation → Word-sense disambiguation — per WP:HYPHEN; it's a compound adjective. Yes, it's commonly used without the hyphen, but the hyphen is correct. It doesn't detract from recognizability, and there will be a redirect from the non-hyphenated version. This move was done BOLDly in February 2009 and reverted in May 2009 without discussion. And, please feed the animals, especially crackerjacks. --Pnm (talk) 19:50, 25 January 2011 (UTC)

  • Oppose per WP:COMMONNAME; a Google search shows that this is not the form used. Further, Wikipedia is not prescriptive: it uses the common form or descriptive titles, not prescribed titles. English does not have a central language authority to dictate terms of use, unlike French (which has three such institutes) or German (which has two), or Spanish (in which each country has its own). (talk) 06:21, 26 January 2011 (UTC)
  • Oppose; as the editor from IP notes, we are not prescriptivist, so "but it's correct" is not an argument for a move. Gavia immer (talk) 08:20, 26 January 2011 (UTC)
  • Oppose - per WP:COMMONNAME and WP:USEOFHYPHENSISCHANGING. – ukexpat (talk) 20:19, 26 January 2011 (UTC)
    • WP:COMMONNAME is irrelevant. The proposed move is about punctuation. --Pnm (talk) 22:18, 26 January 2011 (UTC)
With all due respect it is completely relevant. Usage of hyphens in such compounds is, like it or not, changing and the more common usage now, at least with respect to this expression, appears to be not to use the hyphen.  – ukexpat (talk) 22:30, 26 January 2011 (UTC)
I partly agree. Wikipedia hyphen style conflicts with common usage. The question is whether, in this case, adopting common usage is more important than following the style guide. The name is set – we're not debating an alternative name, the subject of WP:COMMONNAME – we're debating punctuation, a style issue which is not discussed in WP:TITLE. --Pnm (talk) 23:08, 26 January 2011 (UTC)
  • Support: My understanding is the Manual of Style has precedence over common usage. I would expect this would be particularly preferable when adhering to the MOS would not make it overly difficult for those familiar with the common usage to find the article they are looking for. –CWenger (talk) 23:31, 29 January 2011 (UTC)
    • No, it doesn't. WP:AT is a policy, while WP:MOS is a guideline. WP:PG clearly shows that policy trumps guideline, hence WP:COMMONNAME trumps WP:HYPHEN. (talk) 04:39, 30 January 2011 (UTC)
      • The policy is often broken then. For example, Wikipedia uses en dashes in article titles quite frequently, yet they are somewhat rare on the outside. Just do a Google search for "Michelson–Morley experiment" to see evidence of that. On the first page of results, Wikipedia is the top hit and it uses the en dash as per the example in the MOS. Of the rest, one uses a space and the rest use a hyphen. –CWenger (talk) 07:16, 30 January 2011 (UTC)
        • It's not so much that the naming policy is broken as that the MOS has been hijacked by typographical fetishists. olderwiser 12:58, 30 January 2011 (UTC)
        • The policy is not broken. Using dashes where it is not common should only be done with a consensus discussion that allows it, per WP:IAR. If it is done without support, it can be reverted per common name policy. This is a consensus discussion, so can establish whether IAR or COMMONNAME applies. (talk) 04:32, 31 January 2011 (UTC)
        • Rather, any guideline is broken whenever it contravenes policy. (talk) 04:33, 31 January 2011 (UTC)
The above discussion is preserved as an archive of the proposal. Please do not modify it. Subsequent comments should be made in a new section on this talk page. No further edits should be made to this section.

Relation to polysemy

Deleted the reference to polysemy 6/16 because, technically speaking, the term refers only to groups of meanings that are distinct but related in some relevant way (cf. the Wikipedia Polysemy entry). Thus 'mole' the animal and 'mole' the scientific unit are not classified as polysemous but as non-polysemous homonyms. Yet the two are grist for the disambiguation mill like any other potential ambiguities at the word level. Experts please verify, and please excuse gaffes (this is my very first Wiki edit). Astigmatist (talk) 21:41, 16 June 2012 (UTC)

Recent revert of section blanking

When there is no discussion about removing sections of the article, and there are no edit summaries for the numerous changes by a single editor, then I must quickly revert (at least temporarily) without question. Discuss?? — CpiralCpiral 17:47, 20 September 2012 (UTC)

Sorry for blanking the section without an explanation. But there is a valid explanation. The previous version of the WSD evaluation section contained outdated evaluation descriptions and links that were more than 10 years old, and most techniques for evaluating WSD have changed drastically, depending on the variety of WSD task. So I figured it is only appropriate to discuss WSD evaluation separately for each variant of WSD. Most of the original content of the WSD evaluation section has now been moved to Classic Monolingual WSD, because the monolingual WSD task is commonly referred to as "classic" in the recent evaluation workshops, see Alvations (talk)

Simplifying the language

I think the language needs to be simplified here. It is not only a bit flowery but also hard to understand. For instance, what exactly do you mean by POS-tagging and WSD making constraints to each other? I am not a specialist in WSD, so I cannot do a lot of editing without risk of introducing errors. — Preceding unsigned comment added by Srchvrs (talk • contribs) 00:19, 28 October 2013 (UTC)

Deep vs Shallow approaches

It is clearly a very subjective division, especially if you claim that deep approaches require a comprehensive knowledge base. For example, the deep QA architecture of IBM Watson does not rely on such knowledge. Yet, it is considered deep enough. At the very least, you cannot generalize and say such a distinction is universal in NLP (using a knowledge base as the single criterion). Going down the rabbit-hole, you can use only surface forms and basic co-occurrence statistics. But you can also do some basic semantic parsing. Or syntactic parsing. How deep would that be? Some knowledge is there, so it is not exactly shallow.

A better classification is given here:

— Preceding unsigned comment added by Srchvrs (talk • contribs) 00:30, 28 October 2013 (UTC)
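
For readers following this thread, the shallowest end of the spectrum mentioned above (surface forms and co-occurrence only) might look like the following minimal sketch. It is a simplified overlap count in the spirit of Lesk-style methods, not the article's algorithm; the `SIGNATURES` table and the function `pick_sense` are hypothetical, hand-made assumptions that reuse the "bass" example discussed earlier on this page.

```python
# Hypothetical hand-made "signature" words for each sense of "bass".
# A real system would derive these from a dictionary or a corpus.
SIGNATURES = {
    "bass (fish)": {"fish", "lake", "river", "caught", "fishing"},
    "bass (music)": {"guitar", "band", "song", "play", "music"},
}


def pick_sense(context, signatures=SIGNATURES):
    """Pick the sense whose signature shares the most surface words with the context."""
    words = set(context.lower().split())
    scores = {sense: len(words & signature) for sense, signature in signatures.items()}
    best = max(scores, key=scores.get)
    # No overlap at all means there is no evidence for any sense.
    return best if scores[best] > 0 else None


if __name__ == "__main__":
    print(pick_sense("he caught a bass in the lake"))
    print(pick_sense("a song about bass"))
```

The second call also shows the weakness raised under "Homographs vs. other WSD tasks": a song about a fish would be assigned the musical sense, because the method sees only the word "song" and uses no deeper knowledge.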