Talk:Central tendency

WikiProject Statistics (Rated Start-class, High-importance)

This article is within the scope of the WikiProject Statistics, a collaborative effort to improve the coverage of statistics on Wikipedia. If you would like to participate, please visit the project page or join the discussion.

This article has been rated as Start-Class on the quality scale.
This article has been rated as High-importance on the importance scale.
 

used by statisticians

A Google search for "central tendency" and "department of statistics" will demonstrate that the term is used by statisticians. Jfitzg

OK... what's your point? Fresheneesz 23:34, 18 March 2006 (UTC)

Merge with average

It looks like average and central tendency mean the same thing. If that's the case, they should be merged. The article on central tendency is so small that it would be an easy merge. Anyone agree? Fresheneesz 23:34, 18 March 2006 (UTC)

Yes, definitely needs a merge. If it is in some way different from an average, then by all means let's point out the difference. -- Derek Ross | Talk 23:42, 18 March 2006 (UTC)
They are not the same thing. The central tendency can be the mean, median, or mode, depending on the situation. So sometimes they are the same thing, but not always. There's a clear and concise explanation at: [1]. BTW, I have no affiliation with the site; it just made it very clear for me.
It seems you read mean, but Fresheneesz said average -- which can be the mean, median, mode, etc. Fgnievinski (talk) 07:11, 7 March 2013 (UTC)
I agree with the merge; I've tagged the articles accordingly. Fgnievinski (talk) 07:11, 7 March 2013 (UTC)
Don't Merge to Average or you'll get trampled by a horde of angry statisticians. Measure of Central Tendency is the term they use to cover mean, median, mode etc. Average is a colloquial term which almost always means Arithmetic mean but is occasionally carelessly used for one of the other measures. So I think most of the material from Average should be moved here and that article should be cut down to size, with a clear pointer to here. Dingo1729 (talk) 14:54, 8 March 2013 (UTC)
Don't merge. I agree with the approach outlined by Dingo1729 above. --Jeff Ogden (W163) (talk) 16:56, 8 March 2013 (UTC)
Don't merge. I also agree with the approach outlined by Dingo1729 above, and for the same reasons. Duoduoduo (talk) 22:40, 19 April 2013 (UTC)

Merge template removed. Melcombe (talk) 23:22, 19 April 2013 (UTC)

Melcombe stopped this merge discussion by simply removing the templates from this and the other article. That's clearly totally not following the process, but I'm not inclined to revert him because I'm involved and I also agree with him that this should not be merged. Please read up on WP:MERGE if you want to close a merger discussion. If anyone else wants to revert, they can. Counting votes, it was only a 3 versus 2 discussion. But I don't think it will change the final outcome. Dingo1729 (talk) 16:34, 21 April 2013 (UTC)
Merge. The need is more clear now after recent edits, re. wide- vs. narrow-sense centrality. I agree with Dingo1729's reverse merge proposal. In fact, I think contents can be distributed between mean and central tendency. I've tagged sections accordingly. Average should be a much leaner article. Fgnievinski (talk) 20:19, 27 April 2013 (UTC)

Cluster?!

The first sentence currently says

In statistics, the term central tendency relates to the way in which quantitative data tend to cluster around some value.

with a citation to

Dodge, Y. (2003) The Oxford Dictionary of Statistical Terms, OUP. ISBN 0-19-920613-9

Apart from the fact that the article should start out as "In statistics, a central tendency of a set of qualitative data is a value which....", this reference to clustering around the central tendency is just wrong, despite being referenced. It is quite possible for the central tendency to be far away from any of the data points -- e.g., consider the mean or median of -100, -100, 100, 100. So that's why I'm changing this even though, to my chagrin, it's been in here for years. If anyone can come up with a better version than mine, feel free. Duoduoduo (talk) 22:17, 19 April 2013 (UTC)
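
The counterexample in the comment above can be checked with a minimal sketch. It uses only Python's standard `statistics` module; the variable names are illustrative assumptions and nothing here comes from the cited dictionary.

```python
from statistics import mean, median

# The four data points from the comment above.
data = [-100, -100, 100, 100]

center_mean = mean(data)      # arithmetic mean of the data
center_median = median(data)  # midpoint of the two middle values

# Distance from the mean to the closest observation.
nearest_gap = min(abs(x - center_mean) for x in data)

print(center_mean, center_median, nearest_gap)
```

Both measures come out at zero while every observation lies 100 units away, which is the sense in which the data do not cluster around either value.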

This is nonsense. The source is highly reputable. Don't change the meaning unless you can provide a source. "Central tendency" is not the same as "average" in any sense. Melcombe (talk) 23:14, 19 April 2013 (UTC)
Calm down, Melcombe! Be nice. We're both working to improve things, and there is no need to make things unpleasant for someone who you know perfectly well always edits in good faith.
You haven't answered my objection I stated above: if the article is right in saying that the mean can be a measure of central tendency, how can the central tendency be a cluster value given that sometimes the data do not cluster around the mean?
And if you object to one edit, you should just revert that edit rather than reverting three of them. For example, you say that central tendency is not the same as average in any sense, and yet you reverted my edit that replaced "average" with "central tendency". Duoduoduo (talk) 23:30, 19 April 2013 (UTC)
And are we even sure that the cited source said what the restored lead sentence claims? Maybe it said something like "Measures of central tendency are numbers that tend to cluster around the "middle" of a set of values" (i.e. the mean, median, mode, etc. tend to cluster together) rather than "central tendency relates to the way in which quantitative data tend to cluster around some value." Given the sloppy wording of the passage that you restored, presumably it was paraphrased from the cited source, and may not faithfully represent what it said. Duoduoduo (talk) 23:55, 19 April 2013 (UTC)
I am sure as I have seen the source. The reason sources are given on Wikipedia is so that people like you can check out the information for themselves, which you should do. It is certainly unhelpful to replace information from a reliable source with something that is a sloppily expressed version of something that you vaguely recall and for which you can't provide a source. Melcombe (talk) 10:34, 20 April 2013 (UTC)
I think that some of the confusion may be caused because the term "Central tendency" is almost never used on its own. In my experience it's always "measure of central tendency". I'm inclined to think that the article should be titled Measure of central tendency but I'm reluctant to rename the article if there are objections. Dingo1729 (talk) 01:39, 20 April 2013 (UTC)

OK, let's start from basics. If a population or set of data does not have a central tendency (i.e., does not have the property of clustering around a central value), then it will be misleading or just silly to try to use a measure of central tendency to describe the population data, at least not without a great deal more descriptive information being supplied at the same time. Hence the first consideration here is "does this population have a central tendency", not "how do we quantify a typical value for members of the population". Melcombe (talk) 10:34, 20 April 2013 (UTC)

Requested move

The following discussion is an archived discussion of a requested move. Please do not modify it. Subsequent comments should be made in a new section on the talk page. Editors desiring to contest the closing decision should consider a move review. No further edits should be made to this section.

The result of the move request was: As nominator, I'm closing this as don't move. There was only a single !vote, much of which is moot because of changes to the article. Nobody else seems to care and I have come to realise that I don't care enough to push for the move. Dingo1729 (talk) 04:02, 7 May 2013 (UTC) (non-admin closure) Dingo1729 (talk) 04:02, 7 May 2013 (UTC)



Central tendency → Measure of central tendency: Measure of central tendency is a well defined and well referenced statistical concept. It is also what this article is about. On the other hand Central tendency is vague and open to interpretation. It might be debatable whether a particular distribution has or does not have a central tendency. But even if we agree that it doesn't have a central tendency it still certainly has its "Measures of central tendency". It's just the way that things have been defined. I know that it's tempting to think that if something has a measure then the thing itself must be well defined. But in this case it really isn't so. It's just a Math thing. Relisted. BDD (talk) 17:17, 1 May 2013 (UTC) Dingo1729 (talk) 15:54, 20 April 2013 (UTC)

Survey

Feel free to state your position on the renaming proposal by beginning a new line in this section with *'''Support''' or *'''Oppose''', then sign your comment with ~~~~. Since polling is not a substitute for discussion, please explain your reasons, taking into account Wikipedia's policy on article titles.
  • Oppose The article contains sources specifically for "central tendency". No sources have been given for "measure of central tendency". The article is about a quality a distribution might have, in the same way that "skewness" and "kurtosis" are qualities, not particular ways of measuring these things. Note that Wikipedia is not a mathematics encyclopedia so "It's just a Math thing" doesn't wash. And, if you don't know what you are trying to measure, how can you measure it? Melcombe (talk) 16:28, 20 April 2013 (UTC)

Discussion

Any additional comments:

If "Measure of central tendency is a well defined and well referenced", let's see some references. Note that the article has sources for the terms "measure of location" and "measure of spread" so it might be better to start separate articles for these. Melcombe (talk) 16:28, 20 April 2013 (UTC)

"Start" articles for these? Statistical dispersion. Location parameter. Duoduoduo (talk) 21:10, 20 April 2013 (UTC)
The above discussion is preserved as an archive of a requested move. Please do not modify it. Subsequent comments should be made in a new section on this talk page or in a move review. No further edits should be made to this section.

Lead sentence

Since Melcombe has reverted my effort to improve the first sentence, someone needs to go into it and fix the style problem I identified earlier on this talk page. Wikipedia articles are not supposed to start out "In statistics, the term central tendency relates to the way in which...." They are supposed to start out "In statistics, a probability distribution's central tendency is...." The fact that the article has long used "relates to" instead of "is" may be due to Dingo1729's observation that "Central tendency is vague and open to interpretation", and thus is hard to define. Duoduoduo (talk) 21:28, 20 April 2013 (UTC)

Needs to answer these two questions

Currently the article just mentions the idea that a central tendency may or may not exist for a distribution:

If a central tendency does not exist for a population, its distribution may be bimodal, multimodal or U-shaped.

But after the lead, the three main sections and sub-sections are "Measures of central tendency", "Measures of location", and "Measures of spread". Melcombe argues above that the article should keep its current title "Central tendency" rather than "Measure of central tendency". If that view prevails, then the article really needs to answer these two questions:

  • How do we determine whether a distribution has a central tendency?
  • If we determine that it does indeed have one, how do we determine which choice or choices of central tendency are valid to use for this distribution? Duoduoduo (talk) 21:40, 20 April 2013 (UTC)

Two meanings

There are two meanings of central tendency or statistical centrality sourced in the literature. Narrow-sense centrality is synonymous with center, representative, or typical value. It is specified in terms of a location parameter (the reverse is not necessarily true). As such it is also synonymous with average and can thus be realized by any of its variants: mean, median, mode, etc. The wide-sense meaning, arguably closer to the dictionary definition of tendency and centrality, encompasses not only the centrality location but also the degree of centrality, i.e., the statistical dispersion. (I hope this is acceptable under WP:SYNTHESIS.) Fgnievinski (talk) 07:15, 24 April 2013 (UTC)

References for narrow-sense meaning: [2] [3] [4] [5] [6] [7] Cambridge Dictionary of Statistics: "Central tendency - A property of the distribution of a variable usually measured by statistics such as the mean, median and mode."

References for wide-sense meaning: [8] [9] [10] [11] Oxford Dictionary of Statistics: "central tendency (centrality) - The tendency of quantitative data to cluster around some central value. The central value is commonly estimated by the mean, median, or mode, whereas the closeness with which the values surround the central value is commonly quantified using the standard deviation or variance."

Thanks for the references. My interpretation, looking at some other references too, is that the "narrow-sense" is the primary meaning and that the other meaning seems to occur in some encyclopedias but not commonly otherwise. Also "measures of central tendency" is much more frequent than "central tendency" in isolation. I also looked up "Tendance centrale" in the French Wikipedia, since the OECD reference mentions the French phrase. My French is rusty, but I think they only acknowledge the narrow meaning. Dingo1729 (talk) 16:42, 24 April 2013 (UTC)
I agree with Dingo. While I don't particularly like the word "average" since for some people it is a synonym for "mean" and for others it means "mean, median, or mode", over the past forty years I have seldom heard of the broad sense encompassing dispersion. I've occasionally heard the term "cluster around" but it has always seemed like an informal freshman oversimplification.
I think it would be a good idea to start the article with something like what I put in a few days ago, or an improved version of it maybe quoted from one of the sources, with citations, and then mention that sometimes the term is used in the broader sense that more nearly matches the common word "tendency", again with citations.
As for the observation that "measure of central tendency" -- in the sense of mean, median, or mode but not in the sense of dispersion -- is much more common than "central tendency" by itself, this is certainly very true. But I think it would be a mistake to rename the article "Measure of central tendency" because we already have the article Location parameter. Duoduoduo (talk) 17:27, 24 April 2013 (UTC)

I'm not happy with the current version of the "wide sense" meaning in the article. The definitions from the sources are all very similar. Here's one:

"The tendency of quantitative data to cluster around some random variable value. The position of the central value is usually determined by one of the measures of location such as the mean, median or mode. The closeness with which values cluster round the central value is measured by one of the measures of dispersion such as the mean deviation or standard deviation."

I think that the only part of this that can be salvaged is the first sentence. I don't read the definition as saying that a central tendency consists of two numbers, a position and a spread. Also, if we're thinking of "central tendency" in this way, I don't believe the actual value of the standard deviation tells us anything. Surely a normal distribution has the same amount of "central tendency" independent of a scale factor? I'm assuming that this is in contrast to, say, a Bernoulli distribution which, so far as I can tell, this definition would say doesn't have a central tendency. If we really want to distinguish between situations like this we'd have to look at something like the coefficient of kurtosis. But that's far into WP:OR.

So I'm changing this part of the leading paragraph again. Dingo1729 (talk) 02:39, 2 May 2013 (UTC)

The quote has three sentences: (1) "The tendency..."; (2) "The position..."; (3) "The closeness..." Rejecting (3) is WP:CHERRYPICKING IMHO. And just stating (1) -- "central tendency means tendency to cluster around center" -- is [circular definition]. This sourced definition does affirm that a normal distribution with smaller standard deviation is said to exhibit a greater degree of central tendency. As for the Bernoulli distribution, please find a source. Fgnievinski (talk) 06:54, 2 May 2013 (UTC)
No, I don't have anything like a source for this definition and the Bernoulli distribution. We don't have any sources for a practical use of it, do we? I'm just trying to understand the definition. At least in previous iterations here it was implied that some distributions have a central tendency and others don't. The Bernoulli distribution is the most extreme example I can think of. Surely we aren't meant to say that a normal distribution has a central tendency if it has standard deviation 1 but not if it has standard deviation 20. It seems to me that you were trying to make sense of a bad definition by paraphrasing it and I was trying to cut it down to the minimum that makes sense. Though I agree there are problems with it being circular. But I'd blame the person who wrote it rather than me. I guess I would be OK with including a full exact quote from one of the sources, though it's a bit WP:UNDUE since no-one other than lexicographers seem to use this meaning. Dingo1729 (talk) 14:08, 2 May 2013 (UTC)
I read the three-sentence definition like this: A tendency is something, well, tending toward someplace more or less strongly. A tendency can be measured by where it is tending toward and how strongly it is tending toward there. The implication is that, for example, a low-variance normal has a strong tendency to have values near a particular location, while a high-variance normal has a weak tendency to have values near a particular location. Duoduoduo (talk) 16:55, 2 May 2013 (UTC)
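
The two readings debated in this thread can be compared numerically. The following is a rough sketch only: it uses the standard-library `math.erf`, takes the standard deviations 1 and 20 mentioned earlier in the thread, and the helper name `prob_within` is a hypothetical chosen for illustration.

```python
from math import erf, sqrt

def prob_within(distance, sigma):
    """P(|X - mu| <= distance) for X ~ Normal(mu, sigma**2)."""
    return erf(distance / (sigma * sqrt(2)))

# Fixed window of +/-1 unit around the mean: the low-variance normal
# concentrates far more probability there (about 0.68 versus about 0.04).
narrow = prob_within(1.0, sigma=1.0)
wide = prob_within(1.0, sigma=20.0)

# Window of +/-1 standard deviation: the probability is the same for
# both, which is the scale-invariance objection raised in the thread.
narrow_scaled = prob_within(1.0 * 1.0, sigma=1.0)
wide_scaled = prob_within(1.0 * 20.0, sigma=20.0)

print(narrow, wide, narrow_scaled, wide_scaled)
```

Measured on a fixed scale, the low-variance normal shows the stronger "tendency"; measured in units of its own standard deviation, the two are indistinguishable. Which comparison the dictionary definition intends is exactly what the thread leaves open.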
Searching Google Scholar, I found examples of "central tendency" being used to mean the location of the central tendency -- i.e., the location measure of the central tendency without using the word "measure": "We assume that the instantaneous riskless rate reverts toward a central tendency which....". "It helps us identify the central tendency of the distribution (called the expected value) and the amount of variability or spread of the distribution (called the variance)." "Many accounts of categorization equate goodness-of-example with central tendency for common taxonomic categories." "...rather than inferring such relationships from models based on conditional central tendency." "...examines the effects of comparative context on central tendency and variability judgements of groups,...." "The Kerala model: its central tendency and the outlier." One paper defines central tendency as the "degree to which units of a distribution tend to cluster around a given point" but then ignores the degree in favor of the location. Basically it's very hard to find any papers that actually define it as "the degree to which" and then actually stick with that definition. Duoduoduo (talk) 17:20, 2 May 2013 (UTC)
Earlier Melcombe was arguing that "Central tendency" is a property of a distribution: "the first consideration here is 'does this population have a central tendency'". He interpreted the definition to mean that, for example, a bimodal or multimodal distribution doesn't have a central tendency. I think that all this is implied though not stated by the first sentence. On the other hand, Fgnievinski and Duoduoduo seem to be saying that we can't say anything about a single distribution, we can only compare two distributions (which must be on the same scale) and say that one has more of a central tendency than the other. I agree that this is implied though not stated by the third sentence. Overall I come down on Melcombe's side, though I think either meaning is possible. Which is precisely why I think it's a bad definition. So what should we write in the article? I wrote what I did to try to avoid getting any deeper into the interpretation, but I'm hearing disagreement. Is it important or irrelevant that a distribution is bimodal when talking about a central tendency? Dingo1729 (talk) 17:54, 2 May 2013 (UTC)
I don't see the conflict of interpretations. Melcombe also said from his sources that the standard deviation inversely measures the strength of the central tendency (he had a whole subsection on measures of dispersion), which my interpretation agreed with.
In any event, I think the current text is fine in this regard. In the three-sentence definition/description above on this talk page, I think only the first sentence is the definition; the second and third sentences say how to measure the concept defined in the first sentence. Duoduoduo (talk) 18:15, 2 May 2013 (UTC)
I agree with Duoduoduo's paraphrasing about the distinction between weak vs. strong centrality. I also concede to Dingo1729's point that the third sentence only implies this point. Yet I disagree that a direct quotation of the first sentence suffices -- the meaning is not trivial so it has to be spelled out, as the discussion here attests. Melcombe's interpretation, that centrality is an all-or-nothing property, is not supported by the sources. What is sourced is that wide-sense centrality can always be defined because mean and standard deviation can always be calculated, although e.g. a bimodal distribution will exhibit a poor degree of centrality. Fgnievinski (talk) 05:38, 3 May 2013 (UTC)
I think I was wrong to write that Melcombe was saying that it's an all-or-nothing property. Of course there would be intermediate cases.
I've found a couple of sources [12] [13] which I think support the view that what matters is the shape of the distribution rather than the standard deviation. They come from this search [14]. There may be some other sources in there which emphasize standard deviation rather than shape but I haven't found them yet. Dingo1729 (talk) 02:29, 4 May 2013 (UTC)
This would be a third definition. I'll stick with the first two definitions -- narrow and wide. Since these seem to have been agreed upon, I've restored the original edit in which the latter had been introduced. Fgnievinski (talk) 07:12, 6 May 2013 (UTC)
I'm not sure why you say that this has been agreed upon. By my reading the two new sources are consistent with the sourced definitions. Duoduoduo's opinion was that "I think only the first sentence is the definition", which I took to mean that that is what we should use as the definition in the article. Perhaps he could help us here; I don't want to put words in his mouth. The "narrow" and "wide" terminology is just what we're using here, not a sourced use, is it? Dingo1729 (talk) 13:44, 6 May 2013 (UTC)
I think I may have mis-represented one of the sources above [15]. Reading it more carefully, it may actually support the use of dispersion rather than shape. But the source may not be useable because the author seems not to know the meaning of kurtosis. Dingo1729 (talk) 17:34, 6 May 2013 (UTC)

More on two meanings

The dictionary entry under discussion is

The tendency of quantitative data to cluster around some random variable value. The position of the central value is usually determined by one of the measures of location such as the mean, median or mode. The closeness with which values cluster round the central value is measured by one of the measures of dispersion such as the mean deviation or standard deviation.

One proposed wording for the article's lede is

Some statistical dictionaries define central tendency (or centrality), to mean "The tendency of quantitative data to cluster around some central value."

The other proposed wording is

Some statistical dictionaries adopt a wider definition of central tendency (or centrality), to include not only the central location but also the degree of centrality, i.e., the statistical dispersion.

Both of these seem to me to be accurate renditions of what the source says. I suggest this compromise wording:

Some statistical dictionaries define central tendency (or centrality), to mean "the tendency of quantitative data to cluster around some central value," with this tendency measured by both the central location and the degree of centrality, i.e., the statistical dispersion.

Duoduoduo (talk) 15:55, 6 May 2013 (UTC)

I think I would be OK with that, though it doesn't cover the two new sources with their meaning of shape rather than dispersion. At the risk of muddying the waters, would moving the sentence into a separate paragraph work?

Secondary Meaning

Occasionally authors use central tendency (or centrality), to mean "the tendency of quantitative data to cluster around some central value,". [1][2] This meaning might be expected from the usual dictionary definitions of the words tendency and centrality. The authors may judge whether data has a strong or a weak central tendency based on either the dispersion or the shape of the distribution. Dispersion would be measured by the standard deviation or something similar. The shape would be judged on whether it is unimodal as opposed to bimodal, multimodal or scattered.

Dingo1729 (talk) 17:27, 6 May 2013 (UTC)

Since the shape idea was retracted, let me modify the above as
Occasionally authors use central tendency (or centrality), to mean "the tendency of quantitative data to cluster around some central value,". [1][2] This meaning might be expected from the usual dictionary definitions of the words tendency and centrality. Those authors may judge whether data has a strong or a weak central tendency based on the statistical dispersion, as measured by the standard deviation or something similar.
Here I (1) removed mention of shape, (2) put a wikilink onto statistical dispersion, and (3) removed the bolding of "strong" and "weak" because they're not part of the title term. Duoduoduo (talk) 17:53, 6 May 2013 (UTC)
Agreed. Please go ahead. Thanks. Fgnievinski (talk) 18:11, 6 May 2013 (UTC)
Yes, that looks good. On a minor point, I only retracted one of the two references for "shape". The other [16] (Foundations and applications of statistics: an introduction using R, p.9) still looks solid for the "shape" meaning. But I'm not going to push that unless I find some more sources. Thanks, Dingo1729 (talk) 01:42, 7 May 2013 (UTC)
Thank you everyone for the civility exhibited here despite the disagreements. Fgnievinski (talk) 04:36, 7 May 2013 (UTC)

Start-class

This article seems to have moved beyond being a stub; I'm re-rating it as start-class, if no one objects. -Bryanrutherford0 (talk) 20:19, 22 July 2013 (UTC)

Quadratic Mean

An IP has added the quadratic mean (rms) to the list. Is it ever really used as a measure of central tendency? The biggest problem comes when there are negative values, so then it isn't anywhere near the center of the distribution. I'm somewhat inclined to revert, but I thought I'd ask for opinions first. I've mentioned this on the IP's talk page. Dingo1729 (talk) 16:12, 9 February 2014 (UTC)
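
The concern about negative values can be illustrated with a short sketch. It assumes the usual definition of the quadratic mean (square root of the mean of the squares); the function name and the sample data are hypothetical, chosen only to make the point.

```python
from math import sqrt
from statistics import mean

def quadratic_mean(values):
    """Root mean square: square root of the mean of the squared values."""
    return sqrt(mean([x * x for x in values]))

# Hypothetical sample in which every value is negative.
data = [-4, -2]

arithmetic = mean(data)     # lies inside the range of the data
rms = quadratic_mean(data)  # non-negative by construction

print(arithmetic, rms, max(data))
```

Because squaring discards the sign, the result here is positive and falls outside the range of the data entirely, so it cannot be read as a central value for samples of this kind.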

No one else seems interested so I re-wrote that entry. Dingo1729 (talk) 05:13, 16 February 2014 (UTC)