# Talk:Mode (statistics)

WikiProject Mathematics (Rated B-class, Mid-importance). Field: Probability and statistics. One of the 500 most frequently viewed mathematics articles.
WikiProject Statistics (Rated B-class, Top-importance).

## If the highest frequency is one...

Question: Let's say the set is 1,2,3,4.

Is the mode all of them, since the highest frequency is one, or is there no mode whatsoever?

I brought this to question since the definition uses "frequency," and it would appear that they are all equally frequent and thus all modes? I'm not sure of the official accepted interpretation.

The mode cannot be calculated in such a varied set of data and thus should not be used. - Ferret 03:27, 23 March 2006 (UTC)
This was a multiple choice question on a standardized test, so would I say 4 modes or no modes?--Dch111 21:02, 16 May 2006 (UTC)
No modes, according to my programming teacher who worked for NASA. —The preceding unsigned comment was added by 75.8.96.7 (talk) 04:42, 11 December 2006 (UTC).
I got a problem from my math class which implies there is a mode for a set of 5 numbers. (It's a hard problem to explain so I won't write it out.) Lophoole 01:00, 26 January 2007 (UTC)Lophoole
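
Software has to pick a convention here too. A minimal sketch, assuming Python 3.8 or later, where the standard `statistics.multimode` returns every value tied for the highest frequency; the wrapper name `modes_of` is only illustrative:

```python
from statistics import multimode

def modes_of(sample):
    """Return all values tied for the highest frequency, in first-seen order."""
    return multimode(sample)

print(modes_of([1, 2, 3, 4]))      # every value occurs once, so all four come back
print(modes_of([1, 2, 2, 3, 4]))   # only 2 reaches the highest frequency
```

Under this convention the set 1,2,3,4 has four modes. A test that expects "no mode" is using a different convention, so the code does not settle which answer is officially accepted.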

## Commute with

Do we really need to say commute with when discussing the linearity of mean, mode, and median? It does mean what we want to say, but it will convey it only to readers who don't need this article. Septentrionalis 21:23, 5 May 2006 (UTC)

In both places where it is used, it is immediately explained. Doesn't that work? I can't judge, because I don't need this article. LambiamTalk 01:28, 6 May 2006 (UTC)
That's the problem. Nobody who is editing it does. (And linear or independent might be just as bad.) Septentrionalis 01:51, 6 May 2006 (UTC)
I have reverted the addition of a wikilink to Commutative operation because it does not really cover the concept meant here. Commutative diagram is more to the point, but even I :) can see that that is a bridge too far. LambiamTalk 02:13, 6 May 2006 (UTC)
Well it would be nice if commutative operation really brought out the idea of commutative diagrams; it is in there, it's just not going to be obvious to the audience here.... Septentrionalis 03:25, 6 May 2006 (UTC)
The way I see it is as follows. For "commutative" in the sense of commutative diagram, reducing everything to the simplest case, operations f and g commute if:
f ∘ g = g ∘ f.
There is no universal quantification here, and the commuting operations are a pair of unary operations (although generalizations are possible and usual).
For "commutative" in the sense of commutative operation, operation ⊕ is commutative if
x ⊕ y = y ⊕ x for all x and y in the domain.
Here the universal quantification is essential, and there is one operation which is binary.
I expect that attempts to unify the two notions will produce strained and non-natural results. LambiamTalk 18:51, 6 May 2006 (UTC)
Rather than try to explain the usage, I kept the idea. See what you think; I wouldn't put Commutative diagram under the See also, but go ahead if you like. Perhaps a general article on Commutativity? Septentrionalis 01:58, 7 May 2006 (UTC)

## Sample from a continuous distribution

I don't understand why in "a sample from a continuous distribution [...] each value will occur precisely once." I'm not particularly familiar with statistics (I didn't know what a 'mode' was before reading this article), so this may just reflect confusion on my part, but I don't see why a sample from a continuous distribution will contain every element only once (although it seems to me that it would do so with probability 1, i.e., almost certainly). Could someone explain? Benja 13:23, 15 May 2006 (UTC)

Indeed, more precisely formulated, the probability of any duplicate elements is 0. So if you were to repeat this for the rest of the life of the universe, you'd never expect a duplicate. Isn't that good enough for "will occur precisely once"? More strongly, though, the values have to be represented somehow, and one imaginable way is a random-value producing apparatus that keeps producing a stream of digits making up the decimal expansion of the values for the random variables. So imagine that after several eons of steady production we have that one stream is "0.7785[seventeen umptyzillion and four digits omitted]459(to be continued)" and so is another. Are the two values duplicates of each other? Perhaps, but most likely not. They may diverge at every next digit. But even if they are, we'll never know, since the streams will never be complete. So actually, the probability of any duplicate elements is 0, and, moreover, even if duplicacy (is that an English word?) occurs, it remains forever unknowable. Might as well say: can't happen. --LambiamTalk 13:50, 15 May 2006 (UTC)
Huh. Seems like I understood alright, but you guys have a seriously different way of looking at these things. :-) (I was assuming that we were using the language of theory, since continuous distributions are a theoretical/modelling tool rather than something you'd deal with in practice.) Although, if we look at this from a practical point of view, wouldn't it be more helpful to talk about the practical situation where you get, say, temperature readings with +/- .01 degrees accuracy and it's quite possible, but unlikely, that you'll get the same reading twice, rendering the mode meaningless and requiring the technique described? -- In any case, thanks for explaining! Benja 14:30, 15 May 2006 (UTC)
"You guys"? I'm just one editor with one viewpoint (who happened to have written the relevant paragraph). The above was all very theoretical and only a thought experiment; actual random-value producing apparatus doesn't last that long. For practical application, the article tells you to discretize the data, as for a histogram, so that you will actually get (still in practice) frequencies greater than 1. --LambiamTalk

Observe the following data: 1,3,4,5,7,9,12,13,15,15,15,15,15,16,17,19,22,24,24,25,25,26,26,27,30,30,31,32,32,33,33,34,35,35,36,36,37,,39,40,40,45,45,46,47,48,49,49. Class intervals and frequencies: 0-10: 6; 10-20: 10; 20-30: 8; 30-40: 14; 40-50: 9. According to the definition of the mode, 30-40 is the modal class, and the mode will lie in that interval. But if we look at the raw data, then 15 is the mode. How do we resolve this contradiction? Reeded 15:11, 28 March 2007 (UTC)reeded

If the population distribution is continuous, the chance of observing two or more identical values is 0. If you discretize the data, then the result may (of course) depend on the details. If you have a distribution whose density function has a thin spike, it gets diluted when the intervals are much wider. The sample data show a thin spike at 15 and a robust mound around 35. The choice of intervals does not actually conform to the advice given in the article, but it is also possible to construct an example that does conform but still displays this phenomenon.  --LambiamTalk 23:01, 28 March 2007 (UTC)
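
The two answers can be reproduced side by side. This is only a sketch under stated assumptions: the doubled comma in the quoted list is treated as a typo rather than a missing value (which agrees with the stated frequency of 14 for 30-40), the classes are taken as half-open intervals [a, b), and the function names are made up for illustration.

```python
from collections import Counter

# Sample quoted in the question above (47 values, doubled comma ignored).
DATA = [1, 3, 4, 5, 7, 9, 12, 13, 15, 15, 15, 15, 15, 16, 17, 19,
        22, 24, 24, 25, 25, 26, 26, 27, 30, 30, 31, 32, 32, 33, 33,
        34, 35, 35, 36, 36, 37, 39, 40, 40, 45, 45, 46, 47, 48, 49, 49]

def raw_mode(values):
    """Most frequent raw value (first one found if there is a tie)."""
    return Counter(values).most_common(1)[0][0]

def modal_class(values, width):
    """Lower edge of the most populated interval [k*width, (k+1)*width)."""
    bins = Counter((v // width) * width for v in values)
    return bins.most_common(1)[0][0]

print(raw_mode(DATA))          # 15
print(modal_class(DATA, 10))   # 30, i.e. the class 30-40
```

Both results are correct for what they measure: the raw mode picks up the thin spike at 15, while the class width of 10 dilutes it and favours the mound around 35.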

Thanks for that! Another hiccup: what happens if the data is given as discontinuous data? Do we make the intervals continuous, and if so, why? 59.180.85.146 06:30, 8 April 2007 (UTC)reeded13

Do you mean contiguous intervals? And isn't the data (if finite in size) always "discontinuous"? If the data comes from a discrete probability distribution, such as the Poisson distribution, then usually you don't use intervals but just tally the values: 3× a 0, 3× a 1, 4× a 2, 9× a 3, 5× a 4, 1× a 5, 2× a 6. If the frequencies are very low, you could lump groups of adjacent values together, making sure the groups have equal sizes. If the underlying distribution is continuous, like for the gamma distribution, then you must use intervals, and these intervals should partition the sample space into equi-sized parts. These intervals must be contiguous: together they must cover the whole space without gaps.  --LambiamTalk 12:27, 8 April 2007 (UTC)
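
A small sketch of the tallying and lumping described above, using the example tally from the comment. `lump` is a hypothetical helper, and it assumes the data values are non-negative integers.

```python
from collections import Counter

# Tally from the comment above: 3 zeros, 3 ones, 4 twos, 9 threes, ...
TALLY = Counter({0: 3, 1: 3, 2: 4, 3: 9, 4: 5, 5: 1, 6: 2})

def lump(tally, size):
    """Merge adjacent integer values into groups of `size` values each."""
    groups = Counter()
    for value, count in tally.items():
        groups[(value // size) * size] += count
    return groups

print(TALLY.most_common(1))            # the value 3, with frequency 9
print(sorted(lump(TALLY, 2).items()))  # groups {0,1}, {2,3}, {4,5}, {6,7}
```

Each group is keyed by its smallest member, so the groups are contiguous and of equal size, as the comment requires.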

Thanks for that! 59.180.55.151 15:32, 11 April 2007 (UTC)reeded

## I have a similar question to the first one above

My question is I have a set of 10 numbers with 44 and 93 appearing twice. Is the mode "44 and 93" or is it the average of the two, as calculated with "median" involving even numbers?

Thanks! LandOfIsrael 18:26, 15 August 2006 (UTC)

The article gives the example of a data sample [1, 1, 2, 4, 4] and states: "the mode is not unique". That is about all that can be said about it. Depending on your needs, predilections, and local customs, you can pick your choice between: (a) for this sample the mode is undefined; (b) this sample actually has two modes: 1 and 4; and (c) the mode is indeterminate (whatever that means). I would not take the average except when you have a histogram peaking in two adjacent slots, because that could be severely misleading. Now here is a question: what is the mode of an empty data sample? --LambiamTalk 01:22, 16 August 2006 (UTC)
Thank you so much!!! I appreciate your taking the time to help me. I didn't see that example in the article even though it was staring me in the face. Oops! :) In this case I think I'll write that the mode is "44 and 93" and see how that works. Again, thank you!!! (Thanks as well to whoever moved my question to the bottom -- I wasn't aware of the wiki custom of oldest on the top/newest at the bottom!) LandOfIsrael 10:52, 16 August 2006 (UTC)
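
For anyone who wants to see conventions (a) and (b) from the reply above next to each other, here is a hedged sketch; the function names are invented for illustration, and returning `None` for "undefined" is just one possible choice.

```python
from collections import Counter

def all_modes(sample):
    """Convention (b): every value tied for the highest frequency."""
    counts = Counter(sample)
    if not counts:
        return []                      # an empty sample has no candidates
    top = max(counts.values())
    return sorted(v for v, c in counts.items() if c == top)

def unique_mode(sample):
    """Convention (a): the mode, or None when it is not unique or undefined."""
    modes = all_modes(sample)
    return modes[0] if len(modes) == 1 else None

print(all_modes([1, 1, 2, 4, 4]))    # both 1 and 4
print(unique_mode([1, 1, 2, 4, 4]))  # None under convention (a)
print(all_modes([]))                 # empty list for the empty sample
```

Neither function averages tied modes, in line with the caution above that an average of two distant modes could be severely misleading.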

## Computing the mode

Hi. I would like to see a section on how to effectively compute the mode of a list of numbers. That would be useful. --Spoon! 02:03, 9 February 2007 (UTC)

Compute the mode by three successive passes through the data list thus:

• Pass 1: Count the number of different sample values

Start a separate count variable = 0 for each sample value and do

• Pass 2: Increment the appropriate count for each sample

Start a frequency variable = 0 and do

• Pass 3: For each value, if its number of occurrences is greater than the frequency variable, then update the frequency variable and remember which value did it.

At the end of Pass 3 the last remembered (i.e. stored) value is the mode. (With a little more thought, Pass 1 and 2 can be achieved in a single pass.) Cuddlyable3 07:48, 8 May 2007 (UTC)
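
The passes above can be sketched as follows, with Passes 1 and 2 merged as suggested. This assumes a language with a built-in associative array (a Python `dict` here); the function name is only illustrative.

```python
def mode_by_counting(data):
    """Return (mode, frequency); among tied values the first one seen wins."""
    counts = {}
    for x in data:                       # Passes 1 and 2 merged
        counts[x] = counts.get(x, 0) + 1
    best_value, best_count = None, 0
    for value, count in counts.items():  # Pass 3
        if count > best_count:
            best_value, best_count = value, count
    return best_value, best_count

print(mode_by_counting([1, 3, 3, 7]))
```

Because the comparison in Pass 3 is strict, a tie is resolved silently in favour of the earlier value; a caller who cares about multiple modes would need to collect all values whose count equals `best_count`.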

This requires that you maintain a data structure for counting, for each possible sample value, its frequency. In general, that has to be an associative array (lookup table). For example, if the data is something like {2404030, 712, 20574, 2404030, 657032180, 20574, 712, 2404030, 712, 20574, 2404030, 657032180, 657032180, 2404030, 31710510, 657032180, 31710510, 20574, 712, 712}, you don't want to allocate a normal array for this (and you cannot know the required size in advance). Unless you have a set of library functions for this, that is not an easy programming exercise. If all values are different, the size grows to be as large as the data sample. It is almost always easier and more efficient to sort the values, after which you can find the mode in a single pass. Here is a way of doing that, presented in pseudocode:
    (maxCnt, loMode, hiMode) := (0, NAN, NAN);

    (cnt, last) := (0, NAN);
    for x ← sorted(data):
        if x ≠ last:
            if cnt = 0:
                skip;
            else if cnt > maxCnt:
                (maxCnt, loMode, hiMode) := (cnt, last, last);
            else if cnt = maxCnt:
                if last < loMode:
                    loMode := last;
                end if
                if last > hiMode:
                    hiMode := last;
                end if
            else:
                skip;
            end if
            (cnt, last) := (1, x);
        else:
            cnt := cnt+1;
        end if
    end for

    if cnt = 0:
        skip;
    else if cnt > maxCnt:
        (maxCnt, loMode, hiMode) := (cnt, last, last);
    else if cnt = maxCnt:
        if last < loMode:
            loMode := last;
        end if
        if last > hiMode:
            hiMode := last;
        end if
    else:
        skip;
    end if

The constant NAN is supposed to be a value that is different from all possible data values (and always compares as different). At the end, the variable maxCnt contains the highest frequency observed. The result is delivered in a pair of program variables: loMode for the lowest data value having maxCnt occurrences, and hiMode for the highest data value having maxCnt occurrences. If the data is unimodal, these two will be the same. If it is possible to append an "infinite" value to the data (or any value exceeding all original data values), the last block can (and should) be omitted.
A problem with including this algorithm in the text of the article is that this is "original research".  --LambiamTalk 15:38, 8 May 2007 (UTC)
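
For readers who want something runnable, here is one possible Python rendering of the sort-then-scan idea. It is a sketch, not a transcription: `None` stands in for NAN, the duplicated end-of-run logic is folded into a hypothetical helper `close_run`, and the tie branch only moves the upper bound because runs arrive in ascending order.

```python
def mode_range_sorted(data):
    """Return (max_count, lo_mode, hi_mode) using one pass over sorted data.

    lo_mode and hi_mode are the smallest and largest values that occur
    max_count times; they coincide when the mode is unique.
    """
    max_count, lo_mode, hi_mode = 0, None, None
    count, last = 0, None

    def close_run():
        nonlocal max_count, lo_mode, hi_mode
        if count > max_count:
            max_count, lo_mode, hi_mode = count, last, last
        elif count == max_count and count > 0:
            hi_mode = last   # runs arrive in ascending order, so only hi moves

    for x in sorted(data):
        if count > 0 and x == last:
            count += 1
        else:
            close_run()
            count, last = 1, x
    close_run()              # flush the final run
    return max_count, lo_mode, hi_mode

sample = [2404030, 712, 20574, 2404030, 657032180, 20574, 712, 2404030,
          712, 20574, 2404030, 657032180, 657032180, 2404030, 31710510,
          657032180, 31710510, 20574, 712, 712]
print(mode_range_sorted(sample))
```

On the example data from the comment above, 712 and 2404030 each occur five times, so the two returned modes differ and the sample is not unimodal.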
Before computing the mode of a series of values it is probably wise to see if the distribution is unimodal. The dip test can be used for this purpose. If it is not unimodal this may complicate the programming somewhat. Determining the number of modes in an arbitrary list of numbers if it is not unimodal is difficult. It may be impossible but I am not sure if this is a known fact. DrMicro (talk) 13:35, 7 July 2012 (UTC)

## Infinite mode

Can someone give an example of "Furthermore, like the mean, the mode of a probability distribution can be (plus or minus) infinity"? --Rumping 00:07, 7 September 2007 (UTC)

I've replaced the sentence by one that seems more true.  --Lambiam 12:28, 8 September 2007 (UTC)
Thought not. Thanks --Rumping 11:56, 10 September 2007 (UTC)
Are you trying to say that the distribution associated with the Cantor function does have a defined mode?  --Lambiam 14:29, 10 September 2007 (UTC)
Not at all - just that I originally doubted whether there was a distribution on the reals with an infinite mode.
As to your question, it is an interesting point; I think it might be possible to say that the modes of the Cantor distribution are the members of the Cantor set. See what you think of the following singular distribution: imagine a random variable X on [0,1] with p(x)=2x. X obviously has a mode at x=1. Then define a random variable Y determined by X by (a) writing X as a binary fraction [with 1 as 0.1111...], (b) changing all the '1' digits to '2' digits, and (c) reading the result as a ternary fraction. This is a monotonic transformation so the mode of Y should be at 0.2222...(base 3), i.e. at 1. I have not convinced myself and I won't press the point. --Rumping 16:10, 10 September 2007 (UTC)
It is indeed an interesting point, and it crossed my mind that the Cantor set could be said to be the mode. On the other hand, the somewhat loose definition given for mode ("the most frequent value assumed by a random variable") is not designed for such pathological cases, and it would be stressing its applicability beyond its scope. I see that the anonymous editor who set up the "Probability distribution" box at Cantor distribution agreed, giving "Mode: n/a". For the Rumping distribution, the answer 1 seems rather reasonable; in an experimental process, the experimenter would definitely be led to hypothesize this, and in the limit, using appropriately finer and finer discretizations as the number of trials grows, the modes of the discretized distributions will tend to 1.  --Lambiam 23:02, 10 September 2007 (UTC)
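
The digit transformation in steps (a) to (c) above can be tried out numerically. This is a rough sketch with assumptions: it truncates the binary expansion to a fixed number of digits, treats 1 as 0.1111... by a special case, and the function name is invented.

```python
def binary_to_ternary(x, digits=40):
    """Map x in [0, 1] by reading its binary digits, with 1 replaced by 2, in base 3."""
    if x >= 1.0:
        return 1.0                 # 0.1111... (base 2) becomes 0.2222... (base 3) = 1
    y, scale = 0.0, 1.0 / 3.0
    for _ in range(digits):
        x *= 2
        bit = int(x)               # next binary digit, 0 or 1
        x -= bit
        y += 2 * bit * scale       # digit 1 becomes ternary digit 2
        scale /= 3
    return y

grid = [i / 64 for i in range(65)]
images = [binary_to_ternary(x) for x in grid]
print(images == sorted(images))    # checks that the map is non-decreasing on this grid
```

The check only covers a finite grid, so it illustrates the monotonicity claim rather than proving it, and it says nothing by itself about where the mode of Y should be placed.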

## Comparison of mean, median and mode: move to "average"?

Does anyone else think this section would fit better in average ? I propose moving it there with a link to it from here. Let me know what you think. Qwfp (talk) 14:50, 22 February 2008 (UTC)

In my opinion, Average should really be rewritten to cover mainly the three well-known measures of central tendency, arithmetic mean, median and mode (with "See also"s for Running average aka Moving average and Weighted average aka Weighted mean), and then this section can be moved to the rewritten article Average. The current content of Average – inasmuch as it is not OR – should be merged into Mean. Unfortunately, we did not reach consensus on this over at Talk:Average, where it was even argued that mentioning the concept of statistical mode might harm laypeople who would not know how to use it in a safe way.  --Lambiam 11:13, 23 February 2008 (UTC)

I think it belongs in a section of a central tendency article. Central tendency currently redirects to average, which I think is a mistake because average is just one example of a measure of central tendency. Comparison logically belongs there. Jason Quinn (talk) 01:34, 21 July 2008 (UTC)

Also, I should mention that anybody searching for "central tendency" is likely a technical person not interested in most of the average article. They are interested in what all the different measures are to apply in some application they are building. I will in the next week or so break the redirect and start a new article if nobody persuades me otherwise. Jason Quinn (talk) 01:38, 21 July 2008 (UTC)
I don't like the current average article at all. I think it should focus much more on just the arithmetic mean and the mean and central tendency pages should discuss the rest. Jason Quinn (talk) 01:42, 21 July 2008 (UTC)
The article Central tendency redirects to Average because it was merged with that article. As used in statistics, average and measure of central tendency are synonyms. The mean, the median and the mode are all different kinds of averages. As I wrote on the talk page of Average:[1]
Just do a Google search on ["measures of central tendency"]. The first hit: "This section defines the three most common measures of central tendency: the mean, the median, and the mode."[2] The next: "Measures of central tendency—mean, median, and mode—can help you capture, with a single number, what is typical of the data."[3] And so on. The search term ["measures of average"] gives similar results:  --Lambiam 06:26, 27 November 2007 (UTC)
P.S. And here is a quote from the intro paragraph of our own article Mean: "It is sometimes stated that the 'mean' means average. This is incorrect if "mean" is taken in the specific sense of "arithmetic mean" as there are different types of averages: the mean, median, and mode. For instance, average house prices almost always use the median value for the average."  --Lambiam 06:31, 27 November 2007 (UTC)
It is fine if arithmetic mean gets more attention than other kinds of mean and than median and mode, because it is the most common kind of average, but it would be a mistake to leave the others out. But since each already has an article, the treatment in Average can be relatively brief.  --Lambiam 21:24, 26 July 2008 (UTC)

On a more practical bent, elementary school texts are discussing the three concepts, "mode," "median," and "mean" together without even mentioning the "average" concept. What drew me to the mode article (and to the section on mode, mean, and median) was a basic exercise from my daughter's sixth grade math text that elicits answers regarding these three particular concepts. The exercise in her homework does not discuss the concept of "average" or use "mean" interchangeably with the former term. Additionally, from a layman's perspective, the concepts of "mean" or "average", "mode," and "median" are distinct concepts, not just different flavors of "average," and the most common use of "average" does not encompass either "mode" or "median". Otherwise, there would be confusion as to what, for example, the "average age" of a given population group was (i.e., is it the "mode," "median," or "mean"?). A more in-depth treatment of the subtle differences between various "tendencies" is best addressed elsewhere; the topic's breadth is somewhat greater than what most people will be searching for. —Preceding unsigned comment added by 71.34.165.94 (talk) 02:44, 26 August 2008 (UTC)

## Better Data Set for the Comparison of common averages table

I find it unfortunate that this table uses a different data set for each example, and that the clarity and usefulness would be greatly improved if the same data set was used for each entry. I would like to edit it with the following data set: 1, 2, 2, 3, 4, 7, 9. This would give a mean of 4, a median of 3, and a mode of 2. --Pdcurry (talk) 23:31, 22 December 2009 (UTC)
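
The three figures claimed for the proposed data set are easy to confirm. A minimal check, assuming Python 3.8 or later for `statistics.multimode`:

```python
from statistics import mean, median, multimode

sample = [1, 2, 2, 3, 4, 7, 9]
summary = (mean(sample), median(sample), multimode(sample))
print(summary)   # mean 4, median 3, and a single mode 2
```

All three averages are distinct for this sample, which is what makes it a useful single data set for the comparison table.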

## Criticism of the mode

Why isn't there a criticism section??? —Preceding unsigned comment added by 129.98.192.217 (talk) 03:01, 25 October 2010 (UTC)

## Probably an error

In the Wikipedia article we read "For samples, if it is known that they are drawn from a symmetric distribution, the sample mean can be used as an estimate of the population mode." I propose to change to: "For samples, if it is known that they are drawn from a symmetric unimodal distribution, the sample mean can be used as an estimate of the population mode." Otherwise, it fails for example for values 1, 2, 2, 2, 3, 4, 5, 5, 5, 6, having mean 3.5 which is not a good estimate of the two modes 2 and 5. — Preceding unsigned comment added by 140.105.52.132 (talk) 22:22, 21 March 2012 (UTC)
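
The counterexample can be checked directly. A short sketch, again assuming Python 3.8 or later; the symmetry test reflects each value about 3.5, i.e. maps v to 7 - v.

```python
from statistics import mean, multimode

values = [1, 2, 2, 2, 3, 4, 5, 5, 5, 6]

sample_mean = mean(values)
sample_modes = multimode(values)
# The sample is symmetric about 3.5 if reflecting every value leaves it unchanged.
is_symmetric = sorted(values) == sorted(7 - v for v in values)

print(sample_mean, sample_modes, is_symmetric)
```

The sample is symmetric, yet its mean of 3.5 coincides with neither mode, which supports adding "unimodal" to the sentence.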

## Unimodal distributions

I texified the "Unimodal distributions" section under median, and found this similar section in this article. However, the two contradict each other. The median article claims that the difference between the mean and the median is bounded by $\sigma (3/5)^{1/2}$, whereas this article claims that it is $\sigma 3^{1/2}$.

Which one is correct?

Tebello TheWHAT!!?? 21:34, 18 March 2012 (UTC)

The correct statement is that the difference between the mean and the MEDIAN is at most $\sigma (3/5)^{1/2}$ and that the difference between the mean and the MODE is at most $\sigma 3^{1/2}$. This is what is written in the articles. DrMicro (talk) 13:31, 7 July 2012 (UTC)
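
As a sanity check, the two bounds discussed in this thread (read as upper bounds on the absolute differences) can be compared on one familiar unimodal distribution. The choice of the rate-1 exponential distribution is mine, not from the articles; its mean is 1, its median is ln 2, its mode is 0 and its standard deviation is 1.

```python
import math

sigma = 1.0                         # standard deviation of the rate-1 exponential
mean_, median_, mode_ = 1.0, math.log(2), 0.0

median_gap = abs(mean_ - median_)   # about 0.307
mode_gap = abs(mean_ - mode_)       # exactly 1
median_bound = sigma * math.sqrt(3 / 5)
mode_bound = sigma * math.sqrt(3)

print(median_gap <= median_bound, mode_gap <= mode_bound)
```

One example cannot prove either inequality, but it shows the two statements are about different quantities and need not contradict each other.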