# Clustering illusion

The clustering illusion is the tendency to erroneously perceive the "streaks" or "clusters" that inevitably arise in small samples from random distributions as significant, that is, as non-random. It is caused by a human tendency to underpredict the amount of variability that chance alone is likely to produce in a small sample of random or semi-random data.[1]

Thomas Gilovich found that most people thought that the sequence OXXXOXXXOXXOOOXOOXXOO[2] looked non-random when, in fact, it has several characteristics that are highly probable for a random stream: a nearly equal number of each result (10 O's and 11 X's, so $P(O) \approx P(X)$) and as many adjacent pairs that repeat the previous result as pairs that alternate (10 of each, so that, roughly, $P(O|X) \approx P(X|O) \approx P(X|X) \approx P(O|O)$). In sequences like this, people seem to expect more alternations than one would predict statistically. The probability of an alternation in a sequence of independent, equally likely binary events is 0.5, yet people seem to expect an alternation rate of about 0.7.[3][4][5] In fact, over a small number of trials, variability and non-random-looking "streaks" are quite probable.
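The properties described above can be checked directly. The following is a minimal sketch in Python, not a reconstruction of Gilovich's method: the helper names `alternation_rate`, `longest_run`, and `simulate_longest_runs` are hypothetical, and the simulation assumes independent, equally likely outcomes.

```python
import random

SEQ = "OXXXOXXXOXXOOOXOOXXOO"  # the sequence quoted in the text


def alternation_rate(seq):
    """Fraction of adjacent pairs whose two symbols differ."""
    pairs = list(zip(seq, seq[1:]))
    return sum(a != b for a, b in pairs) / len(pairs)


def longest_run(seq):
    """Length of the longest block of identical consecutive symbols."""
    if not seq:
        return 0
    best = current = 1
    for prev, cur in zip(seq, seq[1:]):
        current = current + 1 if cur == prev else 1
        best = max(best, current)
    return best


def simulate_longest_runs(length, trials, seed=0):
    """Longest-run lengths for `trials` simulated fair sequences (assumed model)."""
    rng = random.Random(seed)
    return [longest_run(rng.choices("OX", k=length)) for _ in range(trials)]


print(SEQ.count("O"), SEQ.count("X"))  # 10 11
print(alternation_rate(SEQ))           # 0.5
print(longest_run(SEQ))                # 3

runs = simulate_longest_runs(len(SEQ), trials=2000)
share = sum(r >= 3 for r in runs) / len(runs)
print(f"share of simulated sequences with a run of 3 or more: {share:.2f}")
```

Under these assumptions, a run of three or more identical outcomes appears in the large majority of simulated 21-symbol sequences, so the runs in the quoted sequence are what chance typically produces.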

Daniel Kahneman and Amos Tversky explained this kind of misprediction as being caused by the representativeness heuristic[4] (which they had themselves first proposed). Gilovich argues that a similar effect occurs for other types of random dispersion, including two-dimensional data, such as seeing clusters in the impact locations of V-1 flying bombs on London during World War II or seeing streaks in stock market price fluctuations over time.[1][4] Although Londoners developed specific theories about the pattern of impacts within London, a statistical analysis by R. D. Clarke, originally published in 1946, showed that the flying-bomb impacts on London were a close fit to the Poisson distribution, meaning that they closely resembled the expected result of a chance dispersion.[6][7][8][9][10] This analysis was a plot point in Thomas Pynchon's novel Gravity's Rainbow.
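To illustrate what "a close fit to the Poisson distribution" means for a chance dispersion, the sketch below scatters points uniformly at random over a grid of equal cells and compares the number of cells receiving 0, 1, 2, ... hits with the Poisson expectation. It uses simulated data only: the grid size and point count (`N_CELLS`, `N_POINTS`) are arbitrary illustrative values, not Clarke's figures, and the function names are hypothetical.

```python
import math
import random
from collections import Counter


def scatter_counts(n_points, n_cells, seed=0):
    """Drop n_points uniformly at random into n_cells.

    Returns a Counter mapping hits-per-cell -> number of cells with that many hits.
    """
    rng = random.Random(seed)
    hits_per_cell = Counter(rng.randrange(n_cells) for _ in range(n_points))
    tally = Counter(hits_per_cell.values())
    tally[0] = n_cells - len(hits_per_cell)  # cells that were never hit
    return tally


def poisson_expected(n_points, n_cells, k):
    """Expected number of cells with exactly k hits under a Poisson model."""
    lam = n_points / n_cells  # mean hits per cell
    return n_cells * math.exp(-lam) * lam**k / math.factorial(k)


N_POINTS, N_CELLS = 500, 400  # assumed values for illustration, not historical data
observed = scatter_counts(N_POINTS, N_CELLS)

print("hits  observed  expected")
for k in range(max(observed) + 1):
    print(f"{k:>4}  {observed[k]:>8}  {poisson_expected(N_POINTS, N_CELLS, k):>8.1f}")
```

Even though every point is placed independently, some cells receive several hits while many receive none. Such apparent clusters are what the Poisson model predicts, not evidence of targeting.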

The clustering illusion is central to the "hot hand fallacy", the first study of which was reported by Gilovich, Robert Vallone and Amos Tversky. They found that the idea that basketball players shoot successfully in "streaks", which sportscasters sometimes describe as having a "hot hand" and which was widely believed by Gilovich et al.'s subjects, was false. In the data they collected, the success of a previous shot, if anything, very slightly predicted a subsequent miss rather than another success.[5]
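The comparison behind this finding, the hit rate after a hit versus the hit rate after a miss, can be sketched as follows. This is an illustration on simulated data, not the analysis or data of Gilovich et al.: the function name `conditional_hit_rates` is hypothetical, and the shooter is assumed to make each shot independently with probability 0.5.

```python
import random


def conditional_hit_rates(shots):
    """Return (P(hit | previous hit), P(hit | previous miss)) for a list of booleans."""
    after_hit = [cur for prev, cur in zip(shots, shots[1:]) if prev]
    after_miss = [cur for prev, cur in zip(shots, shots[1:]) if not prev]

    def rate(outcomes):
        return sum(outcomes) / len(outcomes) if outcomes else float("nan")

    return rate(after_hit), rate(after_miss)


rng = random.Random(1)
# Hypothetical shooter: independent shots, each made with probability 0.5 (assumption).
shots = [rng.random() < 0.5 for _ in range(10_000)]

p_after_hit, p_after_miss = conditional_hit_rates(shots)
print(f"P(hit | previous hit)  = {p_after_hit:.3f}")
print(f"P(hit | previous miss) = {p_after_miss:.3f}")
```

For an independent shooter the two rates differ only by sampling noise, even though the simulated record still contains long runs of hits that can look like a "hot hand".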

A 2008 study by Jennifer Whitson and Adam Galinsky found that subjects were more likely to report meaningful clusters in semi-random pictures after they had been primed to feel out of control, or had been induced to reminisce about an experience in which they felt out of control.[10][11][12]

Using this cognitive bias in causal reasoning may result in the Texas sharpshooter fallacy. More general forms of erroneous pattern recognition are pareidolia and apophenia. Related biases include the illusion of control, to which the clustering illusion could contribute, and insensitivity to sample size, in which people fail to expect greater variation in smaller samples. A different cognitive bias involving misunderstanding of chance sequences is the gambler's fallacy.

