Dot plot (statistics)


A dot chart or dot plot is a statistical chart consisting of data points plotted on a fairly simple scale, typically using filled-in circles. There are two common, yet very different, versions of the dot chart. The first, described by Leland Wilkinson, has long been used in hand-drawn (pre-computer era) graphs to depict distributions.[1] The other, described by William Cleveland, is an alternative to the bar chart in which dots depict the quantitative values (e.g. counts) associated with categorical variables.[2]

Wilkinson dot plots

A dot plot, as described by Wilkinson, of 50 random values from 0 to 9.

The dot plot as a representation of a distribution consists of a group of data points plotted on a simple scale. Dot plots are used for continuous, quantitative, univariate data. Data points may be labelled if there are few of them.

Dot plots are one of the simplest statistical plots, and are suitable for small to moderately sized data sets. They are useful for highlighting clusters and gaps, as well as outliers. Another advantage is that they preserve the numerical information in the data. For larger data sets (around 20–30 or more data points) the related stemplot, box plot or histogram may be more efficient, as dot plots become cluttered beyond this point. Dot plots may be distinguished from histograms in that dots are not spaced uniformly along the horizontal axis.

Although the plot appears to be simple, its computation and the statistical theory underlying it are not simple. The algorithm for computing a dot plot is closely related to kernel density estimation. The size chosen for the dots affects the appearance of the plot. Choice of dot size is equivalent to choosing the bandwidth for a kernel density estimate.
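The role of dot size can be illustrated with a minimal Python sketch of a dot-density binning pass. This is a simplified illustration, not necessarily the exact procedure given by Wilkinson: the function name dot_density_stacks, the rule of starting a new stack once a value lies a full dot width from the first value of the current stack, and the placement of each stack at the midpoint of its values are all assumptions made for the example.

```python
def dot_density_stacks(values, dot_width):
    """Group values into stacks of dots, each spanning less than dot_width.

    Returns a list of (center, count) pairs, one per stack.
    Hypothetical helper for illustration only.
    """
    if dot_width <= 0:
        raise ValueError("dot_width must be positive")
    stacks = []
    group = []
    for v in sorted(values):
        # Start a new stack once v lies a full dot width from the stack's first value.
        if group and v - group[0] >= dot_width:
            # Assumed placement: midpoint of the smallest and largest value in the stack.
            stacks.append(((group[0] + group[-1]) / 2, len(group)))
            group = []
        group.append(v)
    if group:
        stacks.append(((group[0] + group[-1]) / 2, len(group)))
    return stacks


sample = [1.0, 1.2, 1.4, 3.0, 3.1, 5.0]
stacks = dot_density_stacks(sample, dot_width=0.5)   # wide dots: three stacks
narrow = dot_density_stacks(sample, dot_width=0.15)  # narrow dots: five stacks
```

With the wider dot the six sample values collapse into three stacks; with the narrower dot they spread into five, much as a smaller bandwidth gives a less smooth kernel density estimate.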

Cleveland dot plots

Dot plot may also refer to plots of points that each belong to one of several categories. They are an alternative to bar charts or pie charts, and look somewhat like a horizontal bar chart in which the bars are replaced by dots at the values associated with each category. Cleveland argues that, compared to (vertical) bar charts and pie charts, dot plots allow readers to interpret the graph more accurately by making the labels easier to read, reducing non-data ink (or graph clutter) and supporting table look-up.

In the R programming language this type of plot is also referred to as a stripchart[3] or stripplot.[4]
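The horizontal layout of a categorical dot plot can be sketched in plain text with a few lines of Python. This is only an illustration under stated assumptions: the function name cleveland_dot_rows, the sample counts, the sorting of categories by value, and the use of "." as a light guide line and "o" as the dot are all choices made for the example, and values are assumed to be non-negative.

```python
def cleveland_dot_rows(data, width=41):
    """Return one text row per category: label, guide line, and a dot at the value.

    Hypothetical helper for illustration; assumes non-negative values.
    """
    if not data:
        return []
    top = max(data.values())
    label_width = max(len(label) for label in data)
    rows = []
    # Sorting by value is a common choice that eases comparison between categories.
    for label, value in sorted(data.items(), key=lambda kv: kv[1]):
        # Scale the value to a column position between 0 and width - 1.
        pos = round(value / top * (width - 1)) if top > 0 else 0
        rows.append(f"{label:>{label_width}} " + "." * pos + "o")
    return rows


counts = {"A": 10, "B": 40, "C": 20}  # made-up example counts
rows = cleveland_dot_rows(counts)
print("\n".join(rows))
```

Each category gets one row with its label on the left, so labels read horizontally and the only mark carrying data is the dot itself.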

References

  1. ^ Wilkinson, Leland (1999). "Dot plots". The American Statistician (American Statistical Association) 53 (3): 276–281. doi:10.2307/2686111. JSTOR 2686111. 
  2. ^ Cleveland, William S. (1993). Visualizing Data. Hobart Press. ISBN 0-9634884-0-6. hdl:2027/mdp.39015026891187. 
  3. ^ Peter Dalgaard. Introductory Statistics with R. Springer. ISBN 0-387-95475-9. 
  4. ^ Paul Murrell (2005). R Graphics. Chapman & Hall/CRC. ISBN 1-58488-486-X. 

Other references

  • Wild, C. and Seber, G. (2000). Chance Encounters: A First Course in Data Analysis and Inference. John Wiley and Sons. ISBN 0-471-32936-3.
