
Compact letter display

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by Sympa (talk | contribs) at 20:14, 3 September 2022 (Developing the content). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Compact letter display (CLD) is a statistical method for clarifying the output of multiple hypothesis testing conducted with ANOVA and Tukey's range test. CLD makes it easy to identify which variables, or factors, have statistically different means and which do not. It also ranks the variables by their means in descending order, and it identifies the groups of two or more variables whose means are not statistically different. The CLD methodology can be applied to both tabular data and visual data.

The basics of CLD

CLD identifies the variables that are statistically different vs. the ones that are not

Variables whose means are not statistically different from one another share the same letter. For example:

“a” “ab” “b”

The above indicates that the first variable “a” has a mean (or average) that is statistically different from that of the third one “b”. But the second variable “ab” has a mean that is not statistically different from that of either the first or the third variable. Let's look at another example:

“a” “ab” “bc” “c”

The above indicates that the first variable “a” has a mean that is statistically different from those of the third variable “bc” and the fourth one “c”. But this first variable “a” is not statistically different from the second one “ab”, because the two share the letter “a”.
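The reading rule used in these examples (two variables differ only when their labels have no letter in common) can be sketched in a few lines of Python. This is a minimal illustration, not part of any statistics package; the function name `means_differ` is hypothetical.

```python
def means_differ(letters_a: str, letters_b: str) -> bool:
    """Return True when the two letter labels have no letter in common."""
    return set(letters_a).isdisjoint(letters_b)


# Letter labels from the second example, in descending order of mean.
labels = ["a", "ab", "bc", "c"]

# Compare every pair of labels and report how the display should be read.
for i, first in enumerate(labels):
    for second in labels[i + 1:]:
        verdict = "different" if means_differ(first, second) else "not different"
        print(f"{first!r} vs {second!r}: {verdict}")
```

Running the loop lists all six pairwise comparisons encoded in the four labels, which shows how much information the letters compress.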

Given the 26 letters of the Roman alphabet, the CLD methodology can readily compare up to 26 different variables, or factors, each having a mean that is statistically different from all the others. This limit is far higher than the number of variables compared in the vast majority of multiple hypothesis tests conducted using ANOVA and Tukey's range test.

CLD ranks the variables in descending mean order

The variable with the highest mean is labeled “a” if it is statistically different from all the others; otherwise it may be labeled “ab”, etc. The variable with the lowest mean receives the letter that comes latest in the alphabet among the tested variables.
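One way to see how the letters and the ranking fit together is to derive a display from a set of means and a list of pairs already judged statistically different (for instance by Tukey's range test). The sketch below is a brute-force illustration under those assumptions, not the algorithm used by any particular software package. The function name `compact_letters`, the group names G1 to G4, and their means are all hypothetical. It finds every maximal set of groups that are mutually not different and gives each set one letter, starting from the highest mean.

```python
from itertools import combinations
from string import ascii_lowercase


def compact_letters(means, different_pairs):
    """Map each group name to its CLD letters, ranked by descending mean."""
    # Rank the group names from highest mean to lowest.
    order = sorted(means, key=means.get, reverse=True)
    differs = {frozenset(pair) for pair in different_pairs}

    def mutually_alike(groups):
        # True when no pair inside the set was judged different.
        return all(frozenset(p) not in differs for p in combinations(groups, 2))

    # Collect the maximal sets of mutually alike groups, largest first, so a
    # smaller set is kept only if no set found earlier already contains it.
    maximal_sets = []
    for size in range(len(order), 0, -1):
        for combo in combinations(order, size):
            candidate = set(combo)
            if mutually_alike(candidate) and not any(candidate <= s for s in maximal_sets):
                maximal_sets.append(candidate)

    if len(maximal_sets) > len(ascii_lowercase):
        raise ValueError("more letter groups than letters in the alphabet")

    # Letter "a" goes to the set that reaches highest in the ranking.
    maximal_sets.sort(key=lambda s: sorted(order.index(g) for g in s))

    letters = {group: "" for group in order}
    for letter, members in zip(ascii_lowercase, maximal_sets):
        for group in members:
            letters[group] += letter
    return letters


# Hypothetical inputs: four groups and the pairs judged different.
example_means = {"G1": 40.0, "G2": 35.0, "G3": 30.0, "G4": 25.0}
example_pairs = [("G1", "G3"), ("G1", "G4"), ("G2", "G4")]
print(compact_letters(example_means, example_pairs))
```

With these hypothetical inputs the result reproduces the “a” “ab” “bc” “c” pattern discussed earlier. Because the search enumerates every subset of groups, it is only practical for a small number of groups.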

A CLD example

We will test whether the average annual rainfall in five West Coast cities is statistically different. These cities are:

Eugene (OR)

Portland (OR)

San Francisco (CA)

Seattle (WA)

Spokane (WA)

The data is annual rainfall (1951–2021). The data source is NOAA.

First, we will improve the tabular data using CLD. Next, we will improve the visual data using CLD.

Improving tabular data with CLD


Improving visual data with CLD


The benefits of CLD