Likert scale: Difference between revisions

Revision as of 05:19, 22 March 2013

A Likert scale (/ˈlɪkərt/^[1]) is a psychometric scale commonly involved in research that employs questionnaires. It is the most widely used approach to scaling responses in survey research, such that the term is often used interchangeably with rating scale, or more accurately the Likert-type scale, even though the two are not synonymous. The scale is named after its inventor, psychologist Rensis Likert.^[2] Likert distinguished between a scale proper, which emerges from collective responses to a set of items (usually eight or more), and the format in which responses are scored along a range. Technically speaking, a Likert scale refers only to the former. The difference between these two concepts has to do with the distinction Likert made between the underlying phenomenon being investigated and the means of capturing variation that points to the underlying phenomenon.^[3] When responding to a Likert questionnaire item, respondents specify their level of agreement or disagreement on a symmetric agree-disagree scale for a series of statements. Thus, the range captures the intensity of their feelings for a given item.^[4] A scale can be created as the simple sum of questionnaire responses over the full range of the scale. In so doing, Likert scaling assumes that distances on each item are equal. Importantly, "All items are assumed to be replications of each other or in other words items are considered to be parallel instruments"^[5] (p. 197). By contrast, modern test theory treats the difficulty of each item (the item characteristic curves, or ICCs) as information to be incorporated in scaling items.

Sample question presented using a five-point Likert item

An important distinction must be made between a Likert scale and a Likert item. The Likert scale is the sum of responses on several Likert items. Because Likert items are often accompanied by a visual analog scale (e.g., a horizontal line, on which a subject indicates his or her response by circling or checking tick-marks), the items are sometimes called scales themselves. This is the source of much confusion; it is better, therefore, to reserve the term Likert scale to apply to the summed scale, and Likert item to refer to an individual item.

A Likert item is simply a statement which the respondent is asked to evaluate according to any kind of subjective or objective criteria; generally the level of agreement or disagreement is measured. It is considered symmetric or "balanced" because there are equal numbers of positive and negative positions.^[6] Often five ordered response levels are used, although many psychometricians advocate using seven or nine levels; a recent empirical study^[7] found that a 5- or 7-point scale may produce slightly higher mean scores relative to the highest possible attainable score, compared to those produced from a 10-point scale, and this difference was statistically significant. In terms of the other data characteristics, there was very little difference among the scale formats in terms of variation about the mean, skewness or kurtosis.

The format of a typical five-level Likert item, for example, could be:

Strongly disagree
Disagree
Neither agree nor disagree
Agree
Strongly agree

Likert scaling is a bipolar scaling method, measuring either positive or negative response to a statement. Sometimes an even-point scale is used, where the middle option of "Neither agree nor disagree" is not available. This is sometimes called a "forced choice" method, since the neutral option is removed.^[8] The neutral option can be seen as an easy option to take when a respondent is unsure, and so whether it is a true neutral option is questionable. A 1987 study found negligible differences between the use of "undecided" and "neutral" as the middle option in a 5-point Likert scale.^[9]

Likert scales may be subject to distortion from several causes. Respondents may avoid using extreme response categories (central tendency bias); agree with statements as presented (acquiescence bias); or try to portray themselves or their organization in a more favorable light (social desirability bias). Designing a scale with balanced keying (an equal number of positive and negative statements) can obviate the problem of acquiescence bias, since acquiescence on positively keyed items will balance acquiescence on negatively keyed items, but central tendency and social desirability are somewhat more problematic.

Scoring and analysis

After the questionnaire is completed, each item may be analyzed separately or in some cases item responses may be summed to create a score for a group of items. Hence, Likert scales are often called summative scales.
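The summation described above can be sketched in a few lines of Python. This is a minimal illustration, not a standard scoring routine: it assumes responses are coded as integers from 1 to the number of scale points, and that negatively keyed items (as in balanced keying) are reverse-scored before summing. The function name likert_scale_score and its parameters are hypothetical.

```python
def likert_scale_score(responses, reverse_keyed=(), points=5):
    """Sum one respondent's item responses into a single scale score.

    responses     -- item responses coded 1..points (assumed coding)
    reverse_keyed -- positions of negatively keyed items, which are
                     reverse-scored so that a high value always points
                     in the same direction
    """
    total = 0
    for index, value in enumerate(responses):
        if not 1 <= value <= points:
            raise ValueError(f"response {value!r} is outside 1..{points}")
        # Reverse-scoring maps 1 -> points, 2 -> points - 1, and so on.
        total += (points + 1 - value) if index in reverse_keyed else value
    return total


# Four five-point items; the items at positions 1 and 3 are negatively keyed.
answers = [4, 2, 5, 1]
print(likert_scale_score(answers, reverse_keyed={1, 3}))
```

Whether such a sum may then be treated as interval-level data is a separate question, discussed in the paragraphs that follow.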

Whether individual Likert items can be considered as interval-level data, or whether they should be treated as ordered-categorical data is the subject of considerable disagreement in the literature,^[10]^[11] with strong convictions on what are the most applicable methods. This disagreement can be traced back, in many respects, to the extent to which Likert items are interpreted as being ordinal data.

There are two primary considerations in this discussion. First, Likert scales are arbitrary. The value assigned to a Likert item has no objective numerical basis, either in terms of measure theory or scale (from which a distance metric can be determined). The value assigned to each Likert item is simply determined by the researcher designing the survey, who makes the decision based on a desired level of detail. However, by convention Likert items tend to be assigned progressive positive integer values. Likert scales typically have from 2 to 10 response categories, with 5 or 7 being the most common. Further, this progressive structure of the scale is such that each successive response category is treated as indicating a ‘better’ response than the preceding one. (This may differ in cases where reverse ordering of the Likert scale is needed.)

The second, and possibly more important, point is whether the ‘distance’ between each successive item category is equivalent, which is inferred traditionally. For example, in the above five-point Likert item, the inference is that the ‘distance’ between category 1 and 2 is the same as between category 3 and 4. In terms of good research practice, an equidistant presentation by the researcher is important; otherwise a bias in the analysis may result. For example, a four-point Likert item with categories "Poor", "Average", "Good", and "Very Good" is unlikely to have all equidistant categories since there is only one category that can receive a below average rating. This would arguably bias any result in favor of a positive outcome. On the other hand, even if a researcher presents what he or she believes are equidistant categories, they may not be interpreted as such by the respondent.

A good Likert scale, as above, will present a symmetry of categories about a midpoint with clearly defined linguistic qualifiers. In such symmetric scaling, equidistant attributes will typically be more clearly observed or, at least, inferred. It is when a Likert scale is symmetric and equidistant that it will behave more like an interval-level measurement. So while a Likert scale is indeed ordinal, if well presented it may nevertheless approximate an interval-level measurement. This can be beneficial since, if it were treated just as an ordinal scale, some valuable information could be lost because the ‘distance’ between Likert items would not be available for consideration. The important idea here is that the appropriate type of analysis is dependent on how the Likert scale has been presented.

Given the Likert Scale's ordinal basis, summarizing the central tendency of responses from a Likert scale by using either the median or the mode is best, with ‘spread’ measured by quartiles or percentiles.^[12] Non-parametric tests should be preferred for statistical inferences, such as chi-squared test, Mann–Whitney test, Wilcoxon signed-rank test, or Kruskal–Wallis test.^[13] While some commentators^[14] consider that parametric analysis is justified for a Likert scale using the Central Limit Theorem, this should be reserved for when the Likert scale has suitable symmetry and equidistance so an interval-level measurement can be approximated and reasonably inferred.
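As a rough sketch of the ordinal summaries mentioned above, the following Python snippet computes the median, the mode and the quartiles of one item's responses using only the standard library. The sample data and the name summarize_item are made up for illustration. Two choices here are assumptions rather than requirements: median_low is used so that the reported median is always an actual response category, and the "inclusive" quartile method interpolates, so the quartiles may fall between categories.

```python
import statistics


def summarize_item(responses):
    """Return ordinal summaries for one Likert item's responses."""
    # quantiles(n=4) returns the three cut points Q1, Q2, Q3.
    q1, _, q3 = statistics.quantiles(responses, n=4, method="inclusive")
    return {
        "median": statistics.median_low(responses),  # always a real category
        "mode": statistics.mode(responses),          # most frequent category
        "q1": q1,
        "q3": q3,
    }


# Hypothetical responses to a single five-point item (1 = strongly disagree).
responses = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5]
summary = summarize_item(responses)
print(summary)
```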

Responses to several Likert questions may be summed, provided that all questions use the same Likert scale and that the scale is a defensible approximation to an interval scale, in which case they may be treated as interval data measuring a latent variable. If the summed responses fulfill these assumptions, parametric statistical tests such as the analysis of variance can be applied. These can be applied only when 4 to 8 Likert questions (preferably closer to 8) are summed.^[15]

Data from Likert scales are sometimes converted to binomial data by combining all agree and disagree responses into two categories of "accept" and "reject". The chi-squared, Cochran Q, or McNemar test are common statistical procedures used after this transformation.
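The collapsing step can be illustrated with a short Python sketch. The text above does not say what happens to a neutral midpoint, so this example makes an explicit assumption: responses above the midpoint count as "accept", responses below it as "reject", and midpoint responses are tallied separately so the analyst can decide whether to drop them. The function name dichotomize is hypothetical.

```python
def dichotomize(responses, points=5):
    """Collapse responses coded 1..points into accept/reject counts."""
    midpoint = (points + 1) / 2  # 3 on a five-point item; 2.5 on a four-point item
    counts = {"accept": 0, "reject": 0, "neutral": 0}
    for value in responses:
        if value > midpoint:
            counts["accept"] += 1
        elif value < midpoint:
            counts["reject"] += 1
        else:
            # Only reachable on odd-point scales; handling is an assumption.
            counts["neutral"] += 1
    return counts


print(dichotomize([1, 2, 3, 4, 5, 5]))
```

The resulting accept and reject counts are the kind of two-category data to which the tests named above are applied.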

Consensus based assessment (CBA) can be used to create an objective standard for Likert scales in domains where no generally accepted or objective standard exists. It can also be used to refine or even validate generally accepted standards.

Level of measurement

The five response categories are often believed to represent an interval level of measurement. But this can only be the case if the intervals between the scale points correspond to empirical observations in a metric sense. Reips and Funke (2008)^[16] show that this criterion is much better met by a visual analogue scale. In fact, phenomena may appear that call even the ordinal scale level of Likert scales into question. For example, in a set of items A, B, C rated with a Likert scale, circular relations like A>B, B>C and C>A can appear. This violates the axiom of transitivity for the ordinal scale.
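The passage does not say how such circular relations come about. One way they can arise, assumed here purely for illustration, is when items are compared pairwise by majority across respondents: each respondent's ratings are perfectly ordered, yet the aggregate ordering is not. The Python sketch below uses made-up ratings and hypothetical function names to detect such a cycle.

```python
from itertools import permutations


def majority_prefers(ratings, first, second):
    """True if more respondents rate `first` above `second` than the reverse."""
    wins = sum(1 for r in ratings if r[first] > r[second])
    losses = sum(1 for r in ratings if r[first] < r[second])
    return wins > losses


def find_cycle(ratings, items):
    """Return a triple (a, b, c) with a>b, b>c and c>a by majority, else None."""
    for a, b, c in permutations(items, 3):
        if (majority_prefers(ratings, a, b)
                and majority_prefers(ratings, b, c)
                and majority_prefers(ratings, c, a)):
            return (a, b, c)
    return None


# Three hypothetical respondents rating items A, B, C on a five-point item.
ratings = [
    {"A": 5, "B": 3, "C": 1},  # individually: A > B > C
    {"A": 1, "B": 5, "C": 3},  # individually: B > C > A
    {"A": 3, "B": 1, "C": 5},  # individually: C > A > B
]
print(find_cycle(ratings, ["A", "B", "C"]))
```

When find_cycle returns a triple, no single ordering of the three items is consistent with all pairwise majority comparisons.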

Rasch model

Likert scale data can, in principle, be used as a basis for obtaining interval level estimates on a continuum by applying the polytomous Rasch model, when data can be obtained that fit this model. In addition, the polytomous Rasch model permits testing of the hypothesis that the statements reflect increasing levels of an attitude or trait, as intended. For example, application of the model often indicates that the neutral category does not represent a level of attitude or trait between the disagree and agree categories.

Again, not every set of Likert-scaled items can be used for Rasch measurement. The data must be checked thoroughly to confirm that they satisfy the strict formal axioms of the model.
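To make the model more concrete, the sketch below computes response-category probabilities under one common parameterization of the polytomous Rasch model, the rating scale form, in which the probability of category x is proportional to the exponential of the sum of (theta - delta - tau_k) over the first x thresholds. The names theta (person location), delta (item location) and thresholds (the tau values) are this sketch's own labels, and the numbers are invented; estimating these parameters from data and checking model fit is a separate task not shown here.

```python
import math


def category_probabilities(theta, delta, thresholds):
    """Category probabilities for one person and one item.

    Categories are numbered 0..m for m thresholds; category 0 has an
    empty sum, so its exponent is 0.
    """
    exponents = [0.0]
    for tau in thresholds:
        # Each further category adds one more (theta - delta - tau) term.
        exponents.append(exponents[-1] + (theta - delta - tau))
    peak = max(exponents)  # subtract the maximum for numerical stability
    weights = [math.exp(e - peak) for e in exponents]
    total = sum(weights)
    return [w / total for w in weights]


# A five-category item with ordered, symmetric thresholds (assumed values),
# answered by a person located exactly at the item's location.
probs = category_probabilities(theta=0.0, delta=0.0,
                               thresholds=[-1.5, -0.5, 0.5, 1.5])
print([round(p, 3) for p in probs])
```

With ordered thresholds like these, each category is the most probable one over some range of theta; checking whether estimated thresholds are in fact ordered is one way of examining the hypothesis about increasing levels described above.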

Pronunciation

Rensis Likert, the developer of the scale, pronounced his name 'lick-urt' with a short "i" sound.^[17]^[18] It has been claimed that Likert's name "is among the most mispronounced in [the] field",^[19] as many people pronounce it with a diphthong "i" sound ('lie-kurt').

References

[1] Wuensch, Karl L. (October 4, 2005). "What is a Likert Scale? and How Do You Pronounce 'Likert?'". East Carolina University. Retrieved April 30, 2009.
[2] Likert, Rensis (1932). "A Technique for the Measurement of Attitudes". Archives of Psychology. 140: 1–55.
[3] Carifio, James and Rocco J. Perla (2007). "Ten Common Misunderstandings, Misconceptions, Persistent Myths and Urban Legends about Likert Scales and Likert Response Formats and their Antidotes". Journal of Social Sciences. 3 (3): 106–116.
[4] Burns, Alvin (2008). Basic Marketing Research (Second ed.). New Jersey: Pearson Education. p. 245. ISBN 978-0-13-205958-9.
[5] A. van Alphen, R. Halfens, A. Hasman and T. Imbos (1994). "Likert or Rasch? Nothing is more applicable than good theory". Journal of Advanced Nursing. 20: 196–201.
[6] Burns, Alvin (2008). Basic Marketing Research (Second ed.). New Jersey: Pearson Education. p. 250. ISBN 978-0-13-205958-9.
[7] Dawes, John (2008). "Do Data Characteristics Change According to the number of scale points used? An experiment using 5-point, 7-point and 10-point scales". International Journal of Market Research. 50 (1): 61–77.
[8] Allen, Elaine and Seaman, Christopher (2007). "Likert Scales and Data Analyses". Quality Progress 2007, 64–65.
[9] Armstrong, Robert (1987). "The midpoint on a Five-Point Likert-Type Scale". Perceptual and Motor Skills. 64: 359–362.
[10] Jamieson, Susan (2004). "Likert Scales: How to (Ab)use Them". Medical Education. 38 (12): 1217–1218.
[11] Norman, Geoff (2010). "Likert scales, levels of measurement and the 'laws' of statistics". Advances in Health Science Education. 15 (5): 625–632.
[12] Jamieson, Susan (2004).
[13] Mogey, Nora (March 25, 1999). "So You Want to Use a Likert Scale?". Learning Technology Dissemination Initiative. Heriot-Watt University. Retrieved April 30, 2009.
[14] Norman, Geoff (2010).
[15] Carifio and Perla (2007). "Ten Common Misunderstandings, Misconceptions, Persistent Myths and Urban Legends about Likert Scales and Likert Response Formats and their Antidotes". Journal of Social Sciences. 3 (3): 106–116.
[16] Reips, Ulf-Dietrich; Funke, Frederik (2008). "Interval level measurement with visual analogue scales in Internet-based research: VAS Generator". Behavior Research Methods. 40 (3): 699–704. doi:10.3758/BRM.40.3.699. PMID 18697664.
[17] Babbie, Earl R. (2005). The Basics of Social Research. Belmont, CA: Thomson Wadsworth. p. 174. ISBN 0-534-63036-7.
[18] Meyers, Lawrence S. (2005). Applied Multivariate Research: Design and Interpretation. Sage Publications. p. 20. ISBN 1-4129-0412-9.
[19] Latham, Gary P. (2006). Work Motivation: History, Theory, Research, And Practice. Thousand Oaks, Calif.: Sage Publications. p. 15. ISBN 0-7619-2018-8.

External links

Carifio (2007). "Ten Common Misunderstandings, Misconceptions, Persistent Myths and Urban Legends about Likert Scales and Likert Response Formats and their Antidotes" (PDF). Retrieved September 19, 2011.
Trochim, William M. (October 20, 2006). "Likert Scaling". Research Methods Knowledge Base, 2nd Edition. Retrieved April 30, 2009.
Uebersax, John S. (2006). "Likert Scales: Dispelling the Confusion". Retrieved August 17, 2009.
"A search for the optimum feedback scale". Getfeedback.
Correlation scatter-plot matrix - for ordered-categorical data - On the visual presentation of correlation between Likert scale variables
Net stacked distribution of Likert data - Method of visualizing Likert data to highlight differences from a central neutral value.


@@ Line 49: / Line 49: @@
 == Level of measurement ==
-The five response categories are often believed to represent an Interval [[level of measurement]]. But this can only be the case if the intervals between the scale points correspond to empirical observations in a metric sense. Reips and Funke (2008)<ref>{{cite journal |last=Reips |first=Ulf-Dietrich |coauthors=Funke, Frederik |year=2008 |title=Interval level measurement with visual analogue scales in Internet-based research: VAS Generator |journal=Behavior Research Methods |volume=40 |issue=3 |pages=699–704}}</ref> show that this criterion is much better met by a [[visual analogue scale]]. In fact, there may also appear phenomena which even question the ordinal scale level in Likert scales. For example, in a set of items A,B,C rated with a Likert scale circular relations like A>B, B>C and C>A can appear. This violates the [[Armstrong's axioms#Axioms|axiom of transitivity]] for the ordinal scale.
+The five response categories are often believed to represent an Interval [[level of measurement]]. But this can only be the case if the intervals between the scale points correspond to empirical observations in a metric sense. Reips and Funke (2008)<ref>{{cite journal |last=Reips |first=Ulf-Dietrich |coauthors=Funke, Frederik |year=2008 |title=Interval level measurement with visual analogue scales in Internet-based research: VAS Generator |journal=Behavior Research Methods |volume=40 |issue=3 |pages=699–704 |doi=10.3758/BRM.40.3.699 |pmid=18697664}}</ref> show that this criterion is much better met by a [[visual analogue scale]]. In fact, there may also appear phenomena which even question the ordinal scale level in Likert scales. For example, in a set of items A,B,C rated with a Likert scale circular relations like A>B, B>C and C>A can appear. This violates the [[Armstrong's axioms#Axioms|axiom of transitivity]] for the ordinal scale.
 == Rasch model ==