User:Sdraaijer/standard setting

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by Sdraaijer (talk | contribs) at 16:42, 14 October 2013 (→‎Compromise methods). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Standard setting[1][2] is a term for establishing cutscores for a test. A cutscore, also known as cut-off score, passing score, pass-fail score or passing point, is a point on a score continuum that differentiates between classifications along the continuum. The most common cutscore is a score that differentiates between the classifications of "pass" and "fail" on a professional or educational test.

There is no single 'correct' method for standard setting[3]. Standard setting can be done with great rigour in the form of standard setting studies or with less rigour to provide practical methods for everyday use by teachers in secondary or higher education.

Standard setting is related to grading, which involves translating a position on a score continuum into a grade. The cut score therefore influences the grade that results from a given score.

Methods are mainly categorized into absolute methods, relative methods and compromise methods.

Absolute methods

In absolute methods, cut scores are set at a certain pre-determined point on the score scale. Everyone below that point fails, everyone above passes, irrespective of the percentage of students that pass or fail. The assumption behind absolute methods is that the subject matter queried in a test does not differ from test to test and that there is full (or at least reasonable) control over the level of difficulty of the items in a test. Absolute methods are related to criterion-referenced testing.

  • 60% Method (absolute method): In the 60% method (common in higher education in the Netherlands[4]) the cut score is set to a fixed percentage of the maximum score for a test. Typically this cut score is set to 60% (hence the name of the method). The main reasoning behind this standard is that a teacher believes that, for students to qualify for a pass, a little more proficiency has to be demonstrated than just answering half of the items of a test correctly.
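
The 60% method can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the cut score is rounded up to a whole point, and a score exactly at the cut score is treated as a pass, which are assumptions not stated above.

```python
def fixed_percentage_cut_score(max_score, percent=60):
    """Lowest whole-point score that reaches `percent` of `max_score`.

    Integer arithmetic is used so that the ceiling is exact.
    Rounding up to a whole point is an assumption of this sketch.
    """
    return -(-max_score * percent // 100)


def passes(score, cut_score):
    """Assumption: a score exactly at the cut score counts as a pass."""
    return score >= cut_score


if __name__ == "__main__":
    cut = fixed_percentage_cut_score(40)  # test with 40 points in total
    print(cut, passes(24, cut), passes(23, cut))
```

For a test with a maximum of 40 points this gives a cut score of 24, regardless of how many students reach it.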

Relative methods

In relative methods, cut scores are set in such a way that a pre-determined percentage of test-takers passes the test, irrespective of where on the score scale the cut score falls. The argument for using such methods is that, in the practice of education, the scores of a group of students are influenced by the accidental quality of the previous education, the accidental level of difficulty of the test and other accidental circumstances. The assumption is that each student group is in principle equally able and that the standard should therefore be set in relation to the distribution of the scores a group achieves. Relative methods are related to norm-referenced testing.

  • Core Item[5] (relative method): In the core-item method, a teacher identifies a limited set of items that are regarded as the most important for the subject matter. This core set is smaller than the set of items for the whole test. The average score on the items of the core set (often adjusted for guessing) is then used as the average score needed to pass the whole test.
  • Wijnen[6] (relative method): In the Wijnen method, the cutscore is set at the average score of all students on a test, compensated for the unreliability of the test. A teacher can choose to lower the cut score by (arbitrarily) one or two times the standard error of the test. The reasoning behind this method is that the number of students who fail incorrectly because of measurement error (false negatives) should be minimised.
  • Grading on the Curve: Grading on a curve (also known as curved grading, bell curving, or simply curving) is a statistical method of assigning grades designed to yield a pre-determined distribution of grades among the student population. The pre-determined pass-fail ratio can be adapted on the basis of the reliability of the test (as in the Wijnen method). For example, the pass rate is fixed at 60% of the student population and the cut score is set to match that ratio. Grades are then calculated accordingly.
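
Two of the relative methods listed above can be sketched in Python as follows. The sketch rests on several assumptions that are not in the text: the "standard error of the test" is taken to be the standard error of measurement, SD × √(1 − reliability); the reliability coefficient is supplied by the user; the population standard deviation is used; and the 60% figure is read as the share of students who pass. All function and parameter names are hypothetical.

```python
import math
import statistics


def wijnen_cut_score(scores, reliability, k=1.0):
    """Mean score lowered by k times the standard error of measurement.

    Assumption: SEM = SD * sqrt(1 - reliability), with the population
    standard deviation of the observed scores as SD.
    """
    mean = statistics.mean(scores)
    sd = statistics.pstdev(scores)
    sem = sd * math.sqrt(1.0 - reliability)
    return mean - k * sem


def fixed_pass_rate_cut_score(scores, pass_rate):
    """Score of the lowest-ranked student who still passes.

    Students tied at the cut score all pass, so the realised pass rate
    can be higher than `pass_rate`.
    """
    if not 0 < pass_rate <= 1:
        raise ValueError("pass_rate must be in (0, 1]")
    ranked = sorted(scores, reverse=True)
    n_pass = max(1, round(pass_rate * len(ranked)))
    return ranked[n_pass - 1]


if __name__ == "__main__":
    scores = [4, 5, 6, 6, 7, 7, 8, 8, 9, 10]
    print(wijnen_cut_score(scores, reliability=0.75, k=1))
    print(fixed_pass_rate_cut_score(scores, pass_rate=0.6))
```

In both functions the cut score depends only on the scores of the group at hand, which is what distinguishes relative methods from absolute ones.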

Compromise methods

Compromise methods combine the judgment of the standard setters, or an absolute standard, with information about the realities and consequences of different pass rates or score ranges. For example, when a certain cutscore turns out to produce a very high percentage of failing students, a correction to the cutscore can be proposed.

  • Beuk[7] (compromise method): In the Beuk method, each judge in the cut score study is asked to estimate what passing rate should be expected for the exam. This question is posed and answered only after the judges have provided their estimates of the difficulty level for each test question.
  • Hofstee[8] (compromise method): In the Hofstee method, raters or judges define the highest acceptable cut score, the lowest acceptable cut score, highest acceptable fail rate, and the lowest acceptable fail rate. These are plotted against a curve of participants’ score data, and the intersection is used as a cut score.
  • Cohen and Van der Vleuten[9] (compromise method): The Cohen and Van der Vleuten method is a standard setting method with the best performing students as point of reference. In this method, the score of the best student, or an averaged score of, for example, the best 5 students, serves as the score that earns the maximum grade. This score is in general lower than the maximum score that could be awarded for the test. In relation to this lower maximum score, the cutscore for the test is also set proportionally lower. The reasoning behind the validity of this method is that, in a group of students, it should be possible for someone to achieve the maximum grade. If no students achieve the maximum grade, then this is assumed to be due to an imperfect course, an imperfect test or standards that have (implicitly) been set too high by the teacher.
  • De Gruijter:
  • Others
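
Two of the compromise methods listed above can be illustrated with a rough Python sketch. For the Hofstee method, the sketch assumes the commonly described construction in which a straight line is drawn from (lowest acceptable cut score, highest acceptable fail rate) to (highest acceptable cut score, lowest acceptable fail rate) and intersected with the observed fail-rate curve; it also assumes whole-point cut scores. For the best-performing-students method, the `fraction` parameter is hypothetical: the text above only says that the cut score is lowered in relation to the lower maximum. All names are illustrative.

```python
def hofstee_cut_score(scores, c_min, c_max, f_min, f_max):
    """Whole-point cut score where the observed fail-rate curve comes
    closest to the line from (c_min, f_max) to (c_max, f_min).

    Assumption: a student fails when the score is below the cut score.
    """
    if c_max <= c_min:
        raise ValueError("c_max must be greater than c_min")
    n = len(scores)
    best_cut, best_gap = None, None
    for cut in range(c_min, c_max + 1):
        observed_fail = sum(s < cut for s in scores) / n
        line_fail = f_max + (f_min - f_max) * (cut - c_min) / (c_max - c_min)
        gap = abs(observed_fail - line_fail)
        if best_gap is None or gap < best_gap:
            best_cut, best_gap = cut, gap
    return best_cut


def best_students_cut_score(scores, fraction=0.60, n_best=5):
    """Cut score as a fraction of the mean score of the n_best students.

    Assumption: `fraction` plays the role that the fixed percentage of
    the maximum score plays in an absolute method.
    """
    top = sorted(scores, reverse=True)[:n_best]
    reference = sum(top) / len(top)
    return fraction * reference


if __name__ == "__main__":
    scores = [3, 4, 5, 5, 6, 6, 7, 7, 8, 9]
    print(hofstee_cut_score(scores, c_min=5, c_max=7, f_min=0.1, f_max=0.5))
    print(best_students_cut_score(scores))
```

Both functions use the judges' or teacher's limits together with the observed scores, which is what makes them compromises between absolute and relative standards.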

For some discussion, see: http://www.act.org/research/researchers/reports/pdf/ACT_RR89-2.pdf

References

  1. ^ Cizek, G. J., & Bunch, M. B. (2007). Standard setting: a guide to establishing and evaluating performance standards on tests. Sage Publications.
  2. ^ Bejar, I. I. (2008). Standard Setting: What Is It? Why Is It Important? R & D Connections, 7, 1–6.
  3. ^ Cohen-Schotanus, J., & van der Vleuten, C. P. (2010). A standard setting method with the best performing students as point of reference: Practical and affordable. Medical Teacher, 32(2), 154–160.
  4. ^ Berkel, H. van, & Bax, A. (2002). Toetsen in het Hoger Onderwijs. Houten/Diegem: Bohn Stafleu Van Loghum.
  5. ^ De Groot, A. D. (1964). De kernitem-methode voor de bepaling van de cesuur voldoende/onvoldoende. Paedagogische Studiën, 41, 425–440.
  6. ^ Wijnen, W. H. F. (1971). Onder of boven de maat. Lisse: Swets and Zeitlinger.
  7. ^ Beuk, C. H. (1984). A Method for Reaching a Compromise Between Absolute and Relative Standards in Examinations. Journal of Educational Measurement, 21(2), 147–152. doi:10.1111/j.1745-3984.1984.tb00226.x
  8. ^ Hofstee, W. K. B. (1977). Cesuurprobleem opgelost. Onderzoek van Onderwijs, 6, 6–7.
  9. ^ Cohen-Schotanus, J., & van der Vleuten, C. P. (2010). A standard setting method with the best performing students as point of reference: Practical and affordable. Medical Teacher, 32(2), 154–160.