{{Dablink|This article covers '''educational assessment''' including the work of [[institutional research|institutional researchers]].}}
'''Educational assessment''' is the process of [[documenting]], usually in measurable terms, [[knowledge]], [[skill]]s, [[attitude]]s and [[belief]]s. Assessment can focus on the individual learner, the learning community (class, workshop, or other organized group of learners), the institution, or the educational system as a whole. According to the ''Academic Exchange Quarterly'': "Studies of a theoretical or empirical nature (including case studies, portfolio studies, exploratory, or experimental work) addressing the assessment of learner aptitude and preparation, motivation and learning styles, learning outcomes in achievement and satisfaction in different educational contexts are all welcome, as are studies addressing issues of measurable standards and benchmarks".<ref name="Academic Exchange Quarterly">"Educational Assessment". Academic Exchange Quarterly, available at [http://rapidintellect.com/AEQweb/ontass Rapidintellect.com]. Retrieved January 28, 2009.</ref>

It is important to note that the final purposes and assessment practices in education depend on the ''theoretical framework'' of the practitioners and researchers: their assumptions and beliefs about the nature of the human mind, the origin of knowledge, and the process of learning.

==Alternate meanings==
According to the Merriam-Webster online dictionary, the word ''assessment'' comes from the root word ''assess'', which is defined as:
# to determine the rate or amount of (as a tax)
# to impose (as a tax) according to an established rate; to subject to a tax, charge, or levy
# to make an official valuation of (property) for the purposes of taxation
# to determine the importance, size, or value of (assess a problem)
# to charge (a player or team) with a foul or penalty

Assessment in education is best described as an action "to determine the importance, size, or value of."<ref name="Dictionary Entry">Merriam-Webster Dictionary (2005). Available at [http://dictionary.reference.com/browse/assess Dictionary.reference.com]. Retrieved on 1/28/2009.</ref>

==Types==
The term ''assessment'' is generally used to refer to all activities teachers use to help students learn and to gauge student progress.<ref name="Black Box">Black, Paul, & Wiliam, Dylan (October 1998). "Inside the Black Box: Raising Standards Through Classroom Assessment." Phi Delta Kappan. Available at [http://www.pdkintl.org/kappan/kbla9810.htm PDKintl.org]. Retrieved January 28, 2009.</ref> Though the notion of assessment is generally more complicated than the following categories suggest, assessment is often divided for the sake of convenience using the following distinctions:
# formative and summative
# objective and subjective
# referencing (criterion-referenced, norm-referenced, and [[ipsative]])
# informal and formal.

===Formative and summative===
Assessment is often divided into formative and summative categories for the purpose of considering different objectives for assessment practices.
* [[Summative assessment]] - Summative assessment is generally carried out at the end of a course or project. In an educational setting, summative assessments are typically used to assign students a course grade. Summative assessments are evaluative.
* [[Formative assessment]] - Formative assessment is generally carried out throughout a course or project. Formative assessment, also referred to as "educative assessment," is used to aid learning. In an educational setting, formative assessment might be a teacher (or [[peer group|peer]]) or the learner providing feedback on a student's work, and would not necessarily be used for grading purposes. Formative assessments are diagnostic.

Educational researcher [http://www.ed.uiuc.edu/circe/Robert_Stake.html Robert Stake] explains the difference between formative and summative assessment with the following analogy: {{cquote|When the cook tastes the soup, that's formative. When the guests taste the soup, that's summative.<ref name="Stake in Scriven">Scriven, M. (1991). Evaluation thesaurus. 4th ed. Newbury Park, CA: [[Sage Publications]]. ISBN 0-8039-4364-4.</ref>}}

Summative and formative assessment are often referred to in a learning context as ''assessment of learning'' and ''assessment for learning'' respectively. Assessment of learning is generally summative in nature and intended to measure learning outcomes and report those outcomes to students, parents, and administrators. Assessment of learning generally occurs at the conclusion of a class, course, semester, or academic year. Assessment for learning is generally formative in nature and is used by teachers to consider approaches to teaching and next steps for individual learners and the class.<ref name="Earl, Lorna">Earl, Lorna (2003). Assessment as Learning: Using Classroom Assessment to Maximise Student Learning. Thousand Oaks, CA, Corwin Press. ISBN 0-7619-4626-8. Available at [http://www.wyoaac.org/Lit/assessment%20for%20learning%20of%20learning%20as%20learning%20-%20Earl.pdf WYOAAC.org], Accessed January 23, 2009. </ref>

A common form of formative assessment is ''diagnostic assessment''. Diagnostic assessment measures a student's current knowledge and skills for the purpose of identifying a suitable program of learning. ''Self-assessment'' is a form of diagnostic assessment which involves students assessing themselves. ''Forward-looking assessment'' asks those being assessed to consider themselves in hypothetical future situations.<ref name="Diagnostic assessment">Reed, Daniel. "Diagnostic Assessment in Language Teaching and Learning." Center for Language Education and Research, available at [http://www.google.com/url?sa=t&source=web&ct=res&cd=2&url=http%3A%2F%2Fclear.msu.edu%2Fclear%2Fnewsletter%2Ffiles%2Ffall2006.pdf&ei=HNKBSeOuHYH8tgfS7rwZ&usg=AFQjCNFPkla4C_1Uyr1EOvg-nCLX0I9Pgw&sig2=_f3pOANBQc1cO6s7ZPexBg Google.com]. Retrieved January 28, 2009.</ref>

''Performance-based assessment'' is similar to summative assessment, as it focuses on achievement. It is often aligned with the [[standards-based education reform]] and [[outcomes-based education]] movement. Though ideally significantly different from a traditional multiple-choice test, performance-based assessment is most commonly associated with [[standards-based assessment]], which uses free-form responses to standard questions, scored by human scorers on a standards-based scale: meeting, falling below, or exceeding a performance standard rather than being ranked on a curve. A well-defined task is identified and students are asked to create, produce, or do something, often in settings that involve the real-world application of knowledge and skills. Proficiency is demonstrated by providing an extended response. Performance formats are further differentiated into products and performances: the performance may result in a product, such as a painting, portfolio, paper, or exhibition, or it may consist of a performance, such as a speech, athletic skill, musical recital, or reading.

===Objective and subjective===
Assessment (either summative or formative) is often categorized as either objective or subjective. Objective assessment is a form of questioning which has a single correct answer. Subjective assessment is a form of questioning which may have more than one correct answer (or more than one way of expressing the correct answer). There are various types of objective and subjective questions. Objective question types include true/false answers, [[multiple choice]], multiple-response and matching questions. Subjective questions include extended-response questions and essays. Objective assessment is well suited to the increasingly popular computerized or [[e-assessment|online assessment]] format.
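Because objective items have a single correct answer, they lend themselves to mechanical scoring, which is what makes them a natural fit for computerized formats. The following is a minimal sketch of such automated scoring in Python; the question labels, answer key, and responses are hypothetical, invented purely for illustration.

<syntaxhighlight lang="python">
# Minimal sketch of automated scoring for objective items.
# The answer key and the student's responses are hypothetical examples.
answer_key = {"Q1": "B", "Q2": "true", "Q3": "D"}

def score_objective(responses, key=answer_key):
    """Count the responses that match the answer key exactly."""
    return sum(1 for q, correct in key.items() if responses.get(q) == correct)

student = {"Q1": "B", "Q2": "false", "Q3": "D"}
print(score_objective(student))  # prints 2: two of three items correct
</syntaxhighlight>

Subjective items, by contrast, require human judgment in scoring precisely because more than one response, or more than one way of expressing a response, can be correct.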

Some have argued that the distinction between objective and subjective assessments is neither useful nor accurate because, in reality, there is no such thing as "objective" assessment. In fact, all assessments are created with inherent biases built into decisions about relevant subject matter and content, as well as cultural (class, ethnic, and gender) biases.<ref name="Joint Information Systems Committee (JISC)">Joint Information Systems Committee (JISC). "What Do We Mean by e-Assessment?" JISC InfoNet, available at [http://www.jiscinfonet.ac.uk/InfoKits/effective-use-of-VLEs/e-assessment/assess-overview JISCinfonet.ac.uk]. Retrieved January 29, 2009.</ref>

===Basis of comparison===
Test results can be compared against an established criterion, or against the performance of other students, or against previous performance:

''Criterion-referenced assessment'', typically using a [[criterion-referenced test]], as the name implies, occurs when candidates are measured against defined (and objective) criteria. Criterion-referenced assessment is often, but not always, used to establish a person's competence (whether s/he can do something). The best known example of criterion-referenced assessment is the driving test, when learner drivers are measured against a range of explicit criteria (such as "Not endangering other road users").

''Norm-referenced assessment'' (colloquially known as "[[Bell curve grading|grading on the curve]]"), typically using a [[norm-referenced test]], is not measured against defined criteria. This type of assessment is relative to the student body undertaking the assessment. It is effectively a way of comparing students. The IQ test is the best known example of norm-referenced assessment. Many entrance tests (to prestigious schools or universities) are norm-referenced, permitting a fixed proportion of students to pass ("passing" in this context means being accepted into the school or university rather than an explicit level of ability). This means that standards may vary from year to year, depending on the quality of the cohort; criterion-referenced assessment does not vary from year to year (unless the criteria change).<ref name="VirginiaTech">Educational Technologies at Virginia Tech. "Assessment Purposes." VirginiaTech DesignShop: Lessons in Effective Teaching, available at [http://www.edtech.vt.edu/edtech/id/assess/purposes.html Edtech.vt.edu]. Retrieved January 29, 2009.</ref>

''[[Ipsative assessment]]'' is self-comparison, either within the same domain over time or against other domains within the same student.
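The difference between the first two bases of comparison can be made concrete with a small sketch. Below, the same set of scores is judged two ways: against a fixed criterion, and against the performance of the cohort, where only a fixed proportion passes. All names, scores, and cutoffs are hypothetical.

<syntaxhighlight lang="python">
# Sketch contrasting criterion-referenced and norm-referenced decisions.
# Scores and cutoffs are hypothetical, chosen only for illustration.
scores = {"Ana": 62, "Ben": 71, "Chu": 55, "Dee": 88, "Eli": 79}

# Criterion-referenced: passing is defined by a fixed standard,
# independent of how the other students perform.
CRITERION = 70
criterion_pass = {name for name, s in scores.items() if s >= CRITERION}

# Norm-referenced: the top 40% of the cohort passes, whatever
# the absolute scores happen to be that year.
quota = round(0.4 * len(scores))
ranked = sorted(scores, key=scores.get, reverse=True)
norm_pass = set(ranked[:quota])

print(criterion_pass)  # {'Ben', 'Dee', 'Eli'}: fixed standard
print(norm_pass)       # {'Dee', 'Eli'}: the standard floats with the cohort
</syntaxhighlight>

With a stronger cohort the norm-referenced cutoff rises, and a score of 71 might no longer pass; this is exactly the year-to-year variation in standards described above.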

===Informal and formal===
Assessment can be either ''formal'' or ''informal''. Formal assessment usually implies a written document, such as a test, quiz, or paper. A formal assessment is given a numerical score or grade based on student performance, whereas an informal assessment does not contribute to a student's final grade. An informal assessment usually occurs in a more casual manner and may include observation, inventories, checklists, rating scales, [[rubric (academic)|rubrics]], performance and portfolio assessments, participation, peer and self evaluation, and discussion.<ref name="Valencia, Sheila W">Valencia, Sheila W. "What Are the Different Forms of Authentic Assessment?" Understanding Authentic Classroom-Based Literacy Assessment (1997), available at [http://www.eduplace.com/rdg/res/litass/forms.html Eduplace.com]. Retrieved January 29, 2009.</ref>

===Internal and external===
Internal assessment is set and marked by the school (i.e. by the students' own teachers), and students receive both the mark and feedback on the assessment. External assessment is set by a governing body and marked by impartial personnel; students typically receive only a mark, and therefore have no insight into how they actually performed (i.e. which questions they answered correctly).

==Standards of quality==
In general, high-quality assessments are considered those with a high level of [[reliability (statistics)|reliability]] and [[validity (statistics)|validity]]. Approaches to reliability and validity vary, however.

===Reliability===
[[Reliability (statistics)|Reliability]] relates to the consistency of an assessment. A reliable assessment is one which consistently achieves the same results with the same (or similar) cohort of students. Various factors affect reliability—including ambiguous questions, too many options within a question paper, vague marking instructions and poorly trained markers. Traditionally, the reliability of an assessment is based on the following:
# Temporal stability: Performance on a test is comparable on two or more separate occasions.
# Form equivalence: Performance among examinees is equivalent on different forms of a test based on the same content.
# Internal consistency: Responses on a test are consistent across questions. For example: In a survey that asks respondents to rate attitudes toward technology, consistency would be expected in responses to the following questions:
#* "I feel very negative about computers in general."
#* "I enjoy using computers."<ref name="Yu, Chong Ho">Yu, Chong Ho (2005). "Reliability and Validity." Educational Assessment. Available at [http://www.creative-wisdom.com/teaching/assessment/reliability.html Creative-wisdom.com]. Retrieved January 29, 2009.</ref>

Reliability can also be expressed in mathematical terms as:
:<math>R_x = \frac{V_T}{V_X}</math>
where <math>R_x</math> is the reliability of the observed (test) score <math>X</math>, and <math>V_T</math> and <math>V_X</math> are the variability of the 'true' (i.e., the candidate's innate performance) and of the measured test scores respectively. <math>R_x</math> can range from 0 (completely unreliable) to 1 (completely reliable). An <math>R_x</math> of 1 is rarely achieved, and an <math>R_x</math> of 0.8 is generally considered reliable.<ref>{{cite journal |author=Vergis A, Hardy K |title=Principles of Assessment: A Primer for Medical Educators in the Clinical Years |journal=The Internet Journal of Medical Education | volume=1 |issue=1 | year=2010 | url=http://www.ispub.com/journal/the_internet_journal_of_medical_education/volume_1_number_1_74/article_printable/principles-of-assessment-a-primer-for-medical-educators-in-the-clinical-years-4.html}}</ref>
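For illustration, with assumed values: if the variability of true scores were <math>V_T = 32</math> and that of the observed scores <math>V_X = 40</math>, then
:<math>R_x = \frac{V_T}{V_X} = \frac{32}{40} = 0.8,</math>
which sits exactly at the threshold generally considered reliable.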

===Validity===
A [[validity (statistics)|valid]] assessment is one which measures what it is intended to measure. For example, it would not be valid to assess driving skills through a written test alone. A more valid way of assessing driving skills would be through a combination of tests that help determine what a driver knows, such as through a written test of driving knowledge, and what a driver is able to do, such as through a performance assessment of actual driving. Teachers frequently complain that some examinations do not properly assess the [[syllabus]] upon which the examination is based; they are, effectively, questioning the validity of the exam.
Validity of an assessment is generally gauged through examination of evidence in the following categories:
# Content – Does the content of the test measure stated objectives?
# Criterion – Do scores correlate to an outside reference? (ex: Do high scores on a 4th grade reading test accurately predict reading skill in future grades? A sketch of such a correlation check appears after this list.)
# Construct – Does the assessment correspond to other significant variables? (ex: Do [[English as a second language|ESL]] students consistently perform differently on a writing exam than native English speakers?)<ref name="Moskal, Barbara M., & Leydens, Jon A">Moskal, Barbara M., & Leydens, Jon A (2000). "Scoring Rubric Development: Validity and Reliability." Practical Assessment, Research & Evaluation, 7(10). Retrieved January 30, 2009 from [http://PAREonline.net/getvn.asp?v=7&n=10 PAREonline.net]</ref>
# Face – Does the item or theory make sense, and is it seemingly correct to the expert reader?<ref>{{cite journal |author=Vergis A, Hardy K |title=Principles of Assessment: A Primer for Medical Educators in the Clinical Years |journal=The Internet Journal of Medical Education | volume=1 |issue=1 | year=2010 |
url=http://www.ispub.com/journal/the_internet_journal_of_medical_education/volume_1_number_1_74/article_printable/principles-of-assessment-a-primer-for-medical-educators-in-the-clinical-years-4.html}}</ref>
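As a concrete illustration of the criterion category above, the sketch below checks how strongly test scores correlate with an outside reference measured later; all scores are hypothetical.

<syntaxhighlight lang="python">
import numpy as np

# Hypothetical data: each student's 4th-grade reading test score,
# and the same students' reading skill measured in a later grade.
test_scores = np.array([410, 455, 380, 500, 470, 430])
later_skill = np.array([62, 70, 55, 78, 74, 64])

# Criterion validity evidence: correlation with the outside reference.
r = np.corrcoef(test_scores, later_skill)[0, 1]
print(f"criterion correlation r = {r:.2f}")  # values near 1 = stronger evidence
</syntaxhighlight>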

A good assessment has both validity and reliability, plus the other quality attributes noted above for a specific context and purpose. In practice, an assessment is rarely totally valid or totally reliable. A ruler which is marked wrongly will always give the same (wrong) measurements; it is very reliable, but not very valid. Asking random individuals to tell the time without looking at a clock or watch is sometimes used as an example of an assessment which is valid but not reliable: the answers will vary between individuals, but the average answer is probably close to the actual time.

In many fields, such as medical research, educational testing, and psychology, there is often a trade-off between reliability and validity. A history test written for high validity will have many essay and fill-in-the-blank questions; it will be a good measure of mastery of the subject, but difficult to score completely accurately. A history test written for high reliability will be entirely multiple choice; it is not as good at measuring knowledge of history, but can easily be scored with great precision. We may generalize from this: the more reliable our estimate is of what we purport to measure, the less certain we are that we are actually measuring that aspect of attainment. It is also important to note that there are at least thirteen sources of invalidity which can, in principle, be estimated for individual students in test situations, yet in practice never are; perhaps this is because the social purpose of such tests demands the absence of any error, and validity errors are usually so high that acknowledging them would destabilize the whole assessment industry.

It is well to distinguish between "subject-matter" validity and "predictive" validity. The former, used widely in education, predicts the score a student would get on a similar test but with different questions. The latter, used widely in the workplace, predicts performance. Thus, a subject-matter-valid test of knowledge of driving rules is appropriate while a predictively-valid test would assess whether the potential driver could follow those rules.

===Testing standards===
In the field of [[psychometrics]], the [[Standards for Educational and Psychological Testing]]<ref>[http://www.apa.org/science/standards.html#overview ''The Standards for Educational and Psychological Testing'']</ref> place standards about validity and reliability, along with [[Measurement#Difficulties|errors of measurement]] and related considerations under the general topic of test construction, evaluation and documentation. The second major topic covers standards related to fairness in testing, including [[justice|fairness]] in testing and test use, the [[right]]s and [[social responsibility|responsibilities]] of test takers, testing individuals of diverse [[language|linguistic backgrounds]], and testing individuals with [[disability|disabilities]]. The third and final major topic covers standards related to testing applications, including the responsibilities of test users, [[psychological testing|psychological testing and assessment]], [[Test (student assessment)|educational testing and assessment]], testing in [[employment]] and [[professional certification|credentialing]], plus testing in [[program evaluation]] and [[standardized testing and public policy|public policy]].

===Evaluation standards===
In the field of [[evaluation]], and in particular [[educational evaluation]], the [[Joint Committee on Standards for Educational Evaluation]]<ref>[http://www.wmich.edu/evalctr/jc/ Joint Committee on Standards for Educational Evaluation]</ref> has published three sets of standards for evaluations. ''The Personnel Evaluation Standards''<ref>Joint Committee on Standards for Educational Evaluation. (1988). "[http://www.wmich.edu/evalctr/jc/PERSTNDS-SUM.htm The Personnel Evaluation Standards: How to Assess Systems for Evaluating Educators.]" Newbury Park, CA: Sage Publications.</ref> was published in 1988, ''The Program Evaluation Standards'' (2nd edition)<ref>Joint Committee on Standards for Educational Evaluation. (1994). ''[http://www.wmich.edu/evalctr/jc/PGMSTNDS-SUM.htm The Program Evaluation Standards, 2nd Edition.]'' Newbury Park, CA: Sage Publications.</ref> was published in 1994, and ''The Student Evaluation Standards''<ref>Joint Committee on Standards for Educational Evaluation. (2003). ''[http://www.wmich.edu/evalctr/jc/briefing/ses/ The Student Evaluation Standards: How to Improve Evaluations of Students.]'' Newbury Park, CA: Corwin Press.</ref> was published in 2003.

Each publication presents and elaborates a set of standards for use in a variety of educational settings. The standards provide guidelines for designing, implementing, assessing and improving the identified form of evaluation. Each of the standards has been placed in one of four fundamental categories to promote educational evaluations that are proper, useful, feasible, and accurate. In these sets of standards, validity and reliability considerations are covered under the accuracy topic. For example, the student accuracy standards help ensure that student evaluations will provide sound, accurate, and credible information about student learning and performance.

== Summary table of the main theoretical frameworks ==

The following table summarizes the main ''theoretical frameworks'' that underlie almost all of the theoretical and research work in education, as well as its instructional practices (one of which is, of course, the practice of assessment). These different frameworks have given rise to ongoing debates among scholars.

{| class="wikitable" border="1"
|-
! TOPICS
! EMPIRICISM
! RATIONALISM
! SOCIOCULTURALISM

|-
| '''Philosophical orientation'''
| [[David Hume|Hume]]: [[Empiricism|British empiricism]]
| [[Immanuel Kant|Kant]], [[René Descartes|Descartes]]: [[Rationalism|Continental rationalism]]
| [[Georg Wilhelm Friedrich Hegel|Hegel]], [[Karl Marx|Marx]]: [[Dialectic|cultural dialectic]]
|-
| '''Metaphorical Orientation'''
| Mechanistic/Operation of a Machine or Computer
| Organismic/Growth of a Plant
| Contextualist/Examination of a Historical Event
|-
| '''Leading Theorists'''
| [[B. F. Skinner]] ([[behaviorism]]) / [[Herbert A. Simon|Herb Simon]], [[John Robert Anderson (psychologist)|John Anderson]], [[Robert M. Gagné|Robert Gagné]] ([[cognitivism]])
| [[Jean Piaget]]/[[Neo-Piagetian theories of cognitive development#Robbie Case|Robbie Case]]
| [[Lev Vygotsky]], [[Alexander Luria|Luria]], [[Jerome Bruner|Bruner]]/Alan Collins, Jim Greeno, [[Ann Brown]], [[John D. Bransford|John Bransford]]
|-
| '''Nature of Mind'''
| Initially blank device that detects patterns in the world and operates on them. Qualitatively identical to lower animals, but quantitatively superior.
| Organ that evolved to acquire knowledge by making sense of the world. Uniquely human, qualitatively different from lower animals.
| Unique among species for developing language, tools, and education.
|-
| '''Nature of Knowledge'''
(epistemology)
| Hierarchically organized associations that present an accurate but incomplete representation of the world. Assumes that the sum of the components of knowledge is the same as the whole. Because knowledge is accurately represented by components, one who demonstrates those components is presumed to know.
| General and/or specific cognitive and conceptual structures, constructed by the mind according to rational criteria. Essentially, these are the higher-level structures that are built as the mind assimilates new information into existing structures and as those structures accommodate new information. Knowledge is represented by the ability to solve new problems.
| Distributed across people, communities, and physical environment. Represents culture of community that continues to create it. To know means to be attuned to the constraints and affordances of systems in which activity occurs. Knowledge is represented in the regularities of successful activity.
|-
| '''Nature of Learning''' (the process by which knowledge is increased or modified)
| Forming and strengthening cognitive or S-R associations. Generation of knowledge by (1) exposure to a pattern, (2) efficiently recognizing and responding to that pattern, and (3) recognizing the pattern in other contexts.
| Engaging in the active process of making sense of ("rationalizing") the environment: the mind applies existing structure to new experience in order to rationalize it. The components themselves are not really learned, only the structures needed to deal with those components later.
| Increasing ability to participate in a particular community of practice. Initiation into the life of a group, strengthening ability to participate by becoming attuned to constraints and affordances.
|-
| '''Features of Authentic Assessment'''
| Assess knowledge components. Focus on mastery of many components and fluency. Use psychometrics to standardize.
| Assess extended performance on new problems. Credit varieties of excellence.
| Assess participation in inquiry and social practices of learning (e.g. portfolios, observations). Students should participate in the assessment process. Assessments should be integrated into the larger environment.
|}

==Controversy==
Concerns over how best to apply assessment practices across public school systems have largely focused on questions about the use of high stakes testing and standardized tests, often used to gauge student progress, teacher quality, and school-, district-, or state-wide educational success.

===No Child Left Behind===
For most researchers and practitioners, the question is not whether tests should be administered at all—there is a general consensus that, when administered in useful ways, tests can offer useful information about student progress and curriculum implementation, as well as offering formative uses for learners.<ref name="APA">American Psychological Association. "Appropriate Use of High-Stakes Testing in Our Nation's Schools." APA Online, available at [http://www.apa.org/pubs/info/brochures/testing.aspx APA.org], Retrieved January 24, 2010</ref> The real issue, then, is whether testing practices as currently implemented can provide these services for educators and students.

In the U.S., the [[No Child Left Behind Act]] mandates standardized testing nationwide. These tests align with state curriculum and link teacher, student, district, and state accountability to the results of these tests. Proponents of NCLB argue that it offers a tangible method of gauging educational success, holding teachers and schools accountable for failing scores, and closing the [[achievement gap]] across class and ethnicity.<ref>(nd) [http://www.ed.gov/nclb/landing.jhtml Reauthorization of NCLB]. Department of Education. Retrieved 1/29/09.</ref>

Opponents of standardized testing dispute these claims, arguing that holding educators accountable for test results leads to the practice of "teaching to the test." Additionally, many argue that the focus on standardized testing encourages teachers to equip students with a narrow set of skills that enhance test performance without actually fostering a deeper understanding of subject matter or key principles within a knowledge domain.<ref>(nd) [http://www.fairtest.org/facts/whatwron.htm What's Wrong With Standardized Testing?] FairTest.org. Retrieved January 29, 2009.</ref>

===High-stakes testing===
{{Main|High-stakes testing}}
The assessments which have caused the most controversy in the U.S. are [[high school graduation examination]]s, which are used to deny diplomas to students who have attended high school for four years but cannot demonstrate that they have learned the required material. Opponents say that no student who has put in four years of [[seat time]] should be denied a high school diploma merely for repeatedly failing a test, or even for not knowing the required material.<ref>{{cite news |title=Reform education, not exit exams |work=Daily Bruin |author=Dang, Nick |date=18 March 2003 |quote=One common complaint from failed test-takers is that they weren't taught the tested material in school. Here, inadequate schooling, not the test, is at fault. Blaming the test for one's failure is like blaming the service station for a failed smog check; it ignores the underlying problems within the 'schooling vehicle.' |url=http://dailybruin.ucla.edu/stories/2003/mar/18/reform-education-not-exit-exam/}}</ref><ref>{{cite news |title=Blame the test: LAUSD denies responsibility for low scores |author=Weinkopf, Chris |url=http://www.thefreelibrary.com/BLAME+THE+TEST+LAUSD+DENIES+RESPONSIBILITY+FOR+LOW+SCORES-a086659557 |date=2002 |quote=The blame belongs to 'high-stakes tests' like the Stanford 9 and California's High School Exit Exam. Reliance on such tests, the board grumbles, 'unfairly penalizes students that have not been provided with the academic tools to perform to their highest potential on these tests'. |work=Daily News}}</ref><ref name=IBD>{{cite news |title=Blaming The Test |work=[[Investor's Business Daily]] |date=11 May 2006 |quote=A judge in California is set to strike down that state's high school exit exam. Why? Because it's working. It's telling students they need to learn more. We call that useful information. To the plaintiffs who are suing to stop the use of the test as a graduation requirement, it's something else: Evidence of unequal treatment... the exit exam was deemed unfair because too many students who failed the test had too few credentialed teachers. Well, maybe they did, but granting them a diploma when they lack the required knowledge only compounds the injustice by leaving them with a worthless piece of paper. |url=http://old.investors.com/editorial/editorialcontent.asp?secid=1501&status=article&id=155734&secure=3598}}</ref>

High-stakes tests have been blamed for causing sickness and [[test anxiety]] in students and teachers, and for teachers choosing to narrow the curriculum towards what the teacher believes will be tested. In an exercise designed to make children comfortable about testing, a Spokane, Washington newspaper published a picture of a [[monster]] that feeds on fear.<ref>[http://www2.asd.wednet.edu/Pioneer/barnard/projects/04-05/art/WhatsaWASL/index.html ASD.wednet.edu]</ref> The published image is purportedly the response of a student who was asked to draw a picture of what she thought of the state assessment.

Other critics, such as Washington State University's [[Don Orlich]], question the use of test items far beyond standard cognitive levels for students' age.<ref name="Bach, Deborah, & Blanchard, Jessica">Bach, Deborah, & Blanchard, Jessica (April 19, 2005). "WASL worries stress kids, schools." Seattle Post-Intelligencer. Retrieved January 30, 2009 from [http://seattlepi.nwsource.com/local/220713_wasl19.html Seattlepi.nwsource.com].</ref>

Compared to portfolio assessments, simple multiple-choice tests are much less expensive, less prone to disagreement between scorers, and can be scored quickly enough to be returned before the end of the school year. [[Standardized test]]s (all students take the same test under the same conditions) often use multiple-choice tests for these reasons. Orlich criticizes the use of expensive, holistically graded tests, rather than inexpensive multiple-choice "bubble tests", to measure the quality of both the system and individuals for very large numbers of students.<ref name="Bach, Deborah, & Blanchard, Jessica"/> Other prominent critics of high-stakes testing include [[Fairtest]] and [[Alfie Kohn]].

The use of [[IQ tests]] has been banned in some states for educational decisions, and [[norm-referenced tests]], which rank students from "best" to "worst", have been criticized for bias against minorities. Most education officials support [[criterion-referenced tests]] (each individual student's score depends solely on whether he answered the questions correctly, regardless of whether his neighbors did better or worse) for making high-stakes decisions.

===21st century assessment===
It has been widely noted that with the emergence of [[social media]] and [[Web 2.0]] technologies and mindsets, learning is increasingly collaborative and knowledge increasingly distributed across many members of a learning community. Traditional assessment practices, however, focus in large part on the individual and fail to account for knowledge-building and learning in context. As researchers in the field of assessment consider the cultural shifts that arise from the emergence of a more [[participatory culture]], they will need to find new methods of applying assessments to learners.<ref name="Fadel, Charles, Honey, Margaret, & Pasnik, Shelley">Fadel, Charles, Honey, Margaret, & Pasnik, Shelley (May 18, 2007). "Assessment in the Age of Innovation." Education Week. Retrieved January 29, 2009 from [http://www.edweek.org/login.html?source=http://www.edweek.org/ew/articles/2007/05/23/38fadel.h26.html&destination=http://www.edweek.org/ew/articles/2007/05/23/38fadel.h26.html&levelId=2100 Edweek.org].</ref>

===Assessment in a democratic school===
[[Sudbury model]] democratic schools do not perform assessments or offer evaluations, transcripts, or recommendations. They assert that they do not rate people and that school is not a judge; comparing students to each other, or to some standard that has been set, is for them a violation of the student's right to privacy and to self-determination. Students decide for themselves how to measure their progress as self-starting learners, as a process of self-evaluation: real lifelong learning and the proper educational assessment for the 21st century, they adduce.<ref>Greenberg, D. (2000). [http://sudburyvalleyschool.org/essays/102008.shtml ''21st Century Schools,''] edited transcript of a talk delivered at the April 2000 International Conference on Learning in the 21st Century.</ref>

According to Sudbury schools, this policy does not cause harm to their students as they move on to life outside the school. They admit, however, that it makes the process more difficult, but hold that such hardship is part of the students' learning to make their own way, set their own standards, and meet their own goals.

The no-grading and no-rating policy helps to create an atmosphere free of competition among students or battles for adult approval, and encourages a positive cooperative environment amongst the student body.<ref>Greenberg, D. (1987). Chapter 20, ''Evaluation,'' Free at Last — The Sudbury Valley School.</ref>

The final stage of a Sudbury education, should the student choose to take it, is the graduation thesis. Each student writes on the topic of how they have prepared themselves for adulthood and entering the community at large. This thesis is submitted to the Assembly, which reviews it. The final stage of the thesis process is an oral defense given by the student in which they open the floor for questions, challenges and comments from all Assembly members. At the end, the Assembly votes by secret ballot on whether or not to award a diploma.<ref>[http://mountainlaurelsudbury.org/thesis-procedure.asp ''Graduation Thesis Procedure''], Mountain Laurel Sudbury School.</ref>

==References==
{{reflist}}

==See also==
* [[Computer aided assessment]]
* [[Confidence-Based Learning]] measures a learner's knowledge quality by assessing both the correctness of that knowledge and the learner's confidence in it.
* [[E-scape]], a technology and approach that looks specifically at the assessment of creativity and collaboration.
* [[Educational evaluation]] deals specifically with evaluation as it applies to an educational setting; it may be used, for example, in the [[No Child Left Behind]] (NCLB) program instituted by the U.S. government.
* [[Measurement|Educational measurement]] is a process of assessment or an evaluation in which the objective is to ''quantify'' level of attainment or competence within a specified domain. See the [[Rasch model]] for measurement for elaboration on the conceptual requirements of such processes, including those pertaining to grading and use of raw scores from assessments.
* [[Educational psychology]]
* [[Electronic portfolio]] is a personal digital record containing information such as a collection of artifacts or evidence demonstrating what one knows and can do.
* [[Evaluation]] is the process of looking at what is being assessed to make sure the right areas are being considered.
* [[Grade (education)|Grading]] is the process of assigning a (possibly mutually exclusive) ranking to learners.
* [[Health Impact Assessment]] looks at the potential health impacts of policies, programs and projects.
* [[Program evaluation]] is essentially a set of philosophies and techniques to determine if a program "works".
* [[Psychometrics]], the science of measuring psychological characteristics.
* [[Rubrics for assessment]]
* [[Science, Technology, Society and Environment Education]]
* [[Social Impact Assessment]] looks at the possible social impacts of proposed new infrastructure projects, natural resource projects, or development activities.
* [[Standardized testing]] is any test that is used across a variety of schools or other situations.
* [[Standards-based assessment]]

==External links==
{{sisterlinks|Assessment}}
* [http://ahe.cqu.edu.au Assessment in Higher Education] web site.
* [http://edutopia.org/php/keyword.php?id=005 Edutopia: Assessment Overview] A collection of media and articles on the topic of assessment from The George Lucas Educational Foundation
* [http://www.apa.org/science/standards.html The Standards for Educational and Psychological Testing]
* [http://www.wmich.edu/evalctr/jc/ Joint Committee on Standards for Educational Evaluation]
* [http://focalworks.in/resources/white_papers/creating_assessments/1-1.html Creating Good MCQs] A whitepaper by Focalworks
* [http://www.scribd.com/doc/461041/Assessment-20 Assessment 2.0] Modernizing assessment

[[Category:Academic transfer]]
[[Category:Educational assessment and evaluation| ]]
[[Category:Educational psychology]]
[[Category:Evaluation methods]]
[[Category:Evaluation]]
[[Category:School terminology]]
[[Category:Thought]]
[[Category:Standards-based education]]
[[Category:Mental structures]]

[[ar:التقييم]]
[[de:Wirkungsanalyse]]
[[eu:Ebaluazio (hezkuntza)]]
[[fr:Docimologie]]
[[it:Docimologia]]
[[nl:Assessment]]
[[ja:評価]]
[[pt:Docimologia]]
[[ro:Docimologie]]
[[yi:אפשאצונג]]
