A concept inventory is a criterion-referenced test designed to help determine whether a student has an accurate working knowledge of a specific set of concepts. Historically, concept inventories have been in the form of multiple-choice tests in order to aid interpretability and facilitate administration in large classes. Unlike a typical, teacher-authored multiple-choice test, questions and response choices on concept inventories are the subject of extensive research. The aims of the research include ascertaining (a) the range of what individuals think a particular question is asking and (b) the most common responses to the questions. Concept inventories are evaluated to ensure test reliability and validity. In its final form, each question includes one correct answer and several distractors.
Ideally, a score on a criterion-referenced test reflects the amount of content knowledge a student has mastered. Criterion-referenced tests differ from norm-referenced tests in that (in theory) the former are not used to compare an individual's score to the scores of the group. Ordinarily, the purpose of a criterion-referenced test is to ascertain whether a student has mastered a predetermined amount of content knowledge; upon obtaining a test score that is at or above a cutoff score, the student can move on to study the body of content knowledge that follows next in a learning sequence. In general, items with difficulty values (the percentage of students who answer the item correctly) between 30% and 70% are best able to provide information about student understanding.
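The item-difficulty guideline can be illustrated with a short calculation. In classical test theory, an item's difficulty value is the proportion of test-takers who answer it correctly. The following is a minimal sketch, not a standard implementation: the response data are invented, and the helper names `item_difficulty` and `outside_band` are hypothetical. It computes that proportion for each item and lists the items that fall outside the 30% to 70% band.

```python
from typing import List


def item_difficulty(responses: List[List[str]], answer_key: List[str]) -> List[float]:
    """Return, for each item, the fraction of students who chose the keyed answer."""
    if not responses:
        raise ValueError("need at least one student's responses")
    n_students = len(responses)
    difficulties = []
    for item_index, correct in enumerate(answer_key):
        n_correct = sum(1 for student in responses if student[item_index] == correct)
        difficulties.append(n_correct / n_students)
    return difficulties


def outside_band(difficulties: List[float], low: float = 0.30, high: float = 0.70) -> List[int]:
    """Return the indices of items whose difficulty lies outside [low, high]."""
    return [i for i, p in enumerate(difficulties) if not (low <= p <= high)]


# Invented example data: 5 students (rows) answering 4 items (columns).
answer_key = ["B", "D", "A", "C"]
responses = [
    ["B", "D", "C", "C"],
    ["B", "A", "C", "C"],
    ["B", "D", "A", "C"],
    ["B", "B", "C", "C"],
    ["A", "D", "C", "A"],
]

difficulties = item_difficulty(responses, answer_key)
flagged = outside_band(difficulties)
print(difficulties)  # proportion correct per item
print(flagged)       # items that may be too easy or too hard to be informative
```

In this made-up data set only the second item falls inside the band; the others are answered correctly by either most or few of the students, so they would reveal comparatively little about differences in understanding.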
The distractors are incorrect or irrelevant answers that are usually (but not always) based on students' commonly held misconceptions. Test developers often research student misconceptions by examining students' responses to open-ended essay questions and conducting "think-aloud" interviews with students. The distractors chosen by students help researchers understand student thinking and give instructors insights into students' prior knowledge (and, sometimes, firmly held beliefs). This foundation in research underlies instrument construction and design, and plays a role in helping educators obtain clues about students' ideas, scientific misconceptions, and didaskalogenic ("teacher-induced" or "teaching-induced") confusions and conceptual lacunae that interfere with learning.
Concept inventories in use
Concept inventories are education-related diagnostic tests. In 1985 Halloun and Hestenes introduced a "multiple-choice mechanics diagnostic test" to examine students' concepts about motion. It evaluates student understanding of basic concepts in classical (macroscopic) mechanics. A little later, the Force Concept Inventory (FCI), another concept inventory, was developed. The FCI was designed to assess student understanding of the Newtonian concepts of force. Hestenes (1998) found that while "nearly 80% of the [students completing introductory college physics courses] could state Newton's Third Law at the beginning of the course, FCI data showed that less than 15% of them fully understood it at the end". These results have been replicated in a number of studies involving students at a range of institutions (see the references below). That said, there remain questions as to what exactly the FCI measures. Results from Hake (1998) using the FCI have led to greater recognition in the science education community of the importance of students' "interactive engagement" with the materials to be mastered.
Since the development of the FCI, other physics instruments have been developed. These include the Force and Motion Conceptual Evaluation developed by Thornton and Sokoloff and the Brief Electricity and Magnetism Assessment developed by Ding et al. For a discussion of how a number of concept inventories were developed see Beichner. Information about physics concept tests can be found at the NC State Physics Education Research Group website (see the external links below).
In addition to physics, concept inventories have been developed in statistics, chemistry, astronomy, basic biology, natural selection, genetics, engineering, geoscience, and computer science.
In many areas, foundational scientific concepts transcend disciplinary boundaries. An example of an inventory that assesses knowledge of such concepts is an instrument developed by Odom and Barrow (1995) to evaluate understanding of diffusion and osmosis. In addition, there are non-multiple-choice conceptual instruments, such as the essay-based approach suggested by Wright et al. (1998), the essay and oral exams used by Nehm and Schonfeld (2008), and the instrument used by Cooper et al. (2012) to measure student understanding of Lewis structures in chemistry.
Caveats associated with concept inventory use
Some concept inventories are problematic. The concepts tested may not be fundamental or important in a particular discipline, the concepts involved may not be explicitly taught in a class or curriculum, or answering a question correctly may require only a superficial understanding of a topic. It is therefore possible to either over-estimate or under-estimate student content mastery. Concept inventories designed to identify trends in student thinking may not be useful for monitoring learning gains that result from pedagogical interventions, and disciplinary mastery may not be the variable that a particular instrument measures. Users should be careful to ensure that concept inventories are actually testing conceptual understanding, rather than test-taking ability, language skills, or other abilities that can influence test performance.
The use of multiple-choice exams as concept inventories is not without controversy. The very structure of multiple-choice concept inventories raises questions about the extent to which complex and often nuanced situations and ideas must be simplified or clarified to produce unambiguous responses. For example, a multiple-choice exam designed to assess knowledge of key concepts in natural selection does not meet a number of standards of quality control. One problem with the exam is that the two members of each of several pairs of parallel items, with each pair designed to measure exactly one key concept in natural selection, sometimes have very different levels of difficulty. Another problem is that the multiple-choice exam overestimates knowledge of natural selection relative to student performance on a diagnostic essay exam and a diagnostic oral exam, two instruments with reasonably good construct validity. Although scoring concept inventories in the form of essay or oral exams is labor-intensive, costly, and difficult to implement with large numbers of students, such exams can offer a more realistic appraisal of the actual levels of students' conceptual mastery as well as their misconceptions. More recently, however, computer technology has been developed that can score essay responses on concept inventories in biology and other domains (Nehm, Ha, & Mayfield, 2011), promising to facilitate the scoring of concept inventories organized as (transcribed) oral exams as well as essays.
See also
- Authentic assessment – The measurement of "intellectual accomplishments that are worthwhile, significant, and meaningful"
- Classical test theory
- Concept map – Diagram showing relationships among concepts
- Conceptual question
- Confidence-based learning – System which distinguishes between what learners think and actually know
- Construct validity
- Constructive alignment
- Criterion-referenced test
- Educational assessment – Systematic process of documenting and using empirical data on the knowledge, skill, attitudes, and beliefs to refine programs and improve student learning
- Item response theory – Paradigm for the design, analysis, and scoring of tests
- Norm-referenced test
- Ontology (information science) – Specification of a conceptualization
- Psychometrics – Theory and technique of psychological measurement
- Rubrics for assessment – Scoring guide for assessment
- Standards-based education reform in the United States
- Standardized test – Test administered and scored in a predetermined, standard manner
- Standards-based assessment – Assessment based on specified standards
References
- Adams, W. K.; Wieman, C. E. (2010). "Development and Validation of Instruments to Measure Learning of Expert-Like Thinking". International Journal of Science Education. iFirst: 1–24. doi:10.1080/09500693.2010.512369
- Treagust, David F. (1988). "Development and use of diagnostic tests to evaluate students' misconceptions in science". International Journal of Science Education. Informa UK Limited. 10 (2): 159–169. Bibcode:1988IJSEd..10..159T. doi:10.1080/0950069880100204. ISSN 0950-0693.
- Halloun, I. A.; Hestenes, D. (1985). "Common sense concepts about motion". American Journal of Physics. 53: 1043–1055.
- Hestenes, David; Wells, Malcolm; Swackhamer, Gregg (1992). "Force concept inventory" (PDF). The Physics Teacher. American Association of Physics Teachers (AAPT). 30 (3): 141–158. Bibcode:1992PhTea..30..141H. doi:10.1119/1.2343497. ISSN 0031-921X.
- Hestenes, David (1998). "Who needs physics education research!?". American Journal of Physics. American Association of Physics Teachers (AAPT). 66 (6): 465–467. Bibcode:1998AmJPh..66..465H. doi:10.1119/1.18898. ISSN 0002-9505.
- Huffman, Douglas; Heller, Patricia (1995). "What does the force concept inventory actually measure?" (PDF). The Physics Teacher. American Association of Physics Teachers (AAPT). 33 (3): 138–143. Bibcode:1995PhTea..33..138H. doi:10.1119/1.2344171. ISSN 0031-921X.
- Hake, Richard R. (1998). "Interactive-engagement versus traditional methods: A six-thousand-student survey of mechanics test data for introductory physics courses". American Journal of Physics. American Association of Physics Teachers (AAPT). 66 (1): 64–74. Bibcode:1998AmJPh..66...64H. doi:10.1119/1.18809. ISSN 0002-9505.
- Redish page. Visited Feb. 14, 2011
- Thornton, Ronald K.; Sokoloff, David R. (1998). "Assessing student learning of Newton's laws: The Force and Motion Conceptual Evaluation and the Evaluation of Active Learning Laboratory and Lecture Curricula". American Journal of Physics. American Association of Physics Teachers (AAPT). 66 (4): 338–352. Bibcode:1998AmJPh..66..338T. doi:10.1119/1.18863. ISSN 0002-9505.
- Ding, Lin; Chabay, Ruth; Sherwood, Bruce; Beichner, Robert (2006). "Evaluating an electricity and magnetism assessment tool: Brief electricity and magnetism assessment". Physical Review Special Topics: Physics Education Research. 2 (1): 010105. Bibcode:2006PRPER...2a0105D. doi:10.1103/PhysRevSTPER.2.010105.
- Beichner, Robert J. (1994). "Testing student interpretation of kinematics graphs". American Journal of Physics. American Association of Physics Teachers (AAPT). 62 (8): 750–762. Bibcode:1994AmJPh..62..750B. doi:10.1119/1.17449. ISSN 0002-9505.
- Allen, K. (2006). The Statistics Concept Inventory: Development and Analysis of a Cognitive Assessment Instrument in Statistics (Doctoral dissertation). The University of Oklahoma.
- "The Chemical Concepts Inventory. Visited Feb. 14, 2011". Archived from the original on 2007-07-18. Retrieved 2007-07-30.
- Wampold, Bruce E.; Wright, John C.; Williams, Paul H.; Millar, Susan B.; Koscuik, Steve A.; Penberthy, Debra L. (1998). "A Novel Strategy for Assessing the Effects of Curriculum Reform on Student Competence" (PDF). Journal of Chemical Education. American Chemical Society (ACS). 75 (8): 986–992. Bibcode:1998JChEd..75..986W. doi:10.1021/ed075p986. ISSN 0021-9584.
- Astronomy Diagnostic Test (ADT) Version 2.0. Visited Feb. 14, 2011.
- Garvin-Doxas, Kathy; Klymkowsky, Michael W. (2008). Alberts, Bruce (ed.). "Understanding Randomness and its Impact on Student Learning: Lessons Learned from Building the Biology Concept Inventory (BCI)". CBE: Life Sciences Education. American Society for Cell Biology (ASCB). 7 (2): 227–233. doi:10.1187/cbe.07-08-0063. ISSN 1931-7913. PMC 2424310. PMID 18519614.
- D'Avanzo, Charlene (2008). "Biology Concept Inventories: Overview, Status, and Next Steps". BioScience. Oxford University Press (OUP). 58 (11): 1079–1085. doi:10.1641/b581111. ISSN 1525-3244.
- D'Avanzo, C.; Anderson, C. W.; Griffith, A.; Merrill, J. (2010). Thinking like a biologist: Using diagnostic questions to help students reason with biological principles. (17 January 2010; www.biodqc.org/)
- Wilson, Christopher D.; Anderson, Charles W.; Heidemann, Merle; Merrill, John E.; Merritt, Brett W.; et al. (2006). "Assessing Students' Ability to Trace Matter in Dynamic Systems in Cell Biology". CBE: Life Sciences Education. American Society for Cell Biology (ASCB). 5 (4): 323–331. doi:10.1187/cbe.06-02-0142. ISSN 1931-7913. PMC 1681358. PMID 17146039.
- Anderson, Dianne L.; Fisher, Kathleen M.; Norman, Gregory J. (2002-11-14). "Development and evaluation of the conceptual inventory of natural selection". Journal of Research in Science Teaching. Wiley. 39 (10): 952–978. Bibcode:2002JRScT..39..952A. doi:10.1002/tea.10053. ISSN 0022-4308.
- Nehm, Ross H.; Schonfeld, Irvin Sam (2008). "Measuring knowledge of natural selection: A comparison of the CINS, an open-response instrument, and an oral interview" (PDF). Journal of Research in Science Teaching. Wiley. 45 (10): 1131–1160. Bibcode:2008JRScT..45.1131N. doi:10.1002/tea.20251. ISSN 0022-4308. Archived from the original (PDF) on 2011-05-17.
- Nehm, R.; Schonfeld, I. S. (2010). "The future of natural selection knowledge measurement: A reply to Anderson et al. (2010)". Journal of Research in Science Teaching. 47: 358–362. Archived 2011-07-19 at the Wayback Machine.
- Smith, Michelle K.; Wood, William B.; Knight, Jennifer K. (2008). Ebert-May, Diane (ed.). "The Genetics Concept Assessment: A New Concept Inventory for Gauging Student Understanding of Genetics". CBE: Life Sciences Education. American Society for Cell Biology (ASCB). 7 (4): 422–430. doi:10.1187/cbe.08-08-0045. ISSN 1931-7913. PMC 2592048. PMID 19047428.
- Concept Inventory Assessment Instruments for Engineering Science. Visited Feb. 14, 2011. 
- Libarkin, J. C.; Ward, E. M. G.; Anderson, S. W.; Kortemeyer, G.; Raeburn, S. P. (2011). "Revisiting the Geoscience Concept Inventory: A call to the community". GSA Today. 21 (8): 26–28. Archived 2013-07-26 at the Wayback Machine.
- Caceffo, R.; Wolfman, S.; Booth, K.; Azevedo, R. (2016). "Developing a Computer Science Concept Inventory for Introductory Programming". Proceedings of the 47th ACM Technical Symposium on Computing Science Education (SIGCSE '16). New York, NY, USA: ACM. pp. 364–369. doi:10.1145/2839509.2844559.
- Odom, A. L.; Barrow, L. H. (1995). "Development and application of a two-tier diagnostic test measuring college biology students' understanding of diffusion and osmosis after a course of instruction". Journal of Research in Science Teaching. 32: 45–61.
- Cooper, Melanie M.; Underwood, Sonia M.; Hilley, Caleb Z. (2012). "Development and validation of the implicit information from Lewis structures instrument (IILSI): do students connect structures with properties?". Chem. Educ. Res. Pract. Royal Society of Chemistry (RSC). 13 (3): 195–200. doi:10.1039/c2rp00010e. ISSN 1109-4028.
- Nehm, R. H.; Ha, M.; Mayfield, E. (in press). "Transforming Biology Assessment with Machine Learning: Automated Scoring of Written Evolutionary Explanations". Journal of Science Education and Technology.
External links
- Biology Concept Inventory
- Bio-Diagnostic Question Clusters
- Classroom Concepts and Diagnostic Tests
- Diagnostic Question Clusters in Biology
- Evolution Assessment
- Force Concept Inventory
- Molecular Life Sciences Concept Inventory
- Thinking Like a Biologist