Attribute hierarchy method

The attribute hierarchy method (AHM) is a cognitively based psychometric procedure developed by Jacqueline Leighton, Mark Gierl, and Steve Hunka at the Centre for Research in Applied Measurement and Evaluation (CRAME) at the University of Alberta. The AHM is one form of cognitive diagnostic assessment that aims to integrate cognitive psychology with educational measurement for the purposes of enhancing instruction and student learning.^[1] A cognitive diagnostic assessment (CDA) is designed to measure specific knowledge states and cognitive processing skills in a given domain. The results of a CDA yield a profile of scores with detailed information about a student’s cognitive strengths and weaknesses. This cognitive diagnostic feedback has the potential to guide instructors, parents, and students in their teaching and learning processes.

To generate a diagnostic skill profile, examinees’ test item responses are classified into a set of structured attribute patterns that are derived from components of a cognitive model of task performance. The cognitive model contains attributes, each defined as a description of the procedural or declarative knowledge needed by an examinee to answer a given test item correctly.^[1] The inter-relationships among the attributes are represented using a hierarchical structure so that the ordering of the cognitive skills is specified. This model provides a framework for designing diagnostic items based on attributes, linking examinees’ test performance to specific inferences about their knowledge and skills.

Differences between the AHM and the rule space method

The AHM differs from Tatsuoka's Rule Space Method (RSM)^[2] in its assumption of dependencies among the attributes within the cognitive model. In other words, the AHM was derived from the RSM by assuming that some or all skills may be represented in hierarchical order. Modeling cognitive attributes using the AHM necessitates the specification of a hierarchy outlining the dependencies among the attributes. As such, the attribute hierarchy serves as a cognitive model of task performance designed to represent the inter-related cognitive processes required by examinees to solve test items. This assumption better reflects the characteristics of human cognition because cognitive processes usually do not work in isolation but function within a network of interrelated competencies and skills.^[3] In contrast, the RSM makes no assumptions regarding the dependencies among the attributes. This difference has led to the development of both item response theory (IRT) and non-IRT based psychometric procedures for analyzing test item responses using the AHM. The AHM also differs from the RSM with respect to the identification of the cognitive attributes and the logic underlying the diagnostic inferences made from the statistical analysis.^[4]

Identification of the cognitive attributes

The RSM uses a post-hoc approach to the identification of the attributes required to successfully solve each item on an existing test. In contrast, the AHM uses an a priori approach to identifying the attributes and specifying their interrelationships in a cognitive model.

Diagnostic inferences from statistical analysis

The RSM uses statistical pattern classification, in which examinees' observed response patterns are matched to pre-determined response patterns that each correspond to a particular cognitive or knowledge state. Each state represents a set of correct and incorrect rules used to answer test items. The focus of the RSM is the identification of erroneous rules or misconceptions. The AHM, on the other hand, uses statistical pattern recognition, in which examinees' observed response patterns are compared to response patterns that are consistent with the attribute hierarchy. The purpose of statistical pattern recognition is to identify the attribute combinations that the examinee is likely to possess. Hence, unlike the RSM, the AHM does not identify incorrect rules or misconceptions.

Principled test design

The AHM uses a construct-centered approach to test development and analysis. A construct-centered approach emphasizes the central role of the construct in directing test development activities and analysis. The advantage of this approach is that the inferences made about student performance are firmly grounded in the construct specified. Principled test design^[5] encompasses three broad stages:

cognitive model development
test development
psychometric analysis.

Cognitive model development comprises the first stage in the test design process. During this stage, the cognitive knowledge, processes, and skills are identified and organized into an attribute hierarchy or cognitive model. This stage also encompasses validation of the cognitive model prior to the test development stage.

Test development comprises the second stage in the test design process. During this stage, items are created to measure each attribute within the cognitive model while also maintaining any dependencies modeled among the attributes.

Psychometric analysis comprises the third stage in the test design process. During this stage, the fit of the cognitive model relative to observed examinee responses is evaluated to ascertain whether the model appropriately explains test performance. Examinee test item responses are then analyzed, and diagnostic skill profiles are created that highlight examinees' cognitive strengths and weaknesses.

Cognitive model development

What is a cognitive model?

Visual Representation of the four general forms of hierarchical structures in a cognitive model

An AHM analysis must begin with the specification of a cognitive model of task performance. A cognitive model in educational measurement refers to a "simplified description of human problem solving on standardized educational tasks, which helps to characterize the knowledge and skills students at different levels of learning have acquired and to facilitate the explanation and prediction of students' performance".^[6] These cognitive skills, conceptualized as attributes in the AHM framework, are specified at a small grain size in order to generate specific diagnostic inferences underlying test performance. Attributes include the different procedures, skills, and/or processes that an examinee must possess to solve a test item. These attributes are then structured using a hierarchy so that the ordering of the cognitive skills is specified.

The cognitive model can be represented by various hierarchical structures. There are four general forms of hierarchical structure, which can be expanded and combined to form increasingly complex networks of hierarchies where the cognitive complexity corresponds to the nature of the problem-solving task. The four hierarchical forms are: a) linear, b) convergent, c) divergent, and d) unstructured.

How are cognitive models created and validated?

Theories of task performance can be used to derive cognitive models of task performance in a subject domain. However, the availability of these theories of task performance and cognitive models in education is limited. Therefore, other means are used to generate cognitive models. One method is a task analysis of representative test items from a subject domain. A task analysis represents a hypothesized cognitive model of task performance, in which the likely knowledge and processes used to solve the test item are specified. A second method involves having examinees think aloud as they solve test items to identify the actual knowledge, processes, and strategies elicited by the task.^[7]^[8] The verbal report collected as examinees talk aloud can contain the relevant knowledge, skills, and procedures used to solve the test item. This knowledge and these skills and procedures become the attributes in the cognitive model, and their temporal sequencing documented in the verbal report provides the hierarchical ordering. A cognitive model derived using a task analysis can be validated and, if required, modified using examinee verbal reports collected from think-aloud studies.

Why is the accuracy of the cognitive model important?

An accurate cognitive model is crucial for two reasons. First, a cognitive model provides the interpretative framework for linking test score interpretations to cognitive skills. That is, the test developer is in a better position to make defensible claims about the student knowledge, skills, and processes that account for test performance. Second, a cognitive model links cognitive and learning psychology with instruction. Based on an examinee’s observed response pattern, detailed feedback about the examinee’s cognitive strengths and weaknesses can be provided through a score report. This diagnostic information can then be used to inform instruction tailored to the examinee, with the goal of improving or remediating specific cognitive skills.

An example of a cognitive model

The following hierarchy is an example of a cognitive model of task performance for the knowledge and skills in the areas of ratio, factoring, function, and substitution (called the Ratios and Algebra hierarchy).^[9] This hierarchy is divergent and composed of nine attributes, which are described below. If the cognitive model is assumed to be true, then an examinee who has mastered attribute A3 is assumed to have mastered the attributes below it, namely attributes A1 and A2. Conversely, an examinee who has mastered attribute A2 is expected to have mastered attribute A1 but not necessarily A3.

A Demonstration of Attributes Required to Solve Items in the Ratios and Algebra Hierarchy
Attribute	Summary of the Attribute
A1	Represents the most basic arithmetic operation skills
A2	Includes knowledge about the properties of factors
A3	Involves the skills of applying the rules of factoring
A4	Includes the skills required for substituting values into algebraic expressions
A5	Represents the skills of mapping a graph of a familiar function with its corresponding function
A6	Deals with the abstract properties of functions, such as recognizing the graphical representation of the relationship between independent and dependent variables
A7	Requires the skills to substitute numbers into algebraic expressions
A8	Represents the skills of advanced substitution – algebraic expressions, rather than numbers, need to be substituted into another algebraic expression
A9	Relates to skills associated with rule understanding and application

The hierarchy contains two independent branches which share a common prerequisite – attribute A1. Aside from attribute A1, the first branch includes two additional attributes, A2 and A3, and the second branch includes a self-contained sub-hierarchy which includes attributes A4 through A9. Three independent branches compose the sub-hierarchy: attributes A4, A5, A6; attributes A4, A7, A8; and attributes A4, A9. As a prerequisite attribute, attribute A1 includes the most basic arithmetic operation skills, such as addition, subtraction, multiplication, and division of numbers. Attributes A2 and A3 both deal with factors. In attribute A2, the examinee needs to have knowledge about the property of factors. In attribute A3, the examinee not only requires knowledge of factoring (i.e., attribute A2), but also the skills of applying the rules of factoring. Therefore, attribute A3 is considered a more advanced attribute than A2.

The self-contained sub-hierarchy contains six attributes. Among these attributes, attribute A4 is the prerequisite for all other attributes in the sub-hierarchy. Attribute A4 has attribute A1 as a prerequisite because A4 not only represents basic skills in arithmetic operations (i.e., attribute A1), but it also involves the substitution of values into algebraic expressions which is more abstract and, therefore, more difficult than attribute A1. The first branch in the sub-hierarchy deals, mainly, with functional graph reading. For attribute A5, the examinee must be able to map the graph of a familiar function with its corresponding function. In an item that requires attribute A5 (e.g., item 4), attribute A4 is typically required because the examinee must find random points in the graph and substitute the points into the equation of the function to find a match between the graph and the function. Attribute A6, on the other hand, deals with the abstract properties of functions, such as recognizing the graphical representation of the relationship between independent and dependent variables. The graphs for less familiar functions, such as a function of higher-power polynomials, may be involved. Therefore, attribute A6 is considered to be more difficult than attribute A5 and placed below attribute A5 in the sub-hierarchy.

The second branch in the sub-hierarchy considers the skills associated with advanced substitution. Attribute A7 requires the examinee to substitute numbers into algebraic expressions. The complexity of attribute A7 relative to attribute A4 lies in the concurrent management of multiple pairs of numbers and multiple equations. Attribute A8 also represents the skills of advanced substitution. However, what makes attribute A8 more difficult than attribute A7 is that algebraic expressions, rather than numbers, need to be substituted into another algebraic expression. The last branch in the sub-hierarchy contains only one additional attribute, A9, related to skills associated with rule understanding and application. It is the rule, rather than the numeric value or the algebraic expression that needs to be substituted in the item to reach a solution.
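The prerequisite structure described above can be sketched in a few lines of Python. This is only an illustration: the `PARENT` mapping and the `prerequisites` helper are hypothetical names introduced here, and the mapping encodes the hierarchy as it is described in the text (each attribute has exactly one direct prerequisite, and A1 has none).

```python
# Direct prerequisite of each attribute in the Ratios and Algebra hierarchy,
# transcribed from the prose description (A1 is the root and has none).
PARENT = {
    "A1": None,
    "A2": "A1", "A3": "A2",   # factoring branch
    "A4": "A1",               # root of the self-contained sub-hierarchy
    "A5": "A4", "A6": "A5",   # functional graph reading
    "A7": "A4", "A8": "A7",   # advanced substitution
    "A9": "A4",               # rule understanding and application
}


def prerequisites(attribute):
    """Return all attributes assumed to be mastered before `attribute`,
    ordered from the nearest prerequisite up to the root."""
    chain = []
    parent = PARENT[attribute]
    while parent is not None:
        chain.append(parent)
        parent = PARENT[parent]
    return chain


print(prerequisites("A3"))  # ['A2', 'A1']
print(prerequisites("A8"))  # ['A7', 'A4', 'A1']
```

Walking up the mapping reproduces the inference stated earlier: an examinee who has mastered A3 is assumed to have mastered A2 and A1.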

Cognitive model representation

The Ratio and Algebra attribute hierarchy can also be expressed in matrix form. To begin, the direct relationship among the attributes is specified by a binary adjacency matrix (A) of order (k,k), where k is the number of attributes, such that each element in the A matrix represents the absence (i.e., 0) or presence (i.e., 1) of a direct connection between two attributes. The A matrix for the Ratio and Algebra hierarchy presented is shown below.

${\begin{bmatrix}0&1&0&1&0&0&0&0&0\\0&0&1&0&0&0&0&0&0\\0&0&0&0&0&0&0&0&0\\0&0&0&0&1&0&1&0&1\\0&0&0&0&0&1&0&0&0\\0&0&0&0&0&0&0&0&0\\0&0&0&0&0&0&0&1&0\\0&0&0&0&0&0&0&0&0\\0&0&0&0&0&0&0&0&0\end{bmatrix}}$

Each row and column of the A matrix represents one attribute; the first row and column represent attribute A1 and the last row and column represent attribute A9. The presence of a 1 in a particular row denotes a direct connection between that attribute and the attribute corresponding to the column position. For example, attribute A1 is directly connected to attribute A2 because of the presence of a 1 in the first row (i.e., attribute A1) and the second column (i.e., attribute A2). The positions of 0 in row 1 indicate that A1 is neither directly connected to itself nor to attributes A3 and A5 to A9.
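As a minimal sketch, the adjacency matrix can be built from a list of direct connections. The `EDGES` list and the `adjacency_matrix` helper are hypothetical names chosen for this illustration; the edges themselves are read off the A matrix shown above.

```python
# Direct connections (prerequisite, dependent attribute), numbered 1 to 9.
EDGES = [(1, 2), (2, 3), (1, 4), (4, 5), (5, 6), (4, 7), (7, 8), (4, 9)]
K = 9  # number of attributes


def adjacency_matrix(edges, k):
    """Build the k x k binary adjacency matrix as a list of rows."""
    a = [[0] * k for _ in range(k)]
    for source, target in edges:
        # Row = prerequisite attribute, column = directly dependent attribute.
        a[source - 1][target - 1] = 1
    return a


A = adjacency_matrix(EDGES, K)
for row in A:
    print(" ".join(str(cell) for cell in row))
```

The first printed row is `0 1 0 1 0 0 0 0 0`, matching the direct connections from A1 to A2 and A4.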

The direct and indirect relationships among attributes are specified by the binary reachability matrix (R) of order (k,k), where k is the number of attributes. To obtain the R matrix from the A matrix, Boolean addition and multiplication operations are performed on the adjacency matrix, meaning $R=(A+I)^{n}$ where n is the integer required to reach invariance, $n=1,2,\cdots ,m$ , and I is the identity matrix. The R matrix for the Ratio and Algebra hierarchy is shown next.

${\begin{bmatrix}1&1&1&1&1&1&1&1&1\\0&1&1&0&0&0&0&0&0\\0&0&1&0&0&0&0&0&0\\0&0&0&1&1&1&1&1&1\\0&0&0&0&1&1&0&0&0\\0&0&0&0&0&1&0&0&0\\0&0&0&0&0&0&1&1&0\\0&0&0&0&0&0&0&1&0\\0&0&0&0&0&0&0&0&1\end{bmatrix}}$

Similar to the A matrix, each row and column in the matrix represents one attribute; the first row and column represents attribute A1 and the last row and column represents attribute A9. The first attribute is either directly or indirectly connected to all attributes A1 to A9. This is represented by the presence of 1's in all columns of row 1 (i.e., representing attribute A1). In the R matrix, an attribute is considered related to itself resulting in 1's along the main diagonal. Referring back to the hierarchy, it is shown that attribute A1 is directly connected to attribute A2 and indirectly to A3 through its connection with A2. Attribute A1 is indirectly connected to attributes A5 to A9 through its connection with A4.
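The Boolean computation of R can be sketched as follows. The code repeatedly multiplies by (A + I) under Boolean arithmetic until the result stops changing; the function names are hypothetical, and the edge list is the one implied by the A matrix above.

```python
EDGES = [(1, 2), (2, 3), (1, 4), (4, 5), (5, 6), (4, 7), (7, 8), (4, 9)]
K = 9


def bool_matmul(x, y):
    """Boolean matrix product: OR over AND instead of sum over product."""
    n = len(x)
    return [[int(any(x[i][t] and y[t][j] for t in range(n)))
             for j in range(n)] for i in range(n)]


def reachability(edges, k):
    """Return R = (A + I)^n, raising the power until R no longer changes."""
    # A + I under Boolean addition: direct connections plus the diagonal.
    base = [[int(i == j) for j in range(k)] for i in range(k)]
    for source, target in edges:
        base[source - 1][target - 1] = 1
    r = base
    while True:
        nxt = bool_matmul(r, base)
        if nxt == r:  # invariance reached
            return r
        r = nxt


R = reachability(EDGES, K)
for row in R:
    print(" ".join(str(cell) for cell in row))
```

The printed rows reproduce the R matrix above; for example, the fourth row (attribute A4) has 1's in columns 4 through 9.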

The potential pool of items is represented by the incidence matrix (Q) of order (k, p), where k is the number of attributes and p is the number of potential items. This pool of items represents all combinations of the attributes when the attributes are independent of each other. However, this pool can be reduced to form the reduced incidence matrix (Q_r) by imposing the constraints of the attribute hierarchy as defined by the R matrix. The Q_r matrix represents items that capture the dependencies among the attributes defined in the attribute hierarchy. It is formed using Boolean inclusion by determining which columns of the R matrix are logically included in each column of the Q matrix. The Q_r matrix is of order (k, i), where k is the number of attributes and i is the reduced number of items resulting from the constraints in the hierarchy. For the Ratio and Algebra hierarchy, the Q_r matrix is shown next.

${\begin{bmatrix}1&1&1&1&1&1&1&1&1\\0&1&1&0&0&0&0&0&0\\0&0&1&0&0&0&0&0&0\\0&0&0&1&1&1&1&1&1\\0&0&0&0&1&1&0&0&0\\0&0&0&0&0&1&0&0&0\\0&0&0&0&0&0&1&1&0\\0&0&0&0&0&0&0&1&0\\0&0&0&0&0&0&0&0&1\end{bmatrix}}$

The Q_r matrix serves as an important test item development blueprint, from which items can be created to measure each specific combination of attributes. In this way, each component of the cognitive model can be evaluated systematically. In this example, a minimum of nine items is required to measure all the attribute combinations specified in the Q_r matrix.

The expected examinee response patterns can now be generated using the Q_r matrix. An expected examinee is conceptualized as a hypothetical examinee who correctly answers the items that require cognitive attributes the examinee has mastered. The expected response matrix (E) is created, using Boolean inclusion, by comparing each row of the attribute pattern matrix (which is the transpose of the Q_r matrix) to the columns of the Q_r matrix. The expected response matrix is of order (j, i), where j is the number of expected examinees and i is the reduced number of items resulting from the constraints imposed by the hierarchy. The E matrix for the Ratio and Algebra hierarchy is shown below.

${\begin{bmatrix}0&0&0&0&0&0&0&0&0\\1&0&0&0&0&0&0&0&0\\1&1&0&0&0&0&0&0&0\\1&1&1&0&0&0&0&0&0\\1&0&0&1&0&0&0&0&0\\1&1&0&1&0&0&0&0&0\\1&1&1&1&0&0&0&0&0\\1&0&0&1&1&0&0&0&0\\1&1&0&1&1&0&0&0&0\\1&1&1&1&1&0&0&0&0\\1&0&0&1&1&1&0&0&0\\1&1&0&1&1&1&0&0&0\\1&1&1&1&1&1&0&0&0\\1&0&0&1&0&0&1&0&0\\1&1&0&1&0&0&1&0&0\\1&1&1&1&0&0&1&0&0\\1&0&0&1&1&0&1&0&0\\1&1&0&1&1&0&1&0&0\\1&1&1&1&1&0&1&0&0\\1&0&0&1&1&1&1&0&0\\1&1&0&1&1&1&1&0&0\\1&1&1&1&1&1&1&0&0\\1&0&0&1&0&0&1&1&0\\1&1&0&1&0&0&1&1&0\\1&1&1&1&0&0&1&1&0\\1&0&0&1&1&0&1&1&0\\1&1&0&1&1&0&1&1&0\\1&1&1&1&1&0&1&1&0\\1&0&0&1&1&1&1&1&0\\1&1&0&1&1&1&1&1&0\\1&1&1&1&1&1&1&1&0\\1&0&0&1&0&0&0&0&1\\1&1&0&1&0&0&0&0&1\\1&1&1&1&0&0&0&0&1\\1&0&0&1&1&0&0&0&1\\1&1&0&1&1&0&0&0&1\\1&1&1&1&1&0&0&0&1\\1&0&0&1&1&1&0&0&1\\1&1&0&1&1&1&0&0&1\\1&1&1&1&1&1&0&0&1\\1&0&0&1&0&0&1&0&1\\1&1&0&1&0&0&1&0&1\\1&1&1&1&0&0&1&0&1\\1&0&0&1&1&0&1&0&1\\1&1&0&1&1&0&1&0&1\\1&1&1&1&1&0&1&0&1\\1&0&0&1&1&1&1&0&1\\1&1&0&1&1&1&1&0&1\\1&1&1&1&1&1&1&0&1\\1&0&0&1&0&0&1&1&1\\1&1&0&1&0&0&1&1&1\\1&1&1&1&0&0&1&1&1\\1&0&0&1&1&0&1&1&1\\1&1&0&1&1&0&1&1&1\\1&1&1&1&1&0&1&1&1\\1&0&0&1&1&1&1&1&1\\1&1&0&1&1&1&1&1&1\\1&1&1&1&1&1&1&1&1\end{bmatrix}}$

If the cognitive model is true, then 58 unique item response patterns should be produced by examinees who write these cognitively based items. This count includes a row of 0s (the first row of the E matrix), which is usually added to represent an examinee who has not mastered any attributes. To summarize, if the attribute pattern of the examinee contains the attributes required by the item, then the examinee is expected to answer the item correctly. However, if the examinee's attribute pattern is missing one or more of the cognitive attributes required by the item, the examinee is not expected to answer the item correctly.
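The Boolean inclusion rule summarized above can be illustrated with a short sketch. It assumes nine items, one per column of the Q_r matrix shown earlier, and it obtains the hierarchy-consistent attribute patterns by brute-force enumeration rather than by any procedure taken from the source. The reachability step uses Warshall's transitive-closure algorithm as a stand-in for the Boolean power of (A + I), and all function names are hypothetical.

```python
from itertools import product

EDGES = [(1, 2), (2, 3), (1, 4), (4, 5), (5, 6), (4, 7), (7, 8), (4, 9)]
K = 9


def reachability(edges, k):
    """Transitive closure with self-loops (Warshall's algorithm)."""
    r = [[int(i == j) for j in range(k)] for i in range(k)]
    for source, target in edges:
        r[source - 1][target - 1] = 1
    for m in range(k):
        for i in range(k):
            for j in range(k):
                if r[i][m] and r[m][j]:
                    r[i][j] = 1
    return r


R = reachability(EDGES, K)
# Column j of R lists the attributes required by item j (0-based indices).
REQUIRED = [{i for i in range(K) if R[i][j]} for j in range(K)]


def is_consistent(pattern):
    """A pattern fits the hierarchy if every mastered attribute has all of
    its prerequisites mastered as well."""
    mastered = {i for i, value in enumerate(pattern) if value}
    return all(REQUIRED[j] <= mastered for j in mastered)


def expected_responses(pattern):
    """Boolean inclusion: an item is answered correctly only if every
    attribute it requires is contained in the attribute pattern."""
    mastered = {i for i, value in enumerate(pattern) if value}
    return tuple(int(required <= mastered) for required in REQUIRED)


patterns = [p for p in product((0, 1), repeat=K) if is_consistent(p)]
E = [expected_responses(p) for p in patterns]
print(len(E))  # 58, including the all-zero row
```

The rows come out in a different order from the E matrix above, but the set of 58 patterns is the same.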

Test development

Role of the cognitive model in item development

The cognitive model in the form of an attribute hierarchy has direct implications for item development. Items that measure each attribute must maintain the hierarchical ordering of the attributes as specified by the cognitive model while also measuring increasingly complex cognitive processes. These item types may be in either multiple choice or constructed response format. To date, the AHM has been used with items that are scored dichotomously where 1 corresponds to a correct answer and 0 corresponds to an incorrect answer. Therefore, a student's test performance can be summarized by a vector of correct and incorrect responses in the form of 1's and 0's. This vector then serves as the input for the psychometric analysis where the examinee's attribute mastery is estimated.

Approach to item development

The attributes in the cognitive model are specified at a fine grain size in order to yield a detailed cognitive skill profile of the examinee's test performance. As a result, many items must be created to measure each attribute in the hierarchy. For computer-based tests, automated item generation (AIG) is a promising method for generating multiple items "on the fly" that have similar form and psychometric properties using a common template.^[10]

Example of items aligned to the attributes in a hierarchy

Referring back to the pictorial representation of the Ratio and Algebra hierarchy, an item can be constructed to measure the skills described in each of the attributes. For example, attribute A1 includes the most basic arithmetic operation skills, such as addition, subtraction, multiplication, and division of numbers. An item that measures this skill could be the following: examinees are presented with the equation $4(t+u)+3=19$ and asked to solve for (t + u). For this item, examinees need to subtract 3 from 19 and then divide 16 by 4, giving t + u = 4.

Attribute A2 represents knowledge about the property of factors. An example of an item that measures this attribute is "If (p + 1)(t – 3) = 0 and p is positive, what is the value of t?" The examinee must know the property that the value of at least one factor must be zero if the product of multiple factors is zero. Once this property is recognized, the examinee can see that because p is positive, (p + 1) cannot be zero, so (t – 3) must be zero to make the value of the whole expression zero, which yields the value of 3 for t. To answer this item correctly, the examinee should have mastered both attributes A1 and A2.

Attribute A3 represents not only knowledge of factoring (i.e., attribute A2), but also the skills of applying the rules of factoring. An example of an item that measures this attribute is "${\mbox{If }}{\frac {x+y}{a-b}}={\frac {2}{3}}{\mbox{ , then }}{\frac {9x+9y}{10a-10b}}=?$". Only after the examinee factors the second expression into a multiple of the first expression does the value of the second expression become apparent. To answer this item correctly, the examinee should have mastered attributes A1, A2, and A3.^[9]
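The factoring step that the attribute A3 item depends on can be written out explicitly. The final value is worked here from the item as stated and is not quoted from the source:

${\frac {9x+9y}{10a-10b}}={\frac {9(x+y)}{10(a-b)}}={\frac {9}{10}}\cdot {\frac {x+y}{a-b}}={\frac {9}{10}}\cdot {\frac {2}{3}}={\frac {3}{5}}$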

Psychometric analysis

During this stage, statistical pattern recognition is used to identify the attribute combinations that the examinee is likely to possess based on the observed examinee response relative to the expected response patterns derived from the cognitive model.

Evaluating model-data fit

Prior to any further analysis, it must be established that the specified cognitive model accurately reflects the cognitive attributes used by the examinees. It is expected that there will be discrepancies, or slips, between the observed response patterns generated by a large group of examinees and the expected response patterns. The fit of the cognitive model relative to the observed response patterns obtained from examinees can be evaluated using the Hierarchical Consistency Index (HCI).^[9] The HCI evaluates the degree to which the observed response patterns are consistent with the attribute hierarchy. The HCI for examinee i is given by:

$HCI_{i}=1-{\frac {2\sum _{j=1}^{J}\sum _{g\in S_{j}}X_{i_{j}}(1-X_{i_{g}})}{N_{c_{i}}}}$

where J is the total number of items, $X_{i_{j}}$ is examinee i's score (i.e., 1 or 0) on item j, $S_{j}$ includes the items that require a subset of the attributes of item j, and $N_{c_{i}}$ is the total number of comparisons for the items answered correctly by examinee i. With the factor of 2 in the numerator, the index equals −1 when every comparison is a misfit.

The values of the HCI range from −1 to +1. Values closer to 1 indicate a good fit between the observed response pattern and the expected examinee response patterns generated from the hierarchy. Conversely, low HCI values indicate a large discrepancy between the observed examinee response patterns and the expected examinee response patterns generated from the hierarchy. HCI values above 0.70 indicate good model-data fit.
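A minimal sketch of the HCI calculation is given below. It assumes nine items whose required attributes follow the columns of the Q_r matrix shown earlier, it excludes item j itself from the set S_j, and it includes the factor of 2 in the numerator that the stated range of −1 to +1 implies. The function name `hci` and the choice to return `None` when no comparison is possible are assumptions made for this illustration.

```python
# Attributes required by each of the nine items (columns of the Q_r matrix).
ITEM_ATTRIBUTES = [
    {1}, {1, 2}, {1, 2, 3},
    {1, 4}, {1, 4, 5}, {1, 4, 5, 6},
    {1, 4, 7}, {1, 4, 7, 8}, {1, 4, 9},
]


def hci(responses, item_attributes):
    """Hierarchical Consistency Index for one examinee.

    responses: list of 0/1 item scores.
    item_attributes: list of sets, the attributes each item requires.
    """
    misfits = 0
    comparisons = 0
    for j, score_j in enumerate(responses):
        if score_j != 1:
            continue  # only correctly answered items generate comparisons
        for g, score_g in enumerate(responses):
            # S_j: other items whose attributes are a subset of item j's.
            if g != j and item_attributes[g] <= item_attributes[j]:
                comparisons += 1
                misfits += 1 - score_g  # item g should also be correct
    if comparisons == 0:
        return None  # assumption: the index is undefined without comparisons
    return 1 - 2 * misfits / comparisons


print(hci([1, 1, 0, 1, 1, 0, 0, 0, 0], ITEM_ATTRIBUTES))  # 1.0
print(hci([1, 0, 1, 0, 0, 0, 0, 0, 0], ITEM_ATTRIBUTES))  # 0.0
print(hci([0, 0, 1, 0, 0, 0, 0, 0, 0], ITEM_ATTRIBUTES))  # -1.0
```

The first response pattern is fully consistent with the hierarchy, the second answers item 3 correctly while missing one of its two prerequisite items, and the third answers item 3 correctly while missing both.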

Why is model-data fit important?

Obtaining good model-data fit provides additional evidence to validate the specified attribute hierarchy, which is required before proceeding with the determination of an examinee’s attribute mastery. If the data are not shown to fit the model, then various reasons may account for the large number of discrepancies, including a misspecification of the attributes, incorrect ordering of attributes within the hierarchy, items not measuring the specified attributes, and/or a model that is not reflective of the cognitive processes used by a given sample of examinees. Therefore, the cognitive model should be correctly defined and closely aligned with the observed response patterns in order to provide a substantive framework for making inferences about a specific group of examinees’ knowledge and skills.^[11]

Estimating attribute probabilities

Once we establish that the model fits the data, the attribute probabilities can be calculated. The use of attribute probabilities is important in the psychometric analyses of the AHM because these probabilities provide examinees with specific information about their attribute-level performance as part of the diagnostic reporting process. To estimate the probability that examinees possess specific attributes, given their observed item response pattern, an artificial neural network approach is used.

Brief description of a neural network

The neural network is a type of parallel-processing architecture that transforms any stimulus received by the input units (i.e., stimulus units) into a signal for the output units (i.e., response units) through a series of mid-level hidden units. Each unit in the input layer is connected to each unit in the hidden layer, and each unit in the hidden layer is in turn connected to each unit in the output layer.

Generally speaking, a neural network requires the following steps. To begin, each cell of the input layer receives a value (0 or 1) corresponding to the response values in the exemplar vector. Each input cell then passes the value it receives to every hidden cell. Each hidden cell forms a linearly weighted sum of its inputs, transforms the sum using the logistic function, and passes the result to every output cell. Each output cell, in turn, forms a linearly weighted sum of its inputs from the hidden cells, transforms it using the logistic function, and outputs the result. Because the result is scaled using the logistic transformation, the output values range from 0 to 1. The result can be interpreted as the probability that the correct or target value for each output will have a value of 1.
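The forward computation described above can be sketched as follows. The network size (three inputs, two hidden cells, three outputs) and the weight values are hypothetical and chosen only to keep the example small. Bias terms are omitted because the description mentions only weighted sums.

```python
import math


def logistic(z):
    """Logistic transformation; maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def layer_output(inputs, weights):
    """Each cell forms a linearly weighted sum of its inputs and applies the
    logistic function. weights[c] holds the weights feeding cell c."""
    return [logistic(sum(w * x for w, x in zip(cell_weights, inputs)))
            for cell_weights in weights]


def forward(response_pattern, hidden_weights, output_weights):
    """Pass a 0/1 response pattern through the hidden and output layers."""
    hidden = layer_output(response_pattern, hidden_weights)
    return layer_output(hidden, output_weights)


# Hypothetical weights: 3 inputs -> 2 hidden cells -> 3 outputs.
HIDDEN_WEIGHTS = [[0.8, -0.4, 0.3], [-0.6, 0.9, 0.5]]
OUTPUT_WEIGHTS = [[1.2, -0.7], [0.4, 0.6], [-0.9, 1.1]]

probabilities = forward([1, 0, 1], HIDDEN_WEIGHTS, OUTPUT_WEIGHTS)
print([round(p, 3) for p in probabilities])
```

Every output lies strictly between 0 and 1, which is what allows it to be read as a probability.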

The output targets in the response units (i.e., the examinee attributes) are compared to the pattern associated with each stimulus input or exemplar (i.e., the expected response patterns). The solution produced initially with the stimulus and association connection weights is likely to be discrepant resulting in a relatively large error. However, this discrepant result can be used to modify the connection weights thereby leading to a more accurate solution and a smaller error term. One popular approach for approximating the weights so the error term is minimized is with a learning algorithm called the generalized delta rule that is incorporated in a training procedure called back propagation of error.^[12]

Specification of the neural network

Calculation of attribute probabilities begins by presenting the neural network with the expected examinee response patterns generated in Stage 1, together with their associated attribute patterns, which are derived from the cognitive model (i.e., the transpose of the Q_r matrix), until the network learns each association. The result is a set of weight matrices that are used to calculate the probability that an examinee has mastered a particular cognitive attribute based on their observed response pattern.^[9] An attribute probability close to 1 would indicate that the examinee has likely mastered the cognitive attribute, whereas a probability close to 0 would indicate that the examinee has likely not mastered the cognitive attribute.
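The training step can be illustrated with a deliberately small sketch. It assumes a three-attribute linear hierarchy (A1, then A2, then A3) with one item per attribute, so that each expected response pattern coincides with its attribute pattern. It trains a network with four hidden cells by plain back propagation of squared error, without bias terms. The hierarchy, network size, learning rate, and number of epochs are all hypothetical choices for this example and are not taken from the source.

```python
import math
import random

# (expected response pattern, associated attribute pattern) exemplars.
DATA = [((0, 0, 0), (0, 0, 0)), ((1, 0, 0), (1, 0, 0)),
        ((1, 1, 0), (1, 1, 0)), ((1, 1, 1), (1, 1, 1))]


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_h, w_o):
    """Return the hidden activations and the output probabilities."""
    h = [logistic(sum(w * v for w, v in zip(row, x))) for row in w_h]
    o = [logistic(sum(w * v for w, v in zip(row, h))) for row in w_o]
    return h, o


def total_error(data, w_h, w_o):
    """Sum of squared differences between targets and network outputs."""
    return sum((t - y) ** 2
               for x, target in data
               for t, y in zip(target, forward(x, w_h, w_o)[1]))


def train(data, n_hidden=4, rate=0.5, epochs=2000, seed=0):
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    n_in, n_out = len(data[0][0]), len(data[0][1])
    w_h = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
           for _ in range(n_hidden)]
    w_o = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
           for _ in range(n_out)]
    for _ in range(epochs):
        for x, target in data:
            h, o = forward(x, w_h, w_o)
            # Output deltas: error times the logistic derivative y(1 - y).
            d_o = [(t - y) * y * (1 - y) for t, y in zip(target, o)]
            # Hidden deltas: output deltas propagated back through w_o.
            d_h = [h[j] * (1 - h[j]) *
                   sum(d_o[k] * w_o[k][j] for k in range(n_out))
                   for j in range(n_hidden)]
            for k in range(n_out):
                for j in range(n_hidden):
                    w_o[k][j] += rate * d_o[k] * h[j]
            for j in range(n_hidden):
                for i in range(n_in):
                    w_h[j][i] += rate * d_h[j] * x[i]
    return w_h, w_o


initial = train(DATA, epochs=0)  # untrained weights from the same seed
trained = train(DATA)
print(total_error(DATA, *initial), total_error(DATA, *trained))
```

Once trained, the weight matrices can be applied to an observed response pattern with `forward` to obtain one probability per attribute.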

Reporting the results

A Sample Diagnostic Score Report for an Examinee Who Mastered Attributes A1, A4, A5, and A6

The importance of the reporting process

Score reporting serves a critical function as the interface between the test developer and a diverse audience of test users. A score report must include detailed information, which is often technical in nature, about the meanings and possible interpretations of results that users can make. The Standards for Educational and Psychological Testing^[13] clearly define the role of test developers in the reporting process. Standard 5.10 states: When test score information is released to students, parents, legal representatives, teachers, clients, or the media, those responsible for testing programs should provide appropriate interpretations. The interpretations should describe in simple language what the test covers, what the scores mean, and how the scores will be used.

Reporting cognitive diagnostic results using the AHM

A key advantage of the AHM is that it supports individualized diagnostic score reporting using the attribute probability results. The score reports produced by the AHM contain not only a total score but also detailed information about which cognitive attributes were measured by the test and the degree to which the examinee has mastered these cognitive attributes. This diagnostic information is directly linked to the attribute descriptions, individualized for each student, and easily presented. Hence, these reports provide specific diagnostic feedback which may direct instructional decisions. To demonstrate how the AHM can be used to report test scores and provide diagnostic feedback, a sample report is presented.^[9] In the sample report, the examinee mastered attributes A1 and A4 to A6. Three performance levels were selected for reporting attribute mastery: non-mastery (attribute probability value between 0.00 and 0.35), partial mastery (attribute probability value between 0.36 and 0.70), and mastery (attribute probability value between 0.71 and 1.00). The results in the score report reveal that the examinee has clearly mastered four attributes: A1 (basic arithmetic operations), A4 (skills required for substituting values into algebraic expressions), A5 (the skills of mapping a graph of a familiar function with its corresponding function), and A6 (abstract properties of functions). The examinee has not mastered the skills associated with the remaining five attributes.
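The three reporting levels can be expressed as a small lookup. The sketch below assumes that a probability falling between two stated bands (for example 0.355) is assigned to the higher band. The function name and the example probabilities are hypothetical and are not values from the sample report.

```python
def mastery_level(probability):
    """Map an attribute probability to one of the three reporting levels."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("attribute probability must lie between 0 and 1")
    if probability <= 0.35:
        return "non-mastery"
    if probability <= 0.70:
        return "partial mastery"
    return "mastery"


# Hypothetical attribute probabilities for one examinee.
example = {"A1": 0.97, "A2": 0.12, "A4": 0.88, "A7": 0.52}
for attribute, p in example.items():
    print(attribute, mastery_level(p))
```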

Implications of AHM for cognitive diagnostic assessment

Integration of assessment, instruction, and learning

Effects of the results of a Cognitive Diagnostic Assessment

The rise in popularity of cognitive diagnostic assessments can be traced to two sources: assessment developers and assessment users.^[14] Assessment developers see great potential for cognitive diagnostic assessments to inform teaching and learning by changing the way current assessments are designed. Assessment developers also argue that to maximize the educational benefits from assessments, curriculum, instruction, and assessment design should be aligned and integrated.

Assessment users, including teachers and other educational stakeholders, are increasingly demanding relevant results from educational assessments. This requires assessments to be aligned with classroom practice to be of maximum instructional value.

As a form of cognitive diagnostic assessment, the AHM addresses the path between curriculum and assessment design by identifying the knowledge, skills, and processes actually used by examinees to solve problems in a given domain. These cognitive attributes, organized into a cognitive model, become not only a representation of the construct of interest but also the cognitive test blueprint. Items can then be constructed to systematically measure each attribute combination within the cognitive model.

The AHM also addresses the path between assessment design and instruction by providing specific, detailed feedback about an examinee's performance in terms of the cognitive attributes mastered. This cognitive diagnostic feedback is provided to students and teachers in the form of a score report. The teacher can use the skills mastery profile, along with adjunct information such as exemplar test items, to focus instructional efforts on areas where the student requires additional assistance. Assessment results can also give the teacher feedback on how effectively instruction promotes the learning objectives.

The AHM is a promising method for cognitive diagnostic assessment. A principled test design approach that integrates cognition into test development can promote stronger inferences about how students actually think and solve problems. With this knowledge, students can be provided with additional information that can guide their learning, leading to improved performance on future educational assessments and problem-solving tasks.

External links

Centre for Research in Applied Measurement and Evaluation

References

[1] Leighton, J. P., Gierl, M. J., & Hunka, S. M. (2004). The attribute hierarchy model for cognitive assessment: A variation on Tatsuoka's rule-space approach. Journal of Educational Measurement, 41, 205–237.
[2] Tatsuoka, K. K. (1983). Rule space: An approach for dealing with misconceptions based on item response theory. Journal of Educational Measurement, 20, 345–354.
[3] Kuhn, D. (2001). Why development does (and does not) occur: Evidence from the domain of inductive reasoning. In J. L. McClelland & R. Siegler (Eds.), Mechanisms of cognitive development: Behavioral and neural perspectives (pp. 221–249). Hillsdale, NJ: Erlbaum.
[4] Gierl, M. J. (2007). Making diagnostic inferences about cognitive attributes using the rule-space model and attribute hierarchy method. Journal of Educational Measurement, 44, 325–340.
[5] Gierl, M. J., & Zhou, J. (2008). Computer adaptive-attribute testing: A new approach to cognitive diagnostic assessment. To appear in the Special Issue of Zeitschrift für Psychologie / Journal of Psychology (Spring 2008), Adaptive Models of Psychological Testing, Wim J. van der Linden (Guest Editor).
[6] Leighton, J. P., & Gierl, M. J. (2007). Defining and evaluating models of cognition used in educational measurement to make inferences about examinees' thinking processes. Educational Measurement: Issues and Practice, 26, 3–16.
[7] Ericsson, K. A., & Simon, H. A. (1993). Protocol analysis: Verbal reports as data. Cambridge, MA: MIT Press.
[8] Leighton, J. P. (2004). Avoiding misconceptions, misuse, and missed opportunities: The collection of verbal reports in educational achievement testing. Educational Measurement: Issues and Practice, 23, 1–10.
[9] Gierl, M. J., Wang, C., & Zhou, J. (2008). Using the attribute hierarchy method to make diagnostic inferences about examinees’ cognitive skills in algebra on the SAT. Journal of Technology, Learning, and Assessment, 6 (6). Retrieved 24 October 2008, from http://www.jtla.org.
[10] Bejar, I. I., Lawless, R. R., Morley, M. E., Wagner, M. E., Bennett, R. E., & Revuelta, J. (2003). A feasibility study of on-the-fly item generation in adaptive testing. Journal of Technology, Learning, and Assessment, 2 (3). Retrieved 24 October 2008, from http://www.jtla.org.
[11] Rumelhart, D. E., Hinton, G. E., & Williams, R. J. (1986b). Parallel distributed processing (Vol. 1). Cambridge, MA: MIT Press.
[12] Rumelhart, D. E., Hinton, G. E., & Williams, R. J. (1986a). Learning representations by back-propagating errors. Nature, 323, 533–536.
[13] American Educational Research Association (AERA), American Psychological Association, National Council on Measurement in Education. (1999). Standards for Educational and Psychological Testing. Washington, D.C.: AERA.
[14] Huff, K., & Goodman, D. P. (2007). The demand for cognitive diagnostic assessment. In J. P. Leighton & M. J. Gierl (Eds.), Cognitive diagnostic assessment for education: Theory and applications (pp. 19–60). Cambridge, UK: Cambridge University Press.

[Leighton-1] Leighton, J. P., Gierl, M. J., & Hunka, S. M. (2004). The attribute hierarchy model for cognitive assessment: A variation on Tatsuoka's rule-space approach. Journal of Educational Measurement, 41, 205–237.

[2] Tatsuoka, K. K. (1983). Rule space: An approach for dealing with misconceptions based on item response theory. Journal of Educational Measurement, 20, 345–354.

[3] Kuhn, D. (2001). Why development does (and does not) occur: Evidence from the domain of inductive reasoning. In J. L. McClelland & R. Siegler (Eds.), Mechanisms of cognitive development: Behavioral and neural perspectives (pp. 221–249). Hillsdale, NJ: Erlbaum.

[4] Gierl, M. J. (2007). Making diagnostic inferences about cognitive attributes using the rule-space model and attribute hierarchy method. Journal of Educational Measurement, 44, 325–340.

[5] Gierl, M. J., & Zhou, J. (2008). Computer adaptive-attribute testing: A new approach to cognitive diagnostic assessment. To appear in the Special Issue of Zeitschift fur Psychologie-Journal of Psychology, (Spring, 2008), Adaptive Models of Psychological Testing, Wim J. van der Linden (Guest Editor).

[6] Leighton, J. P., & Gierl, M. J. (2007). Defining and evaluating models of cognition used in educational measurement to make inferences about examinees' thinking processes. Educational Measurement: Issues and Practice, 26, 3–16.

[7] Ericsson, K. A. & Simon, H. A. (1993). Protocol analysis: Verbal reports as data. Cambridge, MA: MIT Press.

[8] Leighton, J.P. (2004). Avoiding misconceptions, misuse, and missed opportunities: The collection of verbal reports in educational achievement testing. Educational Measurement: Issues and Practice, 23, 1–10.

[jtla.org-9] Gierl, M. J., Wang, C., & Zhou, J. (2008). Using the attribute hierarchy method to make diagnostic inferences about examinees’ cognitive skills in algebra on the SAT. Journal of Technology, Learning, and Assessment, 6 (6). Retrieved 24 October 2008, from http://www.jtla.org.

[10] Bejar, I. I., Lawless, R. R., Morley, M. E., Wagner, M. E., Bennett, R. E., & Revuelta, J. (2003). A feasibility study of on-the-fly item generation in adaptive testing. Journal of Technology, Learning, and Assessment, 2 (3). Retrieved 24 October 2008, from http://www.jtla.org.

[11] Rumelhart, D.E., Hinton, G.E., & Williams, R.J. (1986b). Parallel distributed processing (Vol. 1). Cambridge, MA: MIT Press

[12] Rumelhart, D.E., Hinton, G.E., & Williams, R.J. (1986a). Learning representations by back-propagating errors. Nature, 323, 533–536

[13] American Educational Research Association (AERA), American Psychological Association, National Council on Measurement in Education. (1999). Standards for Educational and Psychological Testing. Washington, D.C.: AERA.

[14] Huff, K., & Goodman, D. P. (2007). The demand for cognitive diagnostic assessment. In J. P. Leighton & M. J. Gierl (Eds.), Cognitive diagnostic assessment for education: Theory and applications (pp. 19–60). Cambridge, UK: Cambridge University Press.

[1]

[2]

[3]

[4]

[5]

[6]

[7]

[8]

[9]

[10]

[11]

[12]

[13]

[14]