Cumulative accuracy profile

From Wikipedia, the free encyclopedia

A cumulative accuracy profile (CAP) is a concept used in data science to visualize the discriminatory power of a model. The CAP of a model plots the cumulative number of positive outcomes along the y-axis against the corresponding cumulative number of a classifying parameter along the x-axis. The resulting plot is called a CAP curve.[1] The CAP is distinct from the receiver operating characteristic (ROC) curve, which plots the true-positive rate against the false-positive rate.

CAPs are used in robustness evaluations of classification models.
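As an illustration of how a CAP curve can be built, the following is a minimal Python sketch. It assumes the classifying parameter is a numeric score where higher means "more likely positive", and it normalizes both axes to fractions rather than raw counts. The function name `cap_curve` and the sample data are hypothetical and not taken from any cited source.

```python
import numpy as np

def cap_curve(scores, labels):
    """Return (x, y) points of a CAP curve.

    x: cumulative fraction of cases, ordered from highest to lowest score.
    y: cumulative fraction of all positive outcomes captured so far.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    order = np.argsort(-scores, kind="stable")  # highest score first
    hits = np.cumsum(labels[order])             # positives found so far
    n = len(labels)
    x = np.concatenate(([0.0], np.arange(1, n + 1) / n))
    y = np.concatenate(([0.0], hits / labels.sum()))
    return x, y

# Hypothetical data: 10 cases, 4 of them positive (label 1).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]
x, y = cap_curve(scores, labels)
```

Plotting `y` against `x` gives the model's CAP curve; it starts at (0, 0) and ends at (1, 1).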

Analyzing a CAP

A cumulative accuracy profile can be used to evaluate a model by comparing its curve to both the 'perfect' curve and the random curve. A good model will have a CAP between the perfect and random curves; the closer a model is to the perfect CAP, the better it is.

The accuracy ratio (AR) is defined as the ratio of the area between the model CAP and the random CAP to the area between the perfect CAP and the random CAP.[2] In a successful model, the AR lies between zero and one, and the higher the value, the stronger the model.
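The ratio of areas described above can be sketched in Python as follows. This is a sketch under stated assumptions, not a reference implementation: both axes are normalized to fractions, the random CAP is taken to be the diagonal (area 0.5), the perfect CAP is taken to rise linearly until all positives are captured and then stay flat, and areas are approximated with the trapezoidal rule. The names `accuracy_ratio` and `_area` and the sample data are hypothetical.

```python
import numpy as np

def _area(x, y):
    """Area under a piecewise-linear curve (trapezoidal rule)."""
    return float(np.sum((x[1:] - x[:-1]) * (y[1:] + y[:-1]) / 2.0))

def accuracy_ratio(scores, labels):
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    n, n_pos = len(labels), int(labels.sum())
    x = np.arange(n + 1) / n

    # Model CAP: cases ranked from highest to lowest score.
    ranked = labels[np.argsort(-scores, kind="stable")]
    model = np.concatenate(([0.0], np.cumsum(ranked) / n_pos))

    # Perfect CAP (assumed shape): every positive is ranked first.
    perfect = np.minimum(np.arange(n + 1) / n_pos, 1.0)

    random_area = 0.5  # area under the diagonal
    return (_area(x, model) - random_area) / (_area(x, perfect) - random_area)

# Hypothetical data: 10 cases, 4 of them positive.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]
ar = accuracy_ratio(scores, labels)
print(round(ar, 3))  # 0.667
```

Under these assumptions, a ranking that places every positive first gives an AR of 1, and a ranking that places every positive last gives a negative AR.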

The cumulative number of positive outcomes at the 50% point of the classifying parameter is another indication of a model's strength. For a successful model, this value should lie between 50% and 100% of the maximum, with a higher percentage for stronger models. In rare cases, the accuracy ratio can be negative, which means the model performs worse than the random CAP.
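The 50%-to-100% check can be illustrated with a short Python sketch. It assumes the value in question is the share of all positive outcomes found in the top half of cases when ranked by score; that reading, the function name `captured_at`, and the sample data are assumptions made for illustration.

```python
import numpy as np

def captured_at(scores, labels, fraction=0.5):
    """Share of all positives found in the top `fraction` of cases by score."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    order = np.argsort(-scores, kind="stable")   # highest score first
    k = int(np.floor(fraction * len(labels)))    # number of cases inspected
    return float(labels[order][:k].sum() / labels.sum())

# Hypothetical data: 10 cases, 4 of them positive.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]
share = captured_at(scores, labels)
print(share)  # 0.75
```

Here the top five cases contain three of the four positives, so the value is 75%, inside the 50% to 100% band that the text associates with a successful model.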


The cumulative accuracy profile (CAP) and the ROC curve are both commonly used by banks and regulators to analyze the discriminatory ability of rating systems that evaluate credit risks.[3][4] The CAP is also used by instructional design engineers to assess, retrain and rebuild instructional design models used in constructing courses, and by professors and school authorities to improve decision-making and manage educational resources more efficiently.


  2. Calabrese, Raffaella (2009), The validation of Credit Rating and Scoring Models (PDF), Swiss Statistics Meeting, Geneva, Switzerland
  3. Engelmann, Bernd; Hayden, Evelyn; Tasche, Dirk (2003), "Measuring the Discriminative Power of Rating Systems", Discussion Paper, Series 2: Banking and Financial Supervision (1)
  4. Sobehart, Jorge; Keenan, Sean; Stein, Roger (2000-05-15), "Validation methodologies for default risk models" (PDF), Moody's Risk Management Services