ROC curves are frequently used to illustrate graphically the trade-off between clinical sensitivity and specificity for every possible cut-off for a test or a combination of tests. In addition, the area under the ROC curve gives an idea of the benefit of using the test(s) in question.
ROC curves are used in clinical biochemistry to choose the most appropriate cut-off for a test. The best cut-off has the highest true positive rate together with the lowest false positive rate.
As the area under an ROC curve measures the usefulness of a test in general (a greater area means a more useful test), the areas under ROC curves are used to compare the usefulness of tests.
The term ROC stands for Receiver Operating Characteristic.
ROC curves were first employed in the study of discriminator systems for the detection of radio signals in the presence of noise in the 1940s, following the attack on Pearl Harbor.
The initial research was motivated by the desire to determine how the US RADAR “receiver operators” had missed the Japanese aircraft.
Now ROC curves are frequently used to show the connection between clinical sensitivity and specificity for every possible cut-off for a test or a combination of tests. In addition, the area under the ROC curve gives an idea about the benefit of using the test(s) in question.
HOW TO MAKE AN ROC CURVE
To make an ROC curve you have to be familiar with the concepts of true positive, true negative, false positive and false negative. These concepts are used when you compare the results of a test with the clinical truth, which is established by the use of diagnostic procedures not involving the test in question.
Before you make a table like Table 1, you have to decide the cut-off for distinguishing healthy from sick. The cut-off determines the clinical sensitivity (the fraction of true positives among all those with disease) and the specificity (the fraction of true negatives among all those without disease). When you change the cut-off, you will get different numbers of true positives, true negatives, false positives and false negatives, but the total number with disease and the total number without disease stay the same. Thus, changing the cut-off increases sensitivity or specificity at the expense of lowering the other parameter [1].
TABLE 1: Comparing a Method With the Clinical Truth
Figure 1 and Figure 2 demonstrate the trade-off between sensitivity and specificity. When 400 μg/L is chosen as the analyte concentration cut-off, the sensitivity is 100% and the specificity is 54%. When the cut-off is increased to 500 μg/L, the sensitivity decreases to 92% and the specificity increases to 79%.
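The trade-off above can be sketched in code. The following is a minimal Python illustration with made-up concentrations and diagnoses (not the article's dataset); `sens_spec` is a hypothetical helper name:

```python
# Illustrative sketch (hypothetical data, not the article's): sensitivity and
# specificity at a chosen cut-off, where values >= cut-off count as positive.

def sens_spec(values, is_sick, cutoff):
    """Return (sensitivity, specificity) for a given cut-off."""
    tp = sum(1 for v, s in zip(values, is_sick) if s and v >= cutoff)
    fn = sum(1 for v, s in zip(values, is_sick) if s and v < cutoff)
    tn = sum(1 for v, s in zip(values, is_sick) if not s and v < cutoff)
    fp = sum(1 for v, s in zip(values, is_sick) if not s and v >= cutoff)
    sensitivity = tp / (tp + fn)   # fraction of the sick correctly flagged
    specificity = tn / (tn + fp)   # fraction of the healthy correctly cleared
    return sensitivity, specificity

# Hypothetical analyte concentrations (ug/L) and diagnoses (True = sick):
values  = [350, 420, 480, 510, 590, 640, 700, 820]
is_sick = [False, False, False, True, False, True, True, True]

print(sens_spec(values, is_sick, 500))  # → (1.0, 0.75)
print(sens_spec(values, is_sick, 600))  # → (0.75, 1.0)
```

Raising the cut-off from 500 to 600 in this toy dataset trades sensitivity for specificity, mirroring the behavior of Figures 1 and 2.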
FIGURE 1: Cut-off = 400 μg/L.
FIGURE 2: Cut-off = 500 μg/L.
FIGURE 3: First point on the ROC curve.
FIGURE 4: Second point on the ROC curve.
FIGURE 5: Third point on the ROC curve.
FIGURE 6: Points #50 and #100 on the ROC curve.
FIGURE 7: The finalized ROC curve.
FIGURE 8: Area under ROC curve.
FIGURE 9: No overlap between healthy and sick.
FIGURE 10: ROC curve for a test with no overlap between healthy and sick.
FIGURE 11: Complete overlap between healthy and sick.
FIGURE 12: ROC curve for a test with complete overlap between healthy and sick.
An ROC curve shows the relationship between clinical sensitivity and specificity for every possible cut-off. The ROC curve is a graph with:
- The x-axis showing 1 – specificity (= false positive fraction = FP/(FP+TN))
- The y-axis showing sensitivity (= true positive fraction = TP/(TP+FN))
Thus every point on the ROC curve represents a chosen cut-off, even though the cut-off itself is not displayed. What you can see is the true positive fraction and the false positive fraction that you will get when you choose that cut-off.
To make an ROC curve from your data you start by ranking all the values and linking each value to the diagnosis – sick or healthy.
In the example in Table 2, 159 healthy people and 81 sick people are tested. The results and the diagnoses (sick: Y or N) are listed and ranked by parameter concentration.
TABLE 2: Ranked Data With Diagnosis (Yes/No)
TABLE 3: Ranked Data With Calculated True Positive and False Positive Rates for a Scenario Where the Specific Value is Used as Cut-Off
For each concentration, it is calculated what the clinical sensitivity (true positive rate) and 1 – specificity (false positive rate) of the assay would be if any result equal to or above that value were considered positive.
Now the curve is constructed by plotting the data pairs for sensitivity and (1 – specificity):
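The procedure above can be sketched in a few lines of Python. This is an illustrative sketch on made-up data (not the 240-person dataset of Table 2); `roc_points` is a hypothetical helper name:

```python
# Each observed value is tried as a cut-off ("result >= cut-off" = positive),
# and each cut-off yields one (1 - specificity, sensitivity) point.

def roc_points(values, is_sick):
    """Return the ROC curve as (x, y) = (false positive fraction,
    true positive fraction) pairs, one per candidate cut-off."""
    n_sick = sum(is_sick)
    n_healthy = len(values) - n_sick
    points = []
    for cutoff in sorted(set(values), reverse=True):
        tp = sum(1 for v, s in zip(values, is_sick) if s and v >= cutoff)
        fp = sum(1 for v, s in zip(values, is_sick) if not s and v >= cutoff)
        points.append((fp / n_healthy, tp / n_sick))
    return [(0.0, 0.0)] + points  # start the curve in the origin

# Hypothetical concentrations (ug/L) and diagnoses (True = sick):
values  = [350, 420, 480, 510, 590, 640, 700, 820]
is_sick = [False, False, False, True, False, True, True, True]

for x, y in roc_points(values, is_sick):
    print(x, y)
```

Scanning the cut-offs from highest to lowest walks the curve from (0, 0) to (1, 1), each step moving up (a sick person becomes a true positive) or right (a healthy person becomes a false positive).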
AREA UNDER ROC CURVE
The area under the ROC curve (AUROC) of a test can be used as a criterion to measure the test’s discriminative ability, i.e. how good the test is in a given clinical situation.
Various computer programs can calculate the area under the ROC curve automatically, and several methods exist. An easy one is the trapezoid method: the areas of the trapezoids formed between the x-axis and the line segments connecting adjacent data points are summed:

AUROC = Σ (x_(i+1) − x_i) × (y_i + y_(i+1)) / 2

where (x_i, y_i) and (x_(i+1), y_(i+1)) are adjacent points on the ROC curve, with x = 1 − specificity and y = sensitivity.
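The trapezoid method can be sketched directly in Python (an illustrative sketch; `auroc` is a hypothetical function name, not from the article):

```python
# Trapezoid method: sum the areas of the trapezoids between the x-axis and
# the line segments connecting consecutive ROC points.

def auroc(points):
    """points: ROC curve as (x, y) = (1 - specificity, sensitivity) pairs,
    ordered by x from (0, 0) to (1, 1)."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2  # trapezoid: width * mean height
    return area

# Two limiting cases, discussed in the following sections:
print(auroc([(0, 0), (0, 1), (1, 1)]))  # → 1.0 (curve through upper left corner)
print(auroc([(0, 0), (1, 1)]))          # → 0.5 (diagonal line)
```

The two printed values correspond to a test with perfect discrimination and a test no better than flipping a coin.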
THE PERFECT TEST
A perfect test is able to discriminate between the healthy and sick with 100% sensitivity and 100% specificity.
It will have an ROC curve that passes through the upper left corner (100% sensitivity and 100% specificity). The area under the ROC curve of the perfect test is 1.
THE WORTHLESS TEST
When we have a complete overlap between the results from the healthy and the results from the sick population, we have a worthless test. A worthless test has a discriminating ability equal to flipping a coin.
The ROC curve of the worthless test falls on the diagonal line. It includes the point with 50% sensitivity and 50% specificity. The area under the ROC curve of the worthless test is 0.5.
COMPARING ROC CURVES
As mentioned above, the area under the ROC curve of a test can be used as a criterion to measure the test’s discriminative ability, i.e. how good the test is in a given clinical situation. Generally, tests are categorized based on the area under the ROC curve.
The closer an ROC curve is to the upper left corner, the more efficient is the test.
In Figure 13, test A is superior to test B because, for any given false positive rate, test A achieves a higher true positive rate than test B. Accordingly, the area under the curve for test A is larger than the area under the curve for test B.
FIGURE 13: ROC curves for tests A and B.
As a rule of thumb, the categorizations in Table 4 can be used to describe an ROC curve.
TABLE 4: Categorization of ROC Curves
REFERENCE
1. CLSI/NCCLS. EP12-A2: User Protocol for Evaluation of Qualitative Test Performance; Approved Guideline, 2nd ed. Vol. 28, No. 3. 2008.