Sensitivity and specificity are two components that measure the inherent validity of a diagnostic test compared to the gold standard; a valid test would not only correctly detect the presence of disease but also correctly detect the absence of the disease in subjects with and without disease, respectively. Sensitivity and specificity are useful measures when the established gold standard is difficult to adopt in practice. For example, diagnosis of pancreatic carcinoma can be confirmed only by laprotomy for alive or by autopsy for dead patients. Sometimes the gold standard is expensive, less widely available, more invasive, riskier, and takes more time to produce results. Such issues compel researchers to develop new diagnostic methods as surrogate to the gold standard.
An adequate sample size is needed to ensure that the study will yield estimate of the sensitivity and specificity with acceptable precision—smaller sample size produces imprecise estimate, and unduly large sample is wastage of resources especially when the new method is expensive. Furthermore, the prevalence of disease was included in the sample size formula by Buderer, because the sample size without considering the prevalence would be adequate either for sensitivity or for specificity but not for both.
In practice, researchers generally decide a sample size for validating a new diagnostic test arbitrarily or at their convenience or use the previous literature. A study was conducted by Bochmann in five highest impact factor ophthalmology journals to assess the sample size calculation in diagnostic accuracy articles published in 2005 and found only 1 out of 40 studies reporting the sample size calculation before initiating the study. This may be due to reluctance in using a mathematical formula or computer software. Buderer provides the sample size tables for sensitivity and specificity but they are only for the 10% precision level. Carley et al. have provided nomograms but they are separate for sensitivity and specificity. They derived them only for the 95% level of confidence; too many lines and curves make their nomogram complex to read.
A nomogram is a chart consisting of three or more lines or curves so arranged that the required reading can be quickly made by just moving the ruler. They are still very popular in spite of the availability of computer. One of the main attractions is that a nomogram can be carried anywhere since it is just a piece of paper and can be repeatedly used without redoing the calculations. Various nomograms have been devised such as to calculate the sample size in diagnostic studies, to find the number of clusters required for estimating the prevalence rate in single-stage cluster-sample survey, and to find the number needed to treat in a therapeutic trial against values of absolute risk in the absence of treatment.[3–5]
We have devised a relatively very simple nomogram to read the sample size for anticipated sensitivity and specificity using the formula described by Buderer. This guides the researchers about the adequate sample size to achieve specified absolute precision. The estimated prevalence of disease and confidence level 100(1 – α)% are required. The features of this nomogram are as follows: (i) a single nomogram can be used to read the sample size for both sensitivity and specificity, (ii) it is based on simple lines instead of curves, (iii) it is easy to read by just moving the ruler from one point to another, (iv) the sample size for the 95% confidence level is directly available and one can calculate the sample size for 99% and 90% levels of confidence just multiplying by 1.75 and 0.70, respectively to sample size obtained by using 95% confidence level, and (iv) the sample size can be obtained for any precision level with minor calculations.
Materials and Methods
Sample size at the required absolute precision level for sensitivity and specificity can be calculated by Buderer’s formula:
where n = required sample size,
SN = anticipated sensitivity,
SP = anticipated specificity,
α = size of the critical region (1 – α is the confidence level),
z1-α/2 = standard normal deviate corresponding to the specified size of the critical region (α), and
L = absolute precision desired on either side (half-width of the confidence interval) of sensitivity or specificity.
The procedure to construct a nomogram is described by Adam and Molnar. Our nomogram is depicted in Fig. 1. This was created in MS Excel. This nomogram is for the 95% confidence level and consists of five parallel lines. The first line depicts anticipated sensitivity or specificity of the diagnostic test that can vary from 0.70 to 0.97. A test with anticipated sensitivity or specificity less than 0.70 may not be worthy of investigations. The minimum value of L on either side of anticipated sensitivity or specificity is taken as 0.03. The second line depicts the number of subjects required at 0.03 and 0.05 absolute precision and the third line depicts the number of subjects for 0.07 and 0.10 absolute precision. Fourth and fifth lines are prevalence lines and represent the expected prevalence of disease; the fourth line is to be used for L = 0.03 or 0.05 and the fifth for L = 0.07 or 0.10.
To find the number of subjects required for estimating sensitivity, place a ruler joining the anticipated sensitivity with expected prevalence and read the number of subjects where the ruler cuts the corresponding line of the number of subjects with required absolute precision. One should choose anticipated sensitivity such that after adding the required precision it does not exceed 1. For example, when anticipated sensitivity is 0.96, a researcher cannot select required precision to be more than 0.04.
Suppose the researcher selects anticipated sensitivity SN = 0.80, precision = 0.03 with 95% confidence level (two-tailed), i.e., SN can be from 0.77 to 0.83, and expected prevalence = 0.20. Place a ruler joining the point 0.80 on the anticipated sensitivity/specificity line to point 0.20 on the estimated prevalence line of 0.03 absolute precision and read the required sample size from the number of subjects line of 0.03 absolute precision. In our example, the number of subjects required is nearly 3450 as shown in Fig. 1. By formula, the exact value is 3415–a difference of nearly 1%. This can happen with any nomogram.
To find the required sample size for estimating specificity, first subtract the expected prevalence from 1 and place the ruler joining the anticipated specificity to (1 – prevalence) value on the prevalence line of required precision. For example, if SP= 0.80, precision = 0.05 with 95% confidence level, and prevalence is 0.20, join the point Sp = 0.80 with the point (1 – 0.20) = 0.80, on the prevalence line of 0.05 absolute precision, and read the sample size from the number of subjects line for 0.05 absolute precision This is nearly 300. By calculation, the exact value is 308. Now the difference is 3%.
The final sample size depends on the interest of the researcher. If sensitivity and specificity are equally important for the study, determine the sample size for both sensitivity and specificity, separately. The final sample size of the study would be the larger of these two. But sometimes the researcher is interested more in sensitivity than specificity. In that case, the final sample size would be based on the sensitivity only. In addition, there are other considerations such as nonresponse and subgroup analysis.
It is easily seen in the formula that the number of subjects is exactly four times when the length of L is halved, and one-fourth when the length of L is doubled, provided other values remain same. Following expression can be used to obtain the number of subjects needed for any precision level L1
where n0= sample size at precision level L0 from the nomogram where L0 may be 0.03, 0.05, 0.07, and 0.10 as depicted in our nomogram and n1= sample size at precision level L1; L1 may be any other acceptable precision level. Thus this nomogram can in fact be used for any precision level with minor calculation as envisaged in equation (1). Similarly the researcher can also use the nomogram for 99% and 90% confidence levels. To find the sample size for 99% and 90% confidence levels, first read the number of subjects required assuming the 95% confidence level and then multiply it with 1.75 for the 99% confidence level and 0.70 for the 90% confidence level. This is the ratio of the square of the standard normal deviate for the required confidence level 100(1 – α)% to the standard normal deviate for the 95% confidence level.
To validate the nomogram, various parameter combinations such as anticipated sensitivity/specificity, and prevalence of the disease were randomly selected. The exact sample size was calculated by the formula while at the same time independently second author determined the sample size from a nomogram for these randomly selected combinations of parameters. The percentage error was calculated (Tables 1 and 2 in Appendix). The percentage error is higher when the sample size is small; for instance, the exact sample size for specificity = 0.97, prevalence = 0.60, and absolute precision = 0.10 is 28 while the nomogram shows this as 30 [Table 2]. The difference of 2 is small although percentage is 7.14%. Otherwise the sample size is within 5% of the exact value. As already stated, this kind of minor approximation is inevitable with any nomogram as it simplifies the process.
A nomogram depicts the mathematical relationship among various parameters and is simple to use without redoing the calculations. Our nomogram has four parameters—anticipated sensitivity/specificity, number of subjects, absolute precision level, and expected prevalence of disease. Researchers can also use this nomogram for reverse calculation. If any of these three parameters are known, the fourth parameter can be obtained. This nomogram does not incorporate Type II error; thus this cannot be used for testing the hypothesis on sensitivity/specificity.
One of the main limitations of any nomogram is reading accuracy. In place of 465, one might read 460 from the line but this minor deviation may not be important in practice. This nomogram is applicable only when both the new diagnostic test and gold standard provide result in a dichotomous category such as test+ and test−. Thus this is not applicable when the gold standard is dichotomous and the new diagnostic test is ordinal or continuous, or vice versa.
1. Buderer NM. Statistical methodology: I Incorporating the prevalence of disease into the sample size calculation for sensitivity and specificity
Acad Emerg Med. 1996;3:895–900
2. Bochmann F, Johnson Z, Azuara-Blanco A. Sample size in studies on diagnostic accuracy in ophthalmology: A literature survey Br J Ophthalmol. 2007;91:898–900
3. Carley S, Dosman S, Jones SR, Harrison M. Simple nomograms to calculate simple size in diagnostic studies Emerg Med J. 2005;22:180–1
4. Kumar R, Indrayan A. A nomogram for single-stage-cluster-sampling survey in a community for estimation of a prevalence rate Int J Epidemiol. 2002;31:463–7
5. Chatellier G, Zapletal E, Lemaitre D, Medard J, Degoulet P. The number needed to treat: A clinically useful nomogram in its proper context Br Med J. 1996;312:426–9
6. Adams DP. Nomography: Theory and Application 1964 Hamden Connecticut Archon Books:1–17
7. Molnar J. Nomographs: What they are and how to use them. Ann Arbor Ann Arbor Sciences. 1981
8. Indrayan A. Confidence intervals and principal of significance. Medical Biostatistics nd ed. 19642nd ed Boca Raton Chapman and Hall/ CRC Press:380–2
Source of Support: Nil
Conflict of Interest: None declared.