where ĉ is the estimated optimal cut-point. %Bias(Ĵ) was calculated in the same manner; and
for unpooled data, and
for pooled data of size 2 or 4.
The first condition, when the number of subjects available is fixed, looks at the degradation of the estimate ĉ as pooling size increases (g = 1, 2, 4), which decreases the number of tested samples (n = N/g). For instance, 40 unpooled control specimens are converted to 20 pooled specimens, each consisting of a randomly chosen pair of controls (g = 2), or to 10 pooled specimens, each consisting of a randomly chosen tetrad of controls (g = 4). The same procedure is applied to the case population.
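The pooling step described above can be sketched in a few lines of Python. This is a minimal illustration, not the simulation code used for the tables: the function name `pool_specimens`, the seed, and the standard normal control values are hypothetical, and the sketch assumes that a pooled assay returns the average of its members.

```python
import random
from statistics import mean

def pool_specimens(values, g, rng):
    """Randomly partition `values` into pools of size g and return the pool averages.

    Assumes len(values) is divisible by g and that a pooled assay measures
    the average of its members (the averaging assumption).
    """
    if len(values) % g != 0:
        raise ValueError("number of specimens must be divisible by g")
    shuffled = list(values)
    rng.shuffle(shuffled)  # random assignment of subjects to pools
    return [mean(shuffled[i:i + g]) for i in range(0, len(shuffled), g)]

rng = random.Random(2005)  # arbitrary seed, for reproducibility only
controls = [rng.gauss(0.0, 1.0) for _ in range(40)]  # hypothetical N = 40 controls

pairs = pool_specimens(controls, 2, rng)    # n = 40 / 2 = 20 pooled specimens
tetrads = pool_specimens(controls, 4, rng)  # n = 40 / 4 = 10 pooled specimens
```

Pooling leaves the overall mean unchanged but shrinks the number of measurements available for estimating the variance, which is where the loss of efficiency under a fixed number of subjects comes from.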
Under normality assumptions (Table 1), the percent bias in the estimate of the optimal cut-point was negligible at all levels of discrimination and pooling, even for small sample sizes. As expected, the relative RMSE was inversely associated with the pooled size. No considerable distinction could be made between the RMSE from unpooled data (g = 1) and pooled data (g = 2) for J = 0.6 and 0.8. However, for g = 4, the relative loss of efficiency is 3 times that of pairs. This is the effect of the central limit theorem and is to be expected when cutting the sample by 75%.
In the gamma case (Table 2), the percent bias and relative RMSE increase in magnitude as g increases and, consequently, n and m decrease. The increase in bias due to pooling is negligible for J = 0.4, 0.6, and 0.8. Relative RMSE increases for g = 2 are on par with those of the normal case, but g = 4 results are consistently 10% higher than the normal tetrads. The positive bias of the estimates based on both the unpooled and the pooled data attenuates greatly as sample size increases. This bias is a result of using maximum likelihood estimators to estimate the optimal cut-point in small samples. Moreover, even for small sample sizes, the bias is largely reduced for J = 0.4, 0.6, and 0.8, which correspond to the markers of actual scientific interest.
Biomarkers with poor distinguishing ability (eg, J = 0.2) also behave poorly under pooling. For example, when 40 unpooled samples are pooled in pairs, the RMSE increases by 37% for the gamma case. More generally, this relationship holds for both the normal and gamma cases.
Under the second condition, when the number of assays to be performed is fixed, pooling effectively increases the overall sample size and the amount of information, via an increase in N (N = n · g) (Tables 3 and 4).
Again, bias remains unaffected: it is less than 1% at all levels of pooling for “useful” J. As pooling size increases, there is a consistent reduction in RMSE. For the normal case, as the level of pooling increases (g = 1, 2, 4), the RMSE for “useful” J decreases substantially (by about half for pools of 4). Likewise, under gamma assumptions, the benefits in RMSE are substantial as the level of pooling increases (a 40% decrease for pools of 4). Pooling when J = 0.2 and 0.4 reveals a less dramatic benefit in RMSE.
These methods provide a useful tool for making inferences about unpooled samples when assays are based on pooled specimens. This is more clearly seen through use of an example, as illustrated below.
Evidence shows that inflammation may play a contributing role in the development of coronary heart disease (CHD). Interleukin-6 has been linked with the presence of infections in the vessel wall and with atherosclerosis.16,17 Moreover, epidemiologic data implicate infection at remote sites in the etiology of CHD.
Individual measurements of interleukin-6 on 80 volunteers were obtained at Cedars-Sinai Medical Center. Forty individuals who had recently survived a myocardial infarction (MI), within 2 weeks of the event, were defined as cases after confirmation by rest electrocardiogram (ECG) and laboratory measurements; the remaining 40 subjects served as controls. The controls had a normal rest ECG, were free of symptoms, and had no previous cardiovascular procedures or MIs. In addition, the blood specimens were randomly pooled in groups of 2 and 4, for the cases and the controls separately, and remeasured. Faraggi et al6 have shown, using the same data, that for interleukin-6 the assumption that the pooled sample measurements are equivalent to the average of the individual measurements is justified. Because of the costs involved, such confirmatory evidence for the averaging assumption will generally not be available.
The data were also tested against distributional assumptions and found to fit a gamma distribution well, confirming the findings of Faraggi and coauthors.6 The means (± SD) in the control and case unpooled samples were 1.85 (±1.37) and 4.29 (±2.18), respectively. The Youden index and cut-point were estimated using the method described previously under gamma assumptions. Table 5 shows that the Youden index was approximately 0.5 for unpooled and pooled data. More importantly, the optimal cut-point was estimated to be 2.41 for unpooled data and was not much affected by pooling, as shown in Figures 1 and 3. A 95% bootstrapped confidence interval based on unpooled data was estimated to be 1.8 to 3.6, containing both estimates based on pooled data (2.06 [g = 2] and 2.70 [g = 4]), despite the small number of specimens.
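For readers who want to see the mechanics of a bootstrapped interval for the cut-point, the following Python sketch uses a percentile bootstrap with cases and controls resampled separately. Several things here are assumptions made for illustration only: the cut-point estimator is an empirical (nonparametric) one substituted for the gamma-based estimator used in the analysis, the data are simulated from gamma distributions moment-matched to the reported means and SDs rather than taken from the study, and the function names, seed, and number of replicates are hypothetical.

```python
import random

def empirical_youden(cases, controls):
    """Return (cut_point, J) maximizing sensitivity + specificity - 1.

    A value is classified as diseased when it exceeds the cut-point.
    """
    best_c, best_j = None, -1.0
    for c in sorted(set(cases) | set(controls)):
        sens = sum(x > c for x in cases) / len(cases)
        spec = sum(y <= c for y in controls) / len(controls)
        if sens + spec - 1 > best_j:
            best_c, best_j = c, sens + spec - 1
    return best_c, best_j

def bootstrap_ci(cases, controls, n_boot=200, level=0.95, seed=1):
    """Percentile bootstrap interval for the cut-point."""
    rng = random.Random(seed)
    cuts = []
    for _ in range(n_boot):
        bx = rng.choices(cases, k=len(cases))        # resample cases
        by = rng.choices(controls, k=len(controls))  # resample controls
        cuts.append(empirical_youden(bx, by)[0])
    cuts.sort()
    k = round(n_boot * (1 - level) / 2)  # replicates trimmed from each tail
    return cuts[k], cuts[n_boot - 1 - k]

# Simulated stand-in data (NOT the study data): gamma draws with
# shape = (mean/SD)^2 and scale = SD^2/mean matched to the reported summaries.
rng = random.Random(6)
controls = [rng.gammavariate((1.85 / 1.37) ** 2, 1.37 ** 2 / 1.85) for _ in range(40)]
cases = [rng.gammavariate((4.29 / 2.18) ** 2, 2.18 ** 2 / 4.29) for _ in range(40)]

c_hat, j_hat = empirical_youden(cases, controls)
lo, hi = bootstrap_ci(cases, controls)
```

Because the estimator and the data differ from those of the analysis, the resulting interval is not expected to reproduce the 1.8 to 3.6 interval reported above.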
In this paper, we have presented a method to estimate the Youden index and the optimal cut-point and extended its applications to pooled samples. We extend the work of Faraggi et al6 and Liu and Schisterman15 to the cut-point, c, and Youden Index, J, under various distributional assumptions. We have shown that pooling is a statistically viable cost-saving approach, through a reduction in the number of assays required, especially with pool sizes of 2 and 4.
Most other statistical methods currently available for the analysis of biomarkers deal with comparison of proportions between cases and controls and power analysis, eg, for a genotype.4,19 Our methods are specific to continuous data, where finding the optimal cut-point is an important issue.
Relation Between Youden Index and The Likelihood Ratio
Because the Youden index of a continuous biomarker is a function of sensitivity and specificity, its relation to the positive and negative likelihood ratios may be useful. Graphically, the likelihood ratio positive (LR+) is the slope (q/(1 − p)) of the line through the origin and a point on the ROC curve, while the likelihood ratio negative (LR−) is the slope ((1 − q)/p) of the line through (1,1) and the same point on the ROC curve. The product of the likelihood ratios, q(1 − q)/[p(1 − p)], is the slope of the angle bisector. The Youden index, J, is attained at the point at which the product of the 2 likelihood ratios is equal to 1, or at which the tangent to the ROC curve is parallel to the chance line (Fig. 1). Also, confidence intervals for c and J can be easily obtained using bootstrap methods and currently available statistical software.18
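These quantities are easy to compute for a concrete case. The sketch below reads q as sensitivity and p as specificity (an interpretation inferred from the slopes given above) and uses two hypothetical normal distributions with equal variances, for which the optimal cut-point is the midpoint of the means. It covers only this symmetric case and makes no claim about other settings.

```python
from statistics import NormalDist

cases = NormalDist(mu=2.0, sigma=1.0)     # hypothetical case distribution
controls = NormalDist(mu=0.0, sigma=1.0)  # hypothetical control distribution

def likelihood_ratios(c):
    """Return (q, p, LR+, LR-) when values above c are classified as diseased."""
    q = 1.0 - cases.cdf(c)  # sensitivity
    p = controls.cdf(c)     # specificity
    lr_pos = q / (1.0 - p)
    lr_neg = (1.0 - q) / p
    return q, p, lr_pos, lr_neg

# With equal variances the densities cross at the midpoint of the means.
c_star = (cases.mean + controls.mean) / 2.0
q, p, lr_pos, lr_neg = likelihood_ratios(c_star)

youden = q + p - 1
product = lr_pos * lr_neg
# Slope of the tangent to the ROC curve at c_star is the density ratio.
tangent_slope = cases.pdf(c_star) / controls.pdf(c_star)
```

In this example sensitivity equals specificity at the cut-point, so both the product of the likelihood ratios and the tangent slope evaluate to 1.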
Correct implementation of the method developed in this paper requires assumptions if the researcher sees only the pooled data. The first assumption is that the value obtained from a pooled assay can be considered to be the average of the individual values of the pooled specimens. There is both a biologic and a methodological aspect to this assumption. Biologically, this assumption can be deemed reasonable based on expert knowledge of the biomarker. If, for example, because of the molecular structure of the biomarker, pooling blood samples might yield a statistic other than the average (eg, the maximum), then this methodology is inappropriate for the evaluation of the optimal cut-point and the Youden index. On the other hand, when this assumption is reasonable biologically, differences between the pooled sample and the average of individual specimens are due to “random measurement error,” defined as the random variability that leads to inaccuracy in the estimation of the true mean value. For instance, if the volumes of the individual specimens to be pooled are not equal, the pooled sample will yield a volume-weighted average of the individual biomarker values. For normally distributed biomarkers, the addition of measurement error with mean zero and variance σ_ε² will affect the estimate ĉ in one of 3 different ways, depending on the ratio σ_X²/σ_Y². If σ_X²/σ_Y² = 1, then ĉ will remain unbiased, because the location where the 2 distributions intersect would remain unchanged (Fig. 2). If σ_X²/σ_Y² > 1, then ĉ will be positively biased, and similarly if σ_X²/σ_Y² < 1, then ĉ will be negatively biased. For biomarkers that follow a gamma distribution, measurement error will always cause a positive bias in ĉ. This is due to the dependence between the mean and variance of gamma distributions. Also, measurement error always results in an attenuation of Ĵ.
Since J is a measure of differentiation between cases and controls, it is intuitive that when error is introduced the ability to differentiate decreases.
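The normal-case behavior described above can be checked numerically. The following Python sketch is an illustration under stated assumptions, not the authors' derivation: it takes X to be the cases and Y the controls (as in the appendix), finds the cut-point as the intersection of the two normal densities that maximizes F_Y(c) − F_X(c), and models measurement error by adding a common variance `var_e` to both groups. All function names and parameter values are hypothetical.

```python
import math
from statistics import NormalDist

def youden(c, mu_x, var_x, mu_y, var_y):
    """J(c) = F_Y(c) - F_X(c) for normal cases X and controls Y."""
    f_y = NormalDist(mu_y, math.sqrt(var_y)).cdf(c)
    f_x = NormalDist(mu_x, math.sqrt(var_x)).cdf(c)
    return f_y - f_x

def normal_cut_point(mu_x, var_x, mu_y, var_y):
    """Intersection of the two normal densities that maximizes J (mu_y < mu_x)."""
    if math.isclose(var_x, var_y):
        return (mu_x + mu_y) / 2.0
    # Equating the log-densities gives a*c^2 + b*c + k = 0.
    a = var_x - var_y
    b = -2.0 * (var_x * mu_y - var_y * mu_x)
    k = var_x * mu_y**2 - var_y * mu_x**2 - var_x * var_y * math.log(var_x / var_y)
    disc = math.sqrt(b * b - 4.0 * a * k)
    roots = [(-b - disc) / (2.0 * a), (-b + disc) / (2.0 * a)]
    return max(roots, key=lambda c: youden(c, mu_x, var_x, mu_y, var_y))

def with_error(mu_x, var_x, mu_y, var_y, var_e):
    """Cut-point and J after adding mean-zero error of variance var_e to both groups."""
    c = normal_cut_point(mu_x, var_x + var_e, mu_y, var_y + var_e)
    return c, youden(c, mu_x, var_x + var_e, mu_y, var_y + var_e)

# Equal variances (ratio = 1): the cut-point stays at the midpoint of the means.
c_eq0, j_eq0 = with_error(2.0, 1.0, 0.0, 1.0, var_e=0.0)
c_eq1, j_eq1 = with_error(2.0, 1.0, 0.0, 1.0, var_e=1.0)

# Variance ratio = 4 (> 1): the cut-point shifts upward for this parameter set.
c_big0, j_big0 = with_error(2.0, 4.0, 0.0, 1.0, var_e=0.0)
c_big1, j_big1 = with_error(2.0, 4.0, 0.0, 1.0, var_e=1.0)
```

For these particular values the cut-point is unchanged when the variances are equal, moves from about 1.24 to about 1.41 when the case variance is 4 times the control variance, and J is smaller with error in both settings. A single parameter set illustrates the stated pattern but does not prove it in general.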
The second assumption is that the unpooled biomarkers follow a known parametric distribution. A more formal evaluation of distributional assumptions would be possible using a moment-based estimating-equation approach to deal with situations where likelihood functions based on pooled data are difficult to work with. We outlined the method to obtain estimates and test statistics of the parameters of interest in the general setting. We demonstrated the approach on the family of distributions generated by the Box-Cox transformation model and, in the process, constructed tests for goodness of fit based on the pooled data. Nevertheless, in our experience, the researcher will often develop some sense of both these assumptions during the early stages of biomarker development by means of a validation study.
Pooling sizes of 5 and above, while fiscally attractive, are prone to 2 difficulties. The first is a consequence of the central limit theorem: averages tend to be more normally distributed as sample size increases. Identifying a biomarker's unpooled distribution is therefore difficult, because the central limit theorem hinders our ability to distinguish between a skewed and a symmetric distribution. The second difficulty arises only when a fixed number of subjects is reduced by pooling to an unreasonably small number of samples, rendering parameter estimation unreliable. For instance, in the example presented above, we had 40 cases and 40 controls contributing blood samples. If g = 10, then we are left with 8 assays (4 cases and 4 controls) from which to estimate the means and standard deviations necessary for ĉ and Ĵ.
This method is relevant to studies of markers for early detection and prevention of disease and for studies of markers of exposure and disease in molecular epidemiology when, for example, deciding whether a biomarker is worth pursuing further or is ready for a study. Furthermore, once this method is applied and a biomarker demonstrates discriminatory ability, the optimal cut-point can be used in clinical practice to classify patients as healthy or diseased, after proper validation.
In summary, we showed that estimating c and J under pooling is a cost-effective, statistically sound approach for evaluating biomarkers. Such estimation has potential applications for research and clinical practice and for hypothesis development.
1. Farrington C. Estimating prevalence by group testing using generalized linear models. Stat Med
2. Tu X, Litvak E, Pagano M. On the informativeness and accuracy of pooled testing in estimating prevalence of a rare disease: application to HIV screening. Biometrika
3. Barcellos L, Klitz W, Field L, et al. Association mapping of disease loci, by use of a pooled DNA genomic screen. Am J Hum Genet
4. Weinberg CR, Umbach DM. Using pooled exposure assessment to improve efficiency in case-control studies. Biometrics
5. Kendziorski CM, Zhang Y, Lan H, Attie AD. The efficiency of pooling mRNA in microarray experiments. Biostatistics
6. Faraggi D, Reiser B, Schisterman EF. ROC curve analysis for biomarkers based on pooled assessments. Stat Med
7. Zou KH, Hall WJ, Shapiro DE. Smooth non-parametric receiver operating characteristic (ROC) curves for continuous diagnostic tests. Stat Med
8. Zweig MH, Campbell G. Receiver operator characteristic (ROC) plots; a fundamental evaluation tool in clinical medicine. Clin Chem
9. Goddard MJ, Hinberg I. Receiver operator characteristic (ROC) curves and non-normal data: an empirical study. Stat Med
10. Wieand S, Gail MH, James BR, James KL. A family of non-parametric statistics for comparing diagnostic markers with paired or unpaired data. Biometrika
11. Youden WJ. An index for rating diagnostic tests. Cancer
12. Barkan N. Statistical inference on r*specificity + sensitivity. Doctoral Dissertation 2001. Haifa University.
13. Bamber DC. The area above the ordinal dominance graph and the area below the receiver operating characteristic graph. J Math Psychol
14. Hilden J, Glasziou P. Regret graphs, diagnostic uncertainty and Youden's Index. Stat Med
15. Liu A, Schisterman EF. Comparison of diagnostic accuracy of biomarkers with pooled assessments. Biom J
16. Chilton RJ. Recent discoveries in assessment of coronary heart disease: impact of vascular mechanisms on development of atherosclerosis. J Am Osteopath Assoc
17. Yudkin JS, Kumari M, Humphries SE, Mohamed-Ali V. Inflammation, obesity, stress and coronary heart disease: is interleukin-6 the link? Atherosclerosis
18. Carpenter J, Bithell J. Bootstrap confidence intervals: when, which, what? A practical guide for medical statisticians. Stat Med
19. Peng X, Wood CL, Blalock EM, et al. Statistical implications of pooling RNA samples for microarray experiments. BMC Bioinformatics
Assume that cases, X, and controls, Y, are represented by continuous unimodal distributions, and μ_y < μ_x. Let c_0 be some cut-point and c_i (i = 1, 2) be the ith intersection of the probability density functions, denoted by f. The Youden index (J) is found by
The intervals for which f_y > f_x and f_y < f_x are determined by the variances of the distributions. Assuming σ_x² > σ_y² could result in 1 or 2 intersections. The 2-intersection case follows
For c_0 in (−∞, c_1)
Similarly, for c_0 in (c_1, c_2)
And, for c_0 in (c_2, ∞)
A similar argument proves that when a single intersection exists, the intersection is the cut-point for J. For the case where σ_x² < σ_y², this approach yields c_1 as the optimal cut-point used for J.
Note: Using Figure 1 as a reference, it can be seen that moving the cut-point to the right would result in a loss in shaded area (Youden index). Since the Youden index can be represented by the area between the 2 curves to either the right or the left of the cut-point, moving the cut-point to the left also results in a decrease.
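The sign argument behind this result can be written out explicitly. The following is a sketch under the appendix's assumptions (cases X have the larger mean, and a subject is classified as a case when the marker exceeds the cut-point). The cumulative distribution functions F_X and F_Y are notation introduced here, and the statement that the wider density dominates in both tails is taken as given, as it holds for normal densities. It is one standard way to reach the stated conclusion and is not a reproduction of the original displays.

```latex
\begin{align*}
J(c_0) &= \underbrace{1 - F_X(c_0)}_{\text{sensitivity}}
        + \underbrace{F_Y(c_0)}_{\text{specificity}} - 1
        = F_Y(c_0) - F_X(c_0), \\
\frac{dJ}{dc_0} &= f_y(c_0) - f_x(c_0).
\end{align*}
% Two intersections with \sigma_x^2 > \sigma_y^2, so f_x dominates in both tails:
\begin{align*}
c_0 \in (-\infty, c_1): &\quad f_y < f_x \;\Rightarrow\; J \text{ is decreasing}, \\
c_0 \in (c_1, c_2):     &\quad f_y > f_x \;\Rightarrow\; J \text{ is increasing}, \\
c_0 \in (c_2, \infty):  &\quad f_y < f_x \;\Rightarrow\; J \text{ is decreasing}.
\end{align*}
```

Since J tends to 0 as the cut-point goes to either infinity, J is negative at c_1 and reaches its maximum at c_2 in this case. When σ_x² < σ_y² the three signs reverse, so the maximum is at c_1, in agreement with the statement above.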
© 2005 Lippincott Williams & Wilkins, Inc.