Investigators interested in measuring behavioral or psychological constructs, especially in children or adolescents, often collect information from multiple informants or sources. For example, investigators studying children may obtain reports from parents, teachers, and, if old enough, the child. The resulting multiple informant data are generally discrepant. In meta-analyses examining cross-informant correlations, the mean correlation between child report and other report of child psychopathology was 0.22,^{1} and the mean correlations for cross-informant reports of adult psychopathology based on the same instrument ranged between 0.40 and 0.70.^{2} Discrepancies between informants are thought to reflect random measurement error, systematic measurement error (or bias), and the different perspectives of the informants (eg, home vs. school). In fact, discrepancy is the very reason for seeking reports from multiple sources: when information from multiple informants is combined, it is believed that different contexts and perspectives will be represented.^{1}

A considerable amount of research has focused on using multiple informant data as outcomes,^{3–5} but we focus here on using them as predictors.^{6} More specifically, we focus on the situation where it is assumed that there is an unobserved variable (the “true predictor”) underlying the informants' reports (which themselves are not perfect or near perfect (“gold-standard”) measures of the underlying variable), and where interest lies in estimating the effect of the true predictor on a continuous outcome. Our focus on this situation is motivated by a real data example^{7} where mother and child reports of the child's daily vigorous physical activity are used to estimate the effect of such physical activity on the child's body mass index (BMI). In situations where interest lies in modeling the effect of the informants' reports themselves on the outcome (eg, when informants' perceptions of the true predictor are thought to be informative about the outcome^{8} ), it is common to perform separate regressions or include all the reports simultaneously as predictors in a regression model.^{6} In contrast, modeling the effect of the true predictor on the outcome is more complicated because the multiple informants' reports must be combined into a single measure of the true predictor. Further, the reports will ideally be combined in a way that corrects for measurement error. It is well known that failing to correct for measurement error in predictors can result in biased estimates and, when there are other predictors, invalid confidence intervals that do not have the nominal coverage properties.^{9}

We compare 5 ways to combine multiple informant data to estimate a linear regression or correlation coefficient for the effect of the true predictor on the continuous outcome. First, we describe in detail how to implement these approaches. Then, we describe simulation experiments that compare the performance of these approaches in terms of bias and mean squared error in situations where the true predictor is a mixture of zeros and continuously distributed positive values, as in the physical activity and BMI example. (This type of predictor is a commonly encountered example of a “semicontinuous” variable.) Finally, we compare the estimates produced by these approaches for the aforementioned physical activity and BMI example.

OVERVIEW OF APPROACHES
We assume the following model for the relationship of interest:

$$Y_{i} = \alpha + \beta T_{i} + \varepsilon_{i}, \qquad (1)$$

where T_{i} is the true predictor and Y_{i} is the continuous outcome for the ith individual, and the ε_{i} are independently and identically distributed with mean 0 and constant variance σ_{ε}^{2}. In some instances, the parameter of interest will be β. In the example of vigorous physical activity and BMI, we might be interested in the expected change in BMI for one extra hour of vigorous physical activity. In other instances, the parameter of interest will be ρ, the (Pearson) correlation coefficient between Y and T. For example, if Y and T are not measured in meaningful units, ρ might be preferred because it is scale-invariant.

However, because T cannot be observed directly, information about β or ρ must be inferred from the relationship between Y and the informants' reports on T. The true predictor T_{i} is measured with error by each of J (here, 2) informants, and the resulting measurements of (or reports on) the true predictor are referred to as X_{i1}, ..., X_{iJ}. Figure 1 illustrates the relationship between the outcome, the true predictor, and the multiple informants' reports, and also introduces the concept of a gold standard measurement (a measurement of the true predictor with minimal measurement error, which we refer to as G_{i}) because it is relevant to the physical activity and BMI example.

FIGURE 1.:
Illustration of key terminology. Variables that appear in black are observed, and variables that appear in gray are unobserved. The effect of interest is the effect of T on Y . This effect can be described by β or ρ, which are the parameters of interest.

We now describe some of the approaches commonly used to estimate β or ρ from the Xs and Ys. (Estimates of ρ and β will be referred to as ρ̂ and β̂, respectively.) We refer to each approach either by its popular name or by a name that we have chosen to be reasonably descriptive. The approaches all amount to using a weighted average of the Xs (with weights that may not necessarily sum to 1) as a proxy for T in calculating β or ρ. For method M, β̂ thus takes the form

$$\hat{\beta}_{M} = \frac{\sum_{i=1}^{n} (X_{i}^{w} - \bar{X}^{w})(Y_{i} - \bar{Y})}{\sum_{i=1}^{n} (X_{i}^{w} - \bar{X}^{w})^{2}} \qquad (2)$$

and ρ̂ takes the form

$$\hat{\rho}_{M} = \frac{\sum_{i=1}^{n} (X_{i}^{w} - \bar{X}^{w})(Y_{i} - \bar{Y})}{\sqrt{\sum_{i=1}^{n} (X_{i}^{w} - \bar{X}^{w})^{2} \sum_{i=1}^{n} (Y_{i} - \bar{Y})^{2}}} \qquad (3)$$

where X_{i}^{w} = w_{1}^{M} X_{i1} + w_{2}^{M} X_{i2}, with w_{1}^{M} and w_{2}^{M} being method-specific weights, and where X̄^{w} and Ȳ are the sample means of X_{i}^{w} and Y_{i}. There are a number of ways to determine the “best” weights for the 2 informants,^{10,11} as illustrated by the following methods.

In the single informant (SI) method, information from only one informant is used, typically because that informant is thought to provide information about T that is more valid or less often missing. Supposing without loss of generality that Informant 1 is the preferred informant, β̂_{SI} and ρ̂_{SI} can be estimated by using Equations (2) and (3), respectively, with w_{1}^{SI} = 1 and w_{2}^{SI} = 0.

In the simple average (SA) method, information from both informants is averaged using equal weights: β̂_{SA} and ρ̂_{SA} are estimated by using Equations (2) and (3), respectively, with w_{1}^{SA} = w_{2}^{SA} = 1/2. The simple average method is easy to implement and has been used in a variety of applications.^{12}
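As an illustration of how the single informant and simple average estimates follow from a weighted proxy, the following minimal Python sketch computes an ordinary least squares slope and a Pearson correlation for any pair of weights. The function names and the toy data are hypothetical and are not taken from the validation study.

```python
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def weighted_proxy(x1, x2, w1, w2):
    """Combine two informants' reports into X^w = w1*X1 + w2*X2."""
    return [w1 * a + w2 * b for a, b in zip(x1, x2)]


def slope_and_correlation(proxy, y):
    """Least squares slope and Pearson correlation of y on the proxy."""
    mp, my = mean(proxy), mean(y)
    sxy = sum((p - mp) * (v - my) for p, v in zip(proxy, y))
    sxx = sum((p - mp) ** 2 for p in proxy)
    syy = sum((v - my) ** 2 for v in y)
    return sxy / sxx, sxy / sqrt(sxx * syy)


# Hypothetical toy data, for illustration only.
x1 = [0.0, 0.5, 1.0, 1.5, 2.0, 0.5]        # informant 1 reports
x2 = [0.5, 0.0, 1.5, 2.5, 1.5, 1.0]        # informant 2 reports
y = [22.0, 21.0, 20.5, 19.0, 18.5, 21.5]   # continuous outcome

# Single informant: all weight on informant 1.
beta_si, rho_si = slope_and_correlation(weighted_proxy(x1, x2, 1.0, 0.0), y)
# Simple average: equal weights of 1/2.
beta_sa, rho_sa = slope_and_correlation(weighted_proxy(x1, x2, 0.5, 0.5), y)
```

Rescaling both weights by a common factor leaves the correlation unchanged but rescales the slope, which is why weights that do not sum to 1 affect β̂ and not ρ̂.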

In what we refer to as the optimal weighted average (OWA) method, the weights used to calculate β̂_{OWA} and ρ̂_{OWA} minimize the sum of squared errors from the linear regression (or, equivalently, maximize the absolute value of the correlation coefficient) for Y and a weighted average of the Xs:

$$(w_{1}^{OWA}, w_{2}^{OWA}) = \arg\min_{w_{1}, w_{2}} \sum_{i=1}^{n} \left[ Y_{i} - \hat{\alpha}^{w} - \hat{\beta}^{w} (w_{1} X_{i1} + w_{2} X_{i2}) \right]^{2},$$

where α̂^{w} and β̂^{w} are estimates of the intercept and slope from the linear regression of Y_{i} on (w_{1} X_{i1} + w_{2} X_{i2}) and where w_{2} = 1 − w_{1}.
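One way to obtain the optimal weights without a numerical search is sketched below in Python. It relies on the fact that regressing Y on w_{1}X_{1} + (1 − w_{1})X_{2} with a free intercept and slope spans the same fits as regressing Y on X_{1} and X_{2} jointly, so the weights can be read off the two partial slopes. The function names and toy data are hypothetical, and the sketch assumes the weights are not constrained to lie between 0 and 1, which the text does not state either way.

```python
def mean(values):
    return sum(values) / len(values)


def owa_weights(x1, x2, y):
    """Return (w1, w2, slope) minimizing the SSE of y on w1*x1 + w2*x2."""
    m1, m2, my = mean(x1), mean(x2), mean(y)
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (v - my) for a, v in zip(x1, y))
    s2y = sum((b - m2) * (v - my) for b, v in zip(x2, y))
    det = s11 * s22 - s12 ** 2          # zero if the reports are collinear
    b1 = (s22 * s1y - s12 * s2y) / det  # partial slope for informant 1
    b2 = (s11 * s2y - s12 * s1y) / det  # partial slope for informant 2
    total = b1 + b2
    if total == 0:
        raise ValueError("partial slopes cancel; weights are undefined")
    return b1 / total, b2 / total, total


def sse_for_weight(x1, x2, y, w1):
    """SSE from regressing y on the proxy w1*x1 + (1 - w1)*x2."""
    proxy = [w1 * a + (1.0 - w1) * b for a, b in zip(x1, x2)]
    mp, my = mean(proxy), mean(y)
    sxx = sum((p - mp) ** 2 for p in proxy)
    sxy = sum((p - mp) * (v - my) for p, v in zip(proxy, y))
    slope = sxy / sxx
    return sum((v - my - slope * (p - mp)) ** 2 for p, v in zip(proxy, y))


# Hypothetical toy data, for illustration only.
x1 = [0.0, 0.5, 1.0, 1.5, 2.0, 0.5]
x2 = [0.5, 0.0, 1.5, 2.5, 1.5, 1.0]
y = [22.0, 21.0, 20.5, 19.0, 18.5, 21.5]

w1_owa, w2_owa, beta_owa = owa_weights(x1, x2, y)
```

The helper `sse_for_weight` is included only so that the closed-form weights can be checked against other candidate weights.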

In the principal components analysis (PCA) method, the first principal component from the multiple informant data is used as the predictor. This method, along with the factor analysis method (see the Conclusions section below), is advocated by Kraemer et al for use with 3 or more informants carefully selected to cover multiple perspectives and contexts.^{13} In practice, however, principal components analysis is often used with any 2 or more informants. Principal components analysis is an eigen-decomposition of the covariance (or correlation) matrix for the data, in this case the n × 2 matrix containing the information from the multiple informants. The first principal component (or “score”) is calculated by multiplying the centered data matrix by the “loadings,” the scaled eigenvector corresponding to the largest eigenvalue from the eigen-decomposition. The eigenvector is usually scaled so that its squared elements sum to 1, but here we rescale it so that its (un-squared) elements sum to 1 to preserve the scale of the predictor construct. Because applying the principal components analysis method to the correlation matrix is identical to the simple average method when there are only 2 informants, we focus on the principal components analysis method as applied to the covariance matrix. In that case, β̂_{PCA} and ρ̂_{PCA} are estimated by using Equations (2) and (3), respectively, with w_{1}^{PCA} = 1/(1 + v) and w_{2}^{PCA} = v/(1 + v), where

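$$v = \frac{S_{22} - S_{11} + \sqrt{(S_{11} - S_{22})^{2} + 4 S_{12}^{2}}}{2 S_{12}},$$

with

$$S_{jj} = \frac{1}{n-1} \sum_{i=1}^{n} (X_{ij} - \bar{X}_{j})^{2}$$

and

$$S_{12} = \frac{1}{n-1} \sum_{i=1}^{n} (X_{i1} - \bar{X}_{1})(X_{i2} - \bar{X}_{2})$$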

. (Note that in Equations (2) and (3), we do not center X_{i1} and X_{i2} before multiplying them by their respective weights, as is usually done in principal components analysis, because doing so has no effect on β̂_{PCA} and ρ̂_{PCA}.)
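For 2 informants, the eigenvector of the covariance matrix has a closed form, so the weights can be computed directly from the sample variances and covariance. The Python sketch below is a minimal illustration with hypothetical function and argument names. As a usage example it plugs in the rounded variances (0.46 and 0.91) and correlation (0.23) reported for the mother and child reports in the physical activity example, which should be read as an approximation because those summary statistics are rounded.

```python
from math import sqrt


def pca_weights(s11, s22, s12):
    """Weights from the first eigenvector of [[s11, s12], [s12, s22]].

    The eigenvector is rescaled so that its elements sum to 1.
    Requires s12 != 0; if s12 < 0 the ratio v is negative and the
    weights can be unstable (v = -1 would make them undefined).
    """
    if s12 == 0:
        raise ValueError("uncorrelated reports: eigenvectors are the axes")
    largest = (s11 + s22 + sqrt((s11 - s22) ** 2 + 4.0 * s12 ** 2)) / 2.0
    v = (largest - s11) / s12  # ratio of the eigenvector's two elements
    return 1.0 / (1.0 + v), v / (1.0 + v), largest


# Rounded summary statistics from the example (mother = 1, child = 2).
var_mother, var_child, corr = 0.46, 0.91, 0.23
cov = corr * sqrt(var_mother * var_child)

w1_pca, w2_pca, eigenvalue = pca_weights(var_mother, var_child, cov)
```

With these inputs the child report, which has the larger variance, receives roughly three times the weight of the mother report, in line with the weights of approximately 1/4 and 3/4 described for the example.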

The classical measurement error (CME) method is used not only to handle errors in predictors in environmental and nutritional epidemiology and in econometrics, but also has roots in classical test theory for psychometric tests (see Bollen^{14}). The method is the same as the simple average method, except that the resulting slope is then divided by an “attenuation factor” (here, λ^{CME}) to correct for attenuation due to random measurement error. The attenuation factor is derived under the “classical measurement model” for the relationship between the informants' reports and the true predictor, which assumes that the informants' reports equal the true predictor plus an additive random measurement error that has constant variance, σ^{2}, across informants (see next section for more on the classical measurement model). The formula for the attenuation factor is as follows:

$$\lambda^{CME} = \frac{\mathrm{Var}(T_{i})}{\mathrm{Var}(T_{i}) + \sigma^{2}/2}$$

(see Carroll et al^{9} for the derivation). The parameters Var(T_{i}) and σ^{2} are unknown and, if only one attenuated measure of T_{i} were available, would have to be determined based on external information from theory or previous studies. However, because we have more than one measurement of T_{i}, we can estimate λ^{CME} using the following equation:

where

and

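.^{15} Then, β̂_{CME} can be estimated by using Equation (2) with

$$w_{1}^{CME} = w_{2}^{CME} = \hat{\lambda}^{CME}/2,$$

and

$$\hat{\rho}_{CME} = \hat{\rho}_{SA} \big/ \sqrt{\hat{\lambda}^{CME}}.$$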

In situations that violate the assumptions of the classical measurement model, λ̂^{CME} can be very small (in which case ρ̂_{CME} may be >1 in magnitude) or can even be negative (in which case the classical measurement error method fails because β̂_{CME} and ρ̂_{CME} should not and cannot, respectively, be calculated).
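The following Python sketch illustrates the correction and its failure mode. It is a sketch under stated assumptions rather than the exact estimator used here: the error variance is estimated by a simple method-of-moments rule, half the mean squared difference between the 2 reports, which is valid under the classical measurement model but may differ in detail from the estimator cited above. The function names and toy data are hypothetical.

```python
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def cme_estimates(x1, x2, y):
    """Return (lambda_hat, beta_cme, rho_cme); estimates are None on failure."""
    n = len(y)
    avg = [(a + b) / 2.0 for a, b in zip(x1, x2)]
    ma, my = mean(avg), mean(y)
    sxx = sum((p - ma) ** 2 for p in avg)
    syy = sum((v - my) ** 2 for v in y)
    sxy = sum((p - ma) * (v - my) for p, v in zip(avg, y))
    beta_sa, rho_sa = sxy / sxx, sxy / sqrt(sxx * syy)

    var_avg = sxx / (n - 1)
    # ASSUMPTION: under the classical model E[(X1 - X2)^2] = 2 * sigma^2,
    # so sigma^2 is estimated by half the mean squared difference.
    sigma2_hat = sum((a - b) ** 2 for a, b in zip(x1, x2)) / (2.0 * n)
    # The average of 2 reports has error variance sigma^2 / 2.
    lambda_hat = (var_avg - sigma2_hat / 2.0) / var_avg
    if lambda_hat <= 0:
        return lambda_hat, None, None  # the method fails
    return lambda_hat, beta_sa / lambda_hat, rho_sa / sqrt(lambda_hat)


# Hypothetical toy data, for illustration only.
x1 = [0.0, 0.5, 1.0, 1.5, 2.0, 0.5]
x2 = [0.5, 0.0, 1.5, 2.5, 1.5, 1.0]
y = [22.0, 21.0, 20.5, 19.0, 18.5, 21.5]

lambda_hat, beta_cme, rho_cme = cme_estimates(x1, x2, y)

# A large additive discrepancy between informants inflates sigma2_hat
# and can drive lambda_hat below zero.
shifted = [a + 3.0 for a in x1]
lambda_bad, beta_bad, rho_bad = cme_estimates(x1, shifted, y)
```

In the second call the 2 reports differ by a constant, which this estimator misreads as random error, so the estimated attenuation factor is negative and no corrected estimates are returned.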

It is possible to analytically compare some, but not all, of the aforementioned estimators for β. For example, it is easy to show that

(see eAppendix 1A [https://links.lww.com/EDE/A468 ] for a proof).

SIMULATION EXPERIMENTS
Setup
We conducted simulation experiments to compare how the 5 approaches perform in terms of bias and mean squared error (MSE) for multiple informant data generated from a rounded version of the congeneric measurement model first described by Jöreskog.^{16} This model for the relationship between the multiple informants' reports and the true predictor takes the form:

where i indexes the individuals being studied, j indexes the informant types (eg, mother = 1, child = 2), and ε_{ij} is a random measurement error that comes from a distribution with mean 0 and variance σ_{j} ^{2} . The parameter σ_{j} ^{2} reflects the variation in measurement reliability across informant types. In contrast, the parameters δ_{j} and λ_{j} reflect the variation in measurement validity across informant types, with δ_{j} and λ_{j} representing, respectively, additive and multiplicative systematic measurement bias. (Note that λ_{j} reflects only multiplicative bias because T_{i} is centered by its mean, ET_{i} , before multiplying by λ_{j} .) The model in (7) is similar to other models that have been proposed for multiple informant data in the context of psychiatric research^{13} and marketing research,^{17,18} except that the above model takes the form of a unidimensional rather than multidimensional factor analysis model and also involves rounding to reflect our focus on a true predictor that is a mixture of zeros and continuously distributed positive values. Finally, note that the “classical measurement model” is a variant of the model in Equation (7) in which there is no rounding and where δ_{1} = δ_{2} = 0, λ_{1} = λ_{2} = 1, and σ_{1} ^{2} =σ_{2} ^{2} =σ^{2} .

In our simulation experiments, we investigated how the bias and MSE of the approaches described earlier vary with different values of the parameters in the model in (7): −0.50, −0.25, 0, 0.25, or 0.50 for δ_{1} and δ_{2}; 0.40, 0.75, 1, 1.10, or 1.33 for λ_{1} and λ_{2}; and 0.0, 0.1, or 0.5 for σ_{1}^{2} and σ_{2}^{2}. We chose these values either because they were realistic based on actual applications^{19} of previously proposed models^{17,18} for multiple informant data or because they could aid us in understanding how the other parameters affected results (eg, σ_{1}^{2} = 0). We also made the following assumptions about the variables in the measurement model (Equation 7) and regression model (Equation 1): Cor(T_{i}, ε_{ij}) = 0; Cor(ε_{i1}, ε_{i2}) = 0; and Cor(ε_{ij}, ε_{i}) = 0 (referred to as nondifferential measurement error). Further, we assumed that α = 1, β = 1, and σ_{ε}^{2} = 1, with the value of β based on the physical activity and BMI example. We generated 1000 datasets of size n = 250 for every combination of the above parameter values; then used the single informant method (arbitrarily based on informant 1) and the simple average, optimal weighted average, principal components analysis, and classical measurement error methods (by definition, based on both informants) to calculate β̂ and ρ̂ for each dataset; and finally used the resulting estimates and true values of β (= 1) or ρ (= 0.5) to calculate the average bias and average MSE. Readers interested in the details of the simulation experiments can consult eAppendix 1B (https://links.lww.com/EDE/A468), which describes the simulation experiments more fully, and can download the code for the simulation experiments (eAppendix 2, https://links.lww.com/EDE/A469).
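The general shape of such an experiment can be sketched in Python as follows, for the simple average slope only. Several elements are assumptions made for illustration and are not taken from the eAppendix: the true predictor is drawn as a hypothetical mixture (0 with probability 0.4, otherwise exponential) scaled so that Var(T) = 1/3, which together with β = 1 and σ_{ε}^{2} = 1 gives ρ = 0.5; the reports follow an unrounded congeneric form E(T) + δ_{j} + λ_{j}(T − E(T)) + ε_{ij}; the rounding step is omitted; and 200 rather than 1000 datasets are generated to keep the run short.

```python
import random
from math import sqrt

ALPHA, BETA, SD_EPS = 1.0, 1.0, 1.0
P_ZERO = 0.4                                         # hypothetical share of zeros
MEAN_POS = sqrt(1.0 / (3.0 * (1.0 - P_ZERO ** 2)))   # makes Var(T) = 1/3
E_T = (1.0 - P_ZERO) * MEAN_POS                      # mean of the mixture


def draw_true_predictor(rng):
    """Hypothetical semicontinuous T: zero or an exponential draw."""
    if rng.random() < P_ZERO:
        return 0.0
    return rng.expovariate(1.0 / MEAN_POS)


def ols_slope(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sum((a - mx) ** 2 for a in x)


def simulate_sa_slopes(delta, lam, var_err, n=250, reps=200, seed=1):
    """Simple average slope estimates across simulated datasets."""
    rng = random.Random(seed)
    slopes = []
    for _ in range(reps):
        t = [draw_true_predictor(rng) for _ in range(n)]
        y = [ALPHA + BETA * ti + rng.gauss(0.0, SD_EPS) for ti in t]
        reports = []
        for j in range(2):
            sd = sqrt(var_err[j])
            # Unrounded congeneric report (assumed form, no rounding).
            reports.append([E_T + delta[j] + lam[j] * (ti - E_T)
                            + rng.gauss(0.0, sd) for ti in t])
        average = [(a + b) / 2.0 for a, b in zip(*reports)]
        slopes.append(ols_slope(average, y))
    return slopes


def percent_bias_and_mse(estimates, truth):
    bias = sum(estimates) / len(estimates) - truth
    mse = sum((e - truth) ** 2 for e in estimates) / len(estimates)
    return 100.0 * bias / truth, mse


# No systematic bias, error variance 0.5 for both informants.
bias_pct, mse = percent_bias_and_mse(
    simulate_sa_slopes(delta=(0.0, 0.0), lam=(1.0, 1.0), var_err=(0.5, 0.5)),
    BETA,
)
```

Under these assumptions the simple average slope should be attenuated by a factor close to Var(T)/(Var(T) + σ^{2}/2), so the percent bias from the final call is expected to be clearly negative.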

Results
Tables 1 and 2 summarize how changes in the measurement model parameters affect the average percent bias in ρ̂ and β̂, respectively, for each method. In addition, readers interested in seeing the method-specific average bias and MSE for each combination of measurement model parameters can view the figures in eAppendix 1B (https://links.lww.com/EDE/A468) or download the results from the simulation experiment (eAppendix 3, https://links.lww.com/EDE/A471).

TABLE 1: Effect of Rounded Congeneric Measurement Model Parameters on Average Percent Bias in ρ̂

TABLE 2: Effect of Rounded Congeneric Measurement Model Parameters on Average Percent Bias in β̂

Regarding ρ, the single informant, simple average, optimal weighted average, and principal components analysis estimates follow similar patterns and are often very similar in value. They are never exaggerated (biased away from 0), but in some situations they are attenuated (biased toward 0). This attenuation bias becomes worse as σ_{1}^{2} (and σ_{2}^{2}, for the simple average, optimal weighted average, and principal components analysis methods) increases, and as λ_{1} (and λ_{2}, for the simple average, optimal weighted average, and principal components analysis methods) becomes increasingly less than 1. In contrast, the classical measurement error estimates have at most small (<10%) attenuation bias, but can become very exaggerated (in some cases, to the point where they are greater than 1) and can even fail. The exaggeration bias becomes worse as δ_{1} and δ_{2} become increasingly discrepant, and as λ_{1} and λ_{2} become increasingly discrepant or become increasingly less than 1. As for comparisons between methods, the optimal weighted average method performs better than (in terms of a smaller magnitude of bias of ρ̂) or at least as well as the single informant, simple average, and principal components analysis methods for all combinations of parameters. Optimal weighted average also performs better than classical measurement error for some combinations of parameters, especially when δ_{1} and δ_{2} differ considerably or λ_{1} and λ_{2} differ considerably. However, classical measurement error performs considerably better than optimal weighted average (and, by extension, the single informant, simple average, and principal components analysis methods) for other combinations of parameters, especially when the difference between δ_{1} and δ_{2} is small and both σ_{1}^{2} and σ_{2}^{2} are large.

Turning to β, the estimates from all 5 methods generally have a greater magnitude of bias than the corresponding estimates of ρ. The single informant, simple average, optimal weighted average, and principal components analysis estimates of β again follow a similar pattern but, unlike the corresponding estimates of ρ, can be either exaggerated or attenuated. The exaggeration bias becomes worse as δ_{1} (and δ_{2}, for the simple average, optimal weighted average, and principal components analysis methods) becomes increasingly less than 0, and as λ_{1} (and λ_{2}, for the simple average, optimal weighted average, and principal components analysis methods) becomes increasingly less than 1. The attenuation bias becomes worse as λ_{1} (and λ_{2}, for the simple average, optimal weighted average, and principal components analysis methods) becomes increasingly more than 1, and as σ_{1}^{2} (and σ_{2}^{2}, for the simple average, optimal weighted average, and principal components analysis methods) increases. The classical measurement error estimates of β, like the corresponding estimates of ρ, have at most moderate attenuation bias (<25%), but can become very exaggerated or even fail. The exaggeration bias becomes worse as δ_{1} and δ_{2} become increasingly discrepant or increasingly less than 0, as λ_{1} and λ_{2} become increasingly less than 1, and as σ_{1}^{2} and σ_{2}^{2} increase. The attenuation bias becomes worse as λ_{1} and λ_{2} become increasingly more than 1. As for comparisons between methods, each method outperforms all of the other methods for some combinations of parameters, as can be seen in Table 3.

TABLE 3: Best-Performing Method(s) for Various Combinations of Rounded Congeneric Measurement Model Parameters

VIGOROUS PHYSICAL ACTIVITY AND BMI EXAMPLE
To illustrate the 5 approaches, we consider a validation study^{20} used to design a larger study^{7} of the relationship between physical activity and obesity in children in 2 towns of Mexico City in 1996. In the validation study, which included 114 ten- to fourteen-year-old students, 2 informants (child and mother) completed a questionnaire designed to assess the child's physical activity and inactivity. Physical activity was also assessed by 24-hour recall, which was provided by the child in an interview with a trained nutritionist on 3 separate days; the 3 measurements of physical activity were averaged, with different weights for weekend days versus weekdays. Here, we focus on hours of daily vigorous physical activity (T_{i}) as a predictor for body mass index (Y_{i}). We treat the weighted average of the 24-hour recall measurements of vigorous physical activity as the gold standard measure of vigorous physical activity (G_{i}). We treat the mother reports of vigorous physical activity as X_{i1} and the child reports of vigorous physical activity as X_{i2}, and we use them to estimate β and ρ. For the sake of simplicity, we do not include any of the available covariates (eg, sex and age of the child) in our analyses.

We focus on cases with complete data (n = 81). Figure 2 displays frequency plots for 24-hour recall and mother and child report of vigorous physical activity, as well as scatterplots of BMI versus those measures. For 24-hour recall, mother report, and child report, the means (variances) are 0.48 (0.40), 0.80 (0.46), and 1.00 (0.91), respectively, and the percentages of values equal to 0 are 43%, 9%, and 1%, respectively. Both mother and child, and especially the child, overestimate vigorous physical activity relative to 24-hour recall. The Pearson correlations of the mother and child report with 24-hour recall are 0.41 and 0.24, respectively, and the Pearson correlation of the mother and child report with each other is 0.23. These numbers suggest that the mother report may better capture the relationship between vigorous physical activity and BMI.

FIGURE 2.:
Frequency and scatter plots (vs. body mass index [BMI]) for the 24-hour recall gold standard measurement and mother and child reports of vigorous physical activity.

Table 4 displays estimates of ρ and β from all 5 methods applied to the mother and child reports of vigorous physical activity, as well as estimates of ρ and β based on 24-hour recall of vigorous physical activity, the gold standard measurement. The estimates reveal that the optimal weighted average method (which gives weights of approximately 3/4 and 1/4 to the mother report and child report, respectively) produces estimates identical to the gold standard estimates of both ρ and β. The single informant, simple average, and principal components analysis methods all underestimate both ρ and β; the next-closest estimates to the gold standard estimates are given by the simple average method (which, of course, weights both reports equally), followed by the single informant method based on mother report only, then the principal components analysis method (which gives weights of approximately 1/4 and 3/4 to the mother report and child report, respectively), and then the single informant method based on child report only. Last, the classical measurement error method overestimates both ρ and β by a considerable amount. These results show that the mother report is a better proxy for 24-hour recall of vigorous physical activity in terms of ability to predict BMI, but that combining mother report with child report (via the optimal weighted average or simple average methods) produces better estimates than using mother report alone.

TABLE 4: Results for Vigorous Physical Activity as a Predictor of Body Mass Index

CONCLUSIONS
We have described 5 methods of estimating the regression or correlation coefficient for the effect of a predictor on a continuous outcome, when that predictor cannot be observed directly but is reported on by 2 informants. We then compared the performance of these methods in situations where the true predictor is a mixture of zeros and continuously distributed positive values.

Regarding the correlation coefficient, the simulation experiments suggest that estimates obtained by the single informant, simple average, optimal weighted average, and principal components analysis methods become attenuated when random error is present, which is unsurprising given that these methods do not correct for attenuation due to random measurement error. However, these estimates are relatively unaffected by additive or multiplicative measurement bias on their own. (Without the rounding in the congeneric measurement model, the estimates would be completely unaffected by additive or multiplicative bias.) In contrast, the classical measurement error method assumes that the measurements contain random error, but not additive or multiplicative bias. Unsurprisingly, then, the classical measurement error estimates become exaggerated (or even fail) when the additive or multiplicative bias differs between informants, because the second term in the numerator of λ̂^{CME} becomes too large. Also unsurprisingly, then, the classical measurement error estimates are unaffected by random error on its own since the method corrects for attenuation due to random error. In terms of comparisons between methods, the optimal weighted average estimates are never more biased than the single informant, simple average, and principal components analysis estimates, which is to be expected given that the optimal weighted average estimates weight the 2 informants optimally. For instance, in the physical activity and BMI example, the optimal weighted average estimates give more weight to the better informant (the mother), in contrast to the simple average estimates, which weight both informants equally, and the principal components analysis estimates, which give more weight to the worse informant (the child) because the child report has more variance. The single informant estimate (based on mother) gives all the weight to the better informant, but suffers more from attenuation due to random measurement error because it is based on only one informant. Finally, the optimal weighted average estimates can be more biased than the classical measurement error estimates when the random error is large and the additive and multiplicative bias do not differ greatly between informants.

Estimates of the regression coefficients are typically more biased than corresponding estimates of the correlation coefficients. The effects of random measurement error and additive and multiplicative measurement bias on the estimates are similar to those described in the preceding paragraph, with 2 exceptions. First, for all the methods, upward or downward multiplicative bias results in attenuated or exaggerated estimates, respectively, because multiplicative bias changes the scale of the predictor. Second, for the classical measurement error method, random error on its own results in small-to-moderate exaggeration in the estimates of the regression coefficients, which is due to the rounding in the congeneric measurement model. (Without the rounding, the classical measurement error estimates would be unbiased.) In terms of comparisons among methods, each method performs best in some situations.

Overall, when there are only 2 informants, the simple average method is a reasonable choice. The simple average method, although not always optimal, rarely performs much worse than the single informant, optimal weighted average, and principal components analysis methods, and often performs similarly to the best of those methods, as in the physical activity and BMI example and as found in previous studies.^{21} Further, with the simple average method, it is straightforward to compare results across samples because this method uses the same (equal) weights for informants in every sample. Although the classical measurement error method performs better than the other methods (including the simple average method) when there is a large amount of random error and little difference in additive or multiplicative bias between informants, it should be avoided in other situations because the estimate can be very exaggerated (as in the physical activity and BMI example).

Of course, it is preferable to have more than 2 informants, or validation data, because this allows the use of more sophisticated methods such as latent variable models (eg, factor analysis) or extensions of the classical measurement error method that allow for measurement bias or correlated measurement errors (see Cheng and Van Ness^{15} ).

ACKNOWLEDGMENTS
We thank Bernardo Hernández for generously allowing the use of his data on vigorous physical activity and body mass index, and Matt Vendlinski and two anonymous reviewers for helpful comments on earlier drafts of this manuscript.

REFERENCES
1. Achenbach TM, McConaughy SH, Howell CT. Child/adolescent behavioral and emotional problems: implications of cross-informant correlations for situational specificity. Psychol Bull. 1987;101:213–232.

2. Achenbach TM, Krukowski RA, Dumenci L, Ivanova MY. Assessment of adult psychopathology: meta-analyses and implications of cross-informant correlations. Psychol Bull. 2005;131:361–382.

3. Fitzmaurice GM, Laird NM, Zahner GE, Daskalakis C. Bivariate logistic regression analysis of childhood psychopathology ratings using multiple informants. Am J Epidemiol. 1995;142:1194–1203.

4. Kuo M, Mohler B, Raudenbush SL, Earls FJ. Assessing exposure to violence using multiple informants: application of hierarchical linear model. J Child Psychol Psychiatry. 2000;41:1049–1056.

5. Goldwasser MA, Fitzmaurice GM. Multivariate linear regression analysis of childhood psychopathology using multiple informant data. Int J Methods Psychiatr Res. 2001;10:1–10.

6. Horton NJ, Laird NM, Zahner GE. Use of multiple informant data as a predictor in psychiatric epidemiology. Int J Methods Psychiatr Res. 1999;8:6–18.

7. Hernández B, Gortmaker SL, Colditz GA, Peterson KE, Laird NM, Parra-Cabrera S. Association of obesity with physical activity, television programs and other forms of video viewing among children in Mexico City. Int J Obes Relat Metab Disord. 1999;23:845–854.

8. O'Malley JA, Landon BE, Guadagnoli E. Analyzing multiple informant data from an evaluation of the Health Disparities Collaboratives. Health Serv Res. 2007;42:146–164.

9. Carroll RJ, Ruppert D, Stefanski LA. Measurement Error in Nonlinear Models. London: Chapman & Hall/CRC; 1995.

10. Piacentini JC, Cohen P, Cohen J. Combining discrepant diagnostic information from multiple sources: Are complex algorithms better than simple ones? J Abnorm Child Psychol. 1992;20:51–63.

11. van Bruggen GH, Lilien GL, Kacker M. Informants in organizational marketing research: Why use multiple informants and how to aggregate responses. J Mark Res. 2002;39:469–478.

12. Allen JP, Kuperminc G, Philliber S, Herre K. Programmatic prevention of adolescent problem behaviors: The role of autonomy, relatedness, and volunteer service in the Teen Outreach Program. Am J Community Psychol. 2005;22:617–638.

13. Kraemer HC, Measelle JR, Ablow JC, Essex MJ, Boyce WT, Kupfer DJ. A new approach to integrating data from multiple informants in psychiatric assessment and research: mixing and matching contexts and perspectives. Am J Psychiatry. 2003;160:1566–1577.

14. Bollen KA. Structural Equations With Latent Variables. New York: John Wiley & Sons; 1989.

15. Cheng CL, Van Ness J. Statistical Regression With Measurement Error. London: Arnold Publishers; 1999.

16. Jöreskog KG. Statistical analysis of sets of congeneric tests. Psychometrika. 1971;36:109–133.

17. Anderson JC. A measurement model to assess measurement-specific factors in multiple-informant research. J Mark Res. 1985;22:86–92.

18. Phillips LW. Assessing measurement error in key informant reports: A methodological note on organizational analysis in marketing. J Mark Res. 1981;18:395–415.

19. Kumar A, Dillon WR. On the use of confirmatory measurement models in the analysis of multiple informant reports. J Mark Res. 1990;27:102–111.

20. Hernández B, Gortmaker SL, Laird NM, Colditz GA, Parra-Cabrera S, Peterson KE. Validity and reproducibility of a physical activity and inactivity questionnaire for Mexico City's schoolchildren. Salud Publica Mex. 2000;42:315–323.

21. Schmidt FL. The relative efficiency of regression and simple unit predictor weights in applied differential psychology. Educ Psychol Meas. 1971;31:699–714.