Letters

A Simple, Interpretable Conversion from Pearson’s Correlation to Cohen’s $d$ for Continuous Exposures

Mathur, Maya B.; VanderWeele, Tyler J.

doi: 10.1097/EDE.0000000000001105

To the Editor:

Meta-analysts often must convert effect sizes reported on different scales to a common scale for analysis.1 In particular, it is common to convert Pearson’s correlation, $r$, computed between an exposure $X$ and an outcome $Y$, to Cohen’s $d$ (also called the “standardized mean difference”), which is the difference in expected $Y$ for a fixed contrast in $X$, standardized by the standard deviation of $Y$ conditional on $X$. Letting $N$ denote the total sample size, the standard conversion1 from $r$ to $d$ is:

$$d = \frac{2r}{\sqrt{1 - r^2}}, \qquad \widehat{\text{SE}}(d) = \frac{2}{\sqrt{(N - 1)(1 - r^2)}} \tag{1.1}$$

An important, yet infrequently discussed, point is that this conversion was derived for a Pearson correlation computed between a binary exposure $X$ and a continuous outcome $Y$, also called a “point-biserial” correlation.2–4 Note that when $X$ represents a dichotomization of a truly continuous underlying exposure, a special approach3 is required to estimate the correlation between the underlying, continuous exposure and $Y$; one cannot simply apply the standard Pearson’s correlation formula to the observed, dichotomized $X$. Stated otherwise, the point-biserial correlation does not consistently estimate the Pearson correlation that would have been obtained using the underlying continuous variable.
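
To make the gap between the two correlations concrete, the following is a minimal sketch under an assumption the letter does not state at this point: $X$ and $Y$ are standard bivariate normal with correlation $\rho$, and the binary exposure is the indicator that $X$ exceeds a threshold $c$. In that case the population point-biserial correlation equals $\rho\,\phi(c)/\sqrt{p(1-p)}$ with $p = P(X > c)$. The function names are hypothetical.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


def point_biserial_from_rho(rho, c):
    """Population correlation between Y and the indicator 1{X > c}.

    Assumes (X, Y) is standard bivariate normal with correlation rho.
    Cov(Y, 1{X > c}) = rho * phi(c), and the indicator has
    standard deviation sqrt(p * (1 - p)) with p = P(X > c).
    """
    p = 1 - normal_cdf(c)
    return rho * normal_pdf(c) / math.sqrt(p * (1 - p))


if __name__ == "__main__":
    rho = 0.5  # illustrative value only
    for c in (0.0, 1.0, 2.0):
        r_pb = point_biserial_from_rho(rho, c)
        print(f"threshold {c:+.1f}: point-biserial = {r_pb:.3f} vs rho = {rho}")
```

Under these assumptions the point-biserial correlation is smaller in magnitude than $\rho$ at every threshold, and the shortfall grows as the threshold moves away from the mean.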

Despite the standard conversion’s origins in the binary-exposure setting, meta-analysts in practice often unknowingly apply Equation (1.1) to obtain Cohen’s $d$ from correlations and regression results computed using a continuous $X$. In fact, a widely referenced textbook on meta-analysis describes Equation (1.1) without stipulating that it is only known to apply for the point-biserial case.1 Even if Equation (1.1) can be used for correlations computed using a continuous $X$, its interpretation is unclear: that is, the interpretation of Cohen’s $d$ depends on the choice of “groups” in $X$ whose means are compared, but because Equation (1.1) applies for a correlation in which $X$ is already binary, it is not clear which “groups” of $X$ are created when the conversion is instead applied to a correlation using a continuous $X$.

To allow direct computation of Cohen’s $d$ from Pearson’s $r$ or simple linear regression, we provide a similar conversion and approximate standard error that apply when $X$ is continuous. The resulting effect size represents the average increase in the standardized $Y$ associated with an increase in $X$ of $\Delta$ units. To preserve the sign of the effect size, $\Delta$ should be set to be positive regardless of the sign of $r$. Letting $s_X$ denote the sample standard deviation of $X$, the conversion is:

$$d = \frac{r \Delta}{s_X \sqrt{1 - r^2}}, \qquad \widehat{\text{SE}}(d) = |d| \sqrt{\frac{1}{r^2 (N - 3)} + \frac{1}{2(N - 1)}} \tag{1.2}$$

Derivations of these estimates of $d$ and its standard error are provided in the eAppendix, http://links.lww.com/EDE/B601. If $r = 0$ exactly, the standard error estimate is undefined, so it could be replaced by $\lim_{r \to 0} \widehat{\text{SE}}(d)$, which is provided in the eAppendix. The standard error estimate assumes that $X$ is approximately normal and that $N$ is large. If the standard deviation of $X$ is known rather than estimated, then the term $\frac{1}{2(N - 1)}$ should be omitted. As a potential practical limitation, some papers to be meta-analyzed might not report $s_X$, in which case the meta-analyst might need to substitute an estimate from, for example, a comparable second study or a subsample of the study used to estimate $r$. In this case, the $N$ in the term $\frac{1}{2(N - 1)}$ should be replaced with the size of the second sample used to estimate $s_X$ (see Supplement, http://links.lww.com/EDE/B601). The conversion is easy to calculate manually or using the function r_to_d in the R package MetaUtility.
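
For readers who want to see the manual calculation spelled out, here is a minimal Python sketch. It is not the MetaUtility implementation, and its function and parameter names (`standard_r_to_d`, `continuous_r_to_d`, `n_sx`, `sx_known`) are hypothetical. It assumes the binary-exposure conversion is $d = 2r/\sqrt{1-r^2}$ with $\widehat{\text{SE}}(d) = 2/\sqrt{(N-1)(1-r^2)}$, and that the continuous-exposure conversion is $d = r\Delta/(s_X\sqrt{1-r^2})$ with $\widehat{\text{SE}}(d) = |d|\sqrt{1/(r^2(N-3)) + 1/(2(N-1))}$. The $r = 0$ branch uses the limit of that assumed expression, worked out here rather than taken from the eAppendix.

```python
import math


def standard_r_to_d(r, n):
    """Binary-exposure (point-biserial) conversion from r to Cohen's d."""
    d = 2 * r / math.sqrt(1 - r**2)
    se = 2 / math.sqrt((n - 1) * (1 - r**2))
    return d, se


def continuous_r_to_d(r, s_x, delta, n, n_sx=None, sx_known=False):
    """Continuous-exposure conversion for a contrast of `delta` units in X.

    r        : Pearson correlation between X and Y
    s_x      : standard deviation of X
    delta    : contrast in X (must be positive to preserve the sign of d)
    n        : sample size used to estimate r
    n_sx     : sample size used to estimate s_x (defaults to n)
    sx_known : if True, drop the term reflecting estimation of s_x
    """
    if delta <= 0:
        raise ValueError("delta should be positive")
    if n_sx is None:
        n_sx = n
    d = r * delta / (s_x * math.sqrt(1 - r**2))
    sx_term = 0.0 if sx_known else 1 / (2 * (n_sx - 1))
    if r == 0:
        # Limit of |d| * sqrt(1 / (r^2 (n - 3)) + sx_term) as r -> 0.
        se = (delta / s_x) / math.sqrt(n - 3)
    else:
        se = abs(d) * math.sqrt(1 / (r**2 * (n - 3)) + sx_term)
    return d, se


if __name__ == "__main__":
    # Illustrative inputs only: r = 0.5, N = 100, s_X = 2, Delta = 2 * s_X.
    d_std, se_std = standard_r_to_d(0.5, 100)
    d_con, se_con = continuous_r_to_d(0.5, s_x=2.0, delta=4.0, n=100)
    print(f"standard:   d = {d_std:.3f}, SE = {se_std:.3f}")
    print(f"continuous: d = {d_con:.3f}, SE = {se_con:.3f}")
```

With $\Delta = 2 s_X$ the two point estimates agree, while the two standard errors differ.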

Comparing Equations (1.1) and (1.2) clarifies the meaning of the “Cohen’s $d$” that results from unknowingly applying Equation (1.1) to a correlation computed with a continuous $X$. Specifically, the result coincides with the effect size associated with an increase in $X$ of two standard deviations. (However, even with $\Delta = 2 s_X$, the standard error estimates in Equations (1.1) and (1.2) will, in general, still not coincide.) In many applications, this may represent a rather extreme contrast: for example, if $X$ is normal, then a two-standard-deviation contrast with the reference level set to the mean would involve comparison to the 97.7th percentile of $X$. Alternatively, a two-standard-deviation contrast from one standard deviation below the mean to one standard deviation above is a comparison of the 15.9th percentile to the 84.1st percentile of $X$. Additionally, the absolute size of a two-standard-deviation contrast in $X$ may differ substantially across study populations and may therefore be challenging to interpret in practice.5 Thus, it is perhaps preferable, when possible, to instead fix a specific, scientifically meaningful contrast of interest, $\Delta$, which is held constant across all meta-analyzed studies, and then to apply the proposed conversion in Equation (1.2). The meta-analytic pooled estimate would then correspond to a well-defined contrast in $X$ of $\Delta$ units, rather than to a contrast whose size may vary arbitrarily across studies.
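
The percentiles and the cross-study issue can be checked with a short sketch. The two studies, their standard deviations, the correlation, and the chosen contrast below are hypothetical values picked only for illustration; the fixed-contrast estimate assumes the point-estimate form $d = r\Delta/(s_X\sqrt{1-r^2})$.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


# Percentiles of a normal X reached by two-standard-deviation contrasts.
pct_mean_plus_2sd = 100 * normal_cdf(2.0)
pct_minus_1sd = 100 * normal_cdf(-1.0)
pct_plus_1sd = 100 * normal_cdf(1.0)


def d_two_sd(r):
    """Standard conversion: implicitly a contrast of 2 * s_X in X."""
    return 2 * r / math.sqrt(1 - r**2)


def d_fixed(r, s_x, delta):
    """Conversion for a fixed contrast of `delta` units in X."""
    return r * delta / (s_x * math.sqrt(1 - r**2))


if __name__ == "__main__":
    print(f"mean + 2 SD: {pct_mean_plus_2sd:.1f}th percentile")
    print(f"-1 SD to +1 SD: {pct_minus_1sd:.1f}th to {pct_plus_1sd:.1f}th")

    r = 0.3       # hypothetical correlation, same in both studies
    delta = 10.0  # hypothetical contrast of interest, in units of X
    for name, s_x in {"study_a": 5.0, "study_b": 12.0}.items():
        print(
            f"{name}: 2-SD contrast = {2 * s_x:.0f} units, "
            f"d(2 SD) = {d_two_sd(r):.3f}, "
            f"d(delta = {delta:.0f}) = {d_fixed(r, s_x, delta):.3f}"
        )
```

In this sketch the two-standard-deviation estimate is identical in both hypothetical studies even though it corresponds to 10 units of $X$ in one and 24 units in the other, whereas the fixed-contrast estimate refers to the same 10 units in each.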

The standard conversion is alternatively sometimes described in terms of the contrast that arises from dichotomizing $X$ at a given threshold,1 yet in fact, the conversion often substantially overestimates the contrast produced by dichotomization, even at extreme thresholds of $X$. For example, we simulated bivariate normal data (

observations) where

and

, such that

. The standard conversion estimates

(Figure, dashed red line). We also calculated the true two-group Cohen’s $d$ arising from dichotomizing $X$ at various thresholds in

(Figure, solid black curve). The Figure shows that the “Cohen’s $d$” from the standard conversion is 47% larger than the true two-group $d$ arising from dichotomization at the mean and still overestimates the true two-group $d$ for extreme dichotomization thresholds near

or

. For example, for dichotomization at $X = 2$ (i.e., the 97.7th percentile), the standard conversion still overestimates the true two-group Cohen’s $d$ by 12%.
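
The size of this overestimation can also be explored without simulation. The sketch below is not the authors’ simulation: it swaps in closed-form truncated-normal moments for a standard bivariate normal $(X, Y)$, pools the within-group variances of $Y$ using population group proportions, and uses $\rho = 1/\sqrt{2}$ purely as an illustrative assumption. With that value the computed overestimates fall in the neighborhood of the 47% and 12% figures quoted above. All function names are hypothetical.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


def two_group_d(rho, c):
    """Population Cohen's d for Y comparing X > c with X <= c.

    Assumes (X, Y) is standard bivariate normal with correlation rho,
    so Y | X ~ N(rho * X, 1 - rho^2).
    """
    p_lo = normal_cdf(c)
    p_hi = 1 - p_lo
    mean_hi = normal_pdf(c) / p_hi       # E[X | X > c]
    mean_lo = -normal_pdf(c) / p_lo      # E[X | X <= c]
    var_hi = 1 + c * mean_hi - mean_hi**2  # Var(X | X > c)
    var_lo = 1 + c * mean_lo - mean_lo**2  # Var(X | X <= c)
    # Within-group variance of Y, then pooled by group proportion.
    vy_hi = 1 - rho**2 + rho**2 * var_hi
    vy_lo = 1 - rho**2 + rho**2 * var_lo
    pooled_sd = math.sqrt(p_hi * vy_hi + p_lo * vy_lo)
    return rho * (mean_hi - mean_lo) / pooled_sd


def overestimation_ratio(rho, c):
    """Standard-conversion d divided by the two-group d at threshold c."""
    d_standard = 2 * rho / math.sqrt(1 - rho**2)
    return d_standard / two_group_d(rho, c)


if __name__ == "__main__":
    rho = 1 / math.sqrt(2)  # illustrative assumption, not from the letter
    for c in (0.0, 1.0, 2.0):
        pct = 100 * (overestimation_ratio(rho, c) - 1)
        print(f"threshold {c:.1f}: standard conversion is {pct:.0f}% larger")
```

Because the calculation is analytic, it reflects population quantities rather than the sampling variability that a finite simulation would show.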

FIGURE. True two-group Cohen’s $d$ corresponding to dichotomizing a standard normal $X$ at varying points (solid black curve) versus “Cohen’s $d$” calculated from the standard conversion in Equation (1.1) (dashed red line).

In summary, when approximating Cohen’s $d$ from Pearson’s $r$ or simple linear regression with a continuous $X$, we caution against using conversions derived for a binary $X$. We provide a straightforward conversion designed to accommodate the case of a continuous $X$ through specification of a fixed contrast in $X$; we believe its use in meta-analysis would enable more precisely interpretable and scientifically meaningful effect sizes.

ACKNOWLEDGMENTS

An anonymous contributor on a statistics forum made helpful contributions to the proof of Lemma 2.1 (Supplement).

Maya B. Mathur

Department of Epidemiology

Harvard T. H. Chan School of Public Health

Boston, MA

Quantitative Sciences Unit

Stanford University

Palo Alto, CA

mmathur@stanford.edu

Tyler J. VanderWeele

Department of Epidemiology

Harvard T. H. Chan School of Public Health

Boston, MA

REFERENCES

1. Borenstein M, Hedges LV, Higgins J, Rothstein HR. Introduction to Meta-Analysis. Hoboken, NJ: Wiley Online Library; 2009.
2. McGrath RE, Meyer GJ. When effect sizes disagree: the case of r and d. Psychol Methods. 2006;11:386.
3. Jacobs P, Viechtbauer W. Estimation of the biserial correlation and its sampling variance for use in meta-analysis. Res Synth Methods. 2017;8:161–180.
4. Hunter JE, Schmidt FL. Methods of Meta-Analysis: Correcting Error and Bias in Research Findings. Newbury Park, CA: Sage; 2004.
5. Greenland S, Schlesselman JJ, Criqui MH. The fallacy of employing standardized regression coefficients and correlations as measures of effect. Am J Epidemiol. 1986;123:203–208.

Copyright © 2020 Wolters Kluwer Health, Inc. All rights reserved.