# On a Square-Root Transformation of the Odds Ratio for a Common Outcome

VanderWeele, Tyler J.

doi: 10.1097/EDE.0000000000000733
Letters

Supplemental Digital Content is available in the text.

Department of Epidemiology, Harvard T.H. Chan School of Public Health, Harvard University, Boston, MA, tvanderw@hsph.harvard.edu

## To the Editor:

It is well known that for a common outcome, the magnitude of an odds ratio (OR) relating an exposure to that outcome can substantially exceed the corresponding risk ratio (RR) when, for example, analyzing cohort data. When an outcome is rare (10% is often used as a cutoff), the OR closely approximates the RR and is often interpreted as an RR. But when the outcome is common, an OR interpreted as an RR can vastly exaggerate the RR, and such an interpretation is never optimal. Because logistic regression is often the tool of choice for multivariate control, the reporting of ORs, even when the outcome is common, is routine. Although numerous methods have been developed to estimate RRs for a common outcome while still allowing for covariate control,1,2 these methods continue to be used infrequently.2 The practice of reporting ORs for common outcomes remains frequent in the biomedical literature, and an intuitive understanding of the magnitude of the OR in such settings is more difficult. This letter proposes a simple transformation of an OR for a common outcome that, in the vast majority of settings, yields a quantity that is far closer to the RR. The purpose of this letter is not to suggest that the methods for estimating RRs for common outcomes should not be used; rather, it is intended to assist in the interpretation of ORs for common outcomes when they are in fact reported in papers.

The proposed transformation is a simple one: take the square root of the OR estimate. Thus, as a somewhat better approximation to the RR, an OR of 2 becomes 1.41, an OR of 4 becomes 2, an OR of 9 becomes 3, and so on. I will provide brief motivation for this transformation and then discuss some properties related to its performance as a quantity that more closely approximates the RR.
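As a quick illustration of the arithmetic, the following Python sketch applies the transformation to the three ORs just mentioned. The helper name `sqrt_or` is introduced here purely for illustration and is not part of the letter.

```python
import math

def sqrt_or(odds_ratio: float) -> float:
    """Square-root transformation of an odds ratio (illustrative helper)."""
    if odds_ratio <= 0:
        raise ValueError("an odds ratio must be positive")
    return math.sqrt(odds_ratio)

# The three examples from the text: ORs of 2, 4, and 9.
for odds_ratio in (2, 4, 9):
    print(f"OR = {odds_ratio}: sqrt(OR) = {sqrt_or(odds_ratio):.2f}")
```

The same function can be applied to an OR below 1 (a protective exposure), in which case the square root moves the estimate up toward 1 rather than down.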

First, consider a setting in which the outcome probability for the exposed is some quantity $w$ above 0.5 and the outcome probability for the unexposed is that same quantity $w$ below 0.5, so that the probabilities for the exposed and unexposed are $p_1 = 0.5 + w$ and $p_0 = 0.5 - w$, respectively. We then have $\mathrm{RR} = \frac{0.5 + w}{0.5 - w}$ and $\mathrm{OR} = \left(\frac{0.5 + w}{0.5 - w}\right)^2$. In this case, the OR is exactly the square of the RR and taking the square root recovers the RR. It turns out that this same transformation works surprisingly well for most values of the outcome probabilities when the outcome is common.
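The algebra behind this special case can be spelled out in one step. Writing $p_1$ and $p_0$ for the outcome probabilities of the exposed and unexposed (notation assumed here), the OR factors into the RR times a ratio of complements:

$$
\mathrm{OR} = \frac{p_1/(1-p_1)}{p_0/(1-p_0)} = \frac{p_1}{p_0}\cdot\frac{1-p_0}{1-p_1} = \mathrm{RR}\cdot\frac{1-p_0}{1-p_1}.
$$

With $p_1 = 0.5 + w$ and $p_0 = 0.5 - w$, we have $1 - p_0 = p_1$ and $1 - p_1 = p_0$, so the second factor is again the RR:

$$
\mathrm{OR} = \mathrm{RR}\cdot\frac{p_1}{p_0} = \mathrm{RR}^2 = \left(\frac{0.5 + w}{0.5 - w}\right)^2 .
$$

More generally, the same identity shows that $\sqrt{\mathrm{OR}} = \mathrm{RR}$ whenever $p_1 + p_0 = 1$, that is, whenever the two outcome probabilities average to 0.5.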

Let us begin with a causative exposure so that $p_1 > p_0$. Suppose first that both $p_1$ and $p_0$ are between 0.2 and 0.8. In this case, the OR can be as much as 4 times the RR, that is, inflated by a factor as large as 400% (e.g., with $p_1 = 0.8$, $p_0 = 0.2$, we have RR = 4 but OR = 16); however, it can be shown (see the eAppendix; http://links.lww.com/EDE/B255 for mathematical proofs of all claims) that $\sqrt{\mathrm{OR}}$ can exceed the RR by at most 25% (e.g., with $p_1 = 0.8$, $p_0 = 0.5$, we have RR = 1.6 and $\sqrt{\mathrm{OR}} = 2$). With outcome probabilities $p_1$ and $p_0$ between 0.2 and 0.8, the square root of the OR will be at most 25% away from the RR. When both $p_1$ and $p_0$ are between 0.1 and 0.9, the OR can be as much as 9 times the RR, that is, inflated by a factor as large as 900% (e.g., with $p_1 = 0.9$, $p_0 = 0.1$, we have RR = 9 but OR = 81), but the square root of the OR can exceed the RR by at most 67% (e.g., with $p_1 = 0.9$, $p_0 = 0.5$, we have RR = 1.8 and $\sqrt{\mathrm{OR}} = 3$). The square root transformation reduces the inflation dramatically, and as above, when the risks for the exposed and unexposed average to 0.5, the transformation negates the bias exactly. More substantial inflation can occur when the outcome probabilities exceed 0.9, but the square root transformation will still provide an improvement as an approximation to the RR.
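The bounds above are proved in the eAppendix, but they can also be checked numerically. The sketch below is only a brute-force grid search, not the proof: it scans pairs of outcome probabilities with the unexposed risk below the exposed risk and records the largest ratios OR/RR and sqrt(OR)/RR. The function names and the grid resolution are assumptions made for this illustration.

```python
import math

def risk_ratio(p1: float, p0: float) -> float:
    """Risk ratio for outcome probabilities p1 (exposed) and p0 (unexposed)."""
    return p1 / p0

def odds_ratio(p1: float, p0: float) -> float:
    """Odds ratio for the same two outcome probabilities."""
    return (p1 / (1 - p1)) / (p0 / (1 - p0))

def max_inflation(lo: float, hi: float, steps: int = 100):
    """Largest OR/RR and sqrt(OR)/RR over a grid with lo <= p0 < p1 <= hi."""
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    worst_or = worst_sqrt = 1.0
    for i, p0 in enumerate(grid):
        for p1 in grid[i + 1:]:  # causative exposure: p1 > p0
            rr, orr = risk_ratio(p1, p0), odds_ratio(p1, p0)
            worst_or = max(worst_or, orr / rr)
            worst_sqrt = max(worst_sqrt, math.sqrt(orr) / rr)
    return worst_or, worst_sqrt

for lo, hi in ((0.2, 0.8), (0.1, 0.9)):
    worst_or, worst_sqrt = max_inflation(lo, hi)
    print(f"[{lo}, {hi}]: max OR/RR = {worst_or:.2f}, max sqrt(OR)/RR = {worst_sqrt:.2f}")
```

On this grid the maxima come out at roughly 4 and 1.25 for the range 0.2 to 0.8, and roughly 9 and 1.67 for the range 0.1 to 0.9, in line with the worked examples in the text.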

The square root transformation will in fact always deflate the OR toward the RR. It can in some circumstances overdeflate so that $\sqrt{\mathrm{OR}}$ is less than the RR (e.g., with $p_1 = 0.5$, $p_0 = 0.3$, RR = 1.67 and $\sqrt{\mathrm{OR}} = 1.53$), but once again, with $p_1$ and $p_0$ between 0.2 and 0.8, the maximum deflation will be by a factor of 1/1.25 (i.e., a 20% reduction), and with $p_1$ and $p_0$ between 0.1 and 0.9, the maximum deflation will be by a factor of 1/1.67 (i.e., a 40% reduction). Even in these circumstances in which $\sqrt{\mathrm{OR}}$ is deflated beyond the RR, the factor by which $\sqrt{\mathrm{OR}}$ is deflated beyond the RR will, in the vast majority of settings, be smaller than the factor by which the OR is inflated above the RR. The values of the outcome probabilities for which this is so, when both probabilities are above 0.1, are plotted in the Figure as the black area. When both outcome probabilities are above 0.1, the factor of inflation for the OR exceeds the factor of deflation for $\sqrt{\mathrm{OR}}$ for about 93% of possible outcome probabilities. When both probabilities are above 0.2, this is so for 99% of the possible outcome probabilities. When both probabilities are above 0.25, it is always the case. Analogous statements to all claims above also hold for protective exposures with $p_1 < p_0$.
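A rough numerical check of these proportions is sketched below. It is not a reproduction of the Figure: it assumes that "possible outcome probabilities" means pairs spread uniformly over a grid with the unexposed risk at or above the stated lower bound and the exposed risk strictly between the unexposed risk and 1, and it compares the multiplicative factor by which each estimate is off from the RR. The function names, grid step, and upper limit are all assumptions of this sketch, so the shares it prints can differ slightly from the percentages quoted above.

```python
import math

def off_factor(estimate: float, rr: float) -> float:
    """Multiplicative factor (>= 1) by which an estimate differs from the RR."""
    return max(estimate / rr, rr / estimate)

def share_sqrt_closer(lower: float, step: float = 0.005) -> float:
    """Share of grid pairs lower <= p0 < p1 < 1 where sqrt(OR) beats OR on the ratio scale."""
    # Integer indices avoid accumulating floating-point error in the grid.
    first, last = round(lower / step), round(1 / step)
    wins = total = 0
    for i in range(first, last):
        p0 = i * step
        for j in range(i + 1, last):
            p1 = j * step
            rr = p1 / p0
            orr = (p1 / (1 - p1)) / (p0 / (1 - p0))
            total += 1
            if off_factor(math.sqrt(orr), rr) < off_factor(orr, rr):
                wins += 1
    return wins / total

for lower in (0.1, 0.2, 0.25):
    print(f"both probabilities >= {lower}: share = {share_sqrt_closer(lower):.3f}")
```

Pairs in which sqrt(OR) still lies above the RR always count in favor of the transformation, since the OR is further above the RR than its square root; the comparison is only in doubt where sqrt(OR) overdeflates.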

Ratio scales are sometimes converted into excess relative risk measures for the purposes of obtaining measures of public health significance.3,4 For these purposes, it is not the ratio of the RR to the OR or $\sqrt{\mathrm{OR}}$ that matters, but the differences between these quantities. Once again, the square root transformation is superior in the vast majority of settings. It always deflates the OR toward the RR; it can sometimes overdeflate, but, even then, in the vast majority of cases, the absolute difference $|\sqrt{\mathrm{OR}} - \mathrm{RR}|$ is smaller than the absolute difference $|\mathrm{OR} - \mathrm{RR}|$. For causative exposures, when both outcome probabilities are between 0.2 and 0.8, the absolute difference for the OR can be as large as 12, but for $\sqrt{\mathrm{OR}}$, only as large as 0.55; when both outcome probabilities are between 0.1 and 0.9, the absolute difference for the OR can be as large as 72, but only as large as 2.43 for $\sqrt{\mathrm{OR}}$. For causative exposures, the square root transformation has a smaller absolute difference 95% of the time if both outcome probabilities are above 0.1 and 99% of the time if both outcome probabilities are above 0.2. For protective exposures, with $p_1 < p_0$, the square root transformation has a smaller absolute difference 90% of the time if both outcome probabilities are above 0.1 and 98% of the time if both outcome probabilities are above 0.2.
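The maximum absolute differences quoted for causative exposures can be checked with the same kind of brute-force grid search. The sketch below is illustrative only; the function name and the grid resolution are assumptions, and the grid maximum for sqrt(OR) is an approximation to the true maximum.

```python
import math

def max_abs_differences(lo: float, hi: float, steps: int = 200):
    """Largest |OR - RR| and |sqrt(OR) - RR| over a grid with lo <= p0 < p1 <= hi."""
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    max_or = max_sqrt = 0.0
    for i, p0 in enumerate(grid):
        for p1 in grid[i + 1:]:  # causative exposure: p1 > p0
            rr = p1 / p0
            orr = (p1 / (1 - p1)) / (p0 / (1 - p0))
            max_or = max(max_or, abs(orr - rr))
            max_sqrt = max(max_sqrt, abs(math.sqrt(orr) - rr))
    return max_or, max_sqrt

for lo, hi in ((0.2, 0.8), (0.1, 0.9)):
    max_or, max_sqrt = max_abs_differences(lo, hi)
    print(f"[{lo}, {hi}]: max |OR - RR| = {max_or:.2f}, max |sqrt(OR) - RR| = {max_sqrt:.2f}")
```

The largest difference for the OR occurs at the corner of the range (for example, 16 minus 4 at risks of 0.8 and 0.2), whereas the largest difference for sqrt(OR) arises from overdeflation when the unexposed risk is at the lower bound.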

Again, the square-root transformation is much closer to the RR in almost all scenarios and provides a somewhat reasonable approximation to the RR. As a rule of thumb, one might suggest that when the prevalence of the outcome is above 20%, the square-root approximation is preferable. The transformation may, thus, be of use with randomized trial, cohort, or cross-sectional data or with case–control data with cumulative sampling. Case–control studies with incidence density sampling, however, provide a direct estimate of the incidence rate ratio,3 and further discussion of rate ratios and proportional hazards models is given in the eAppendix; http://links.lww.com/EDE/B255. The transformation proposed here may also be of interest in the interpretation of the results of meta-analyses. In meta-analyses, approximate conversions are typically made between standardized effect sizes and log ORs.5,6 The approximations employed effectively assume common outcome probabilities and do not perform well when the outcome probabilities are very small or very large.7 The conversions that are used in meta-analyses are, thus, applicable precisely when the outcome is common and effectively deliver ORs assuming a common outcome; conversion of these to approximate RRs could once again be obtained by applying the square-root transformation. Again, the purpose of this letter is not to displace methods that estimate RRs for common outcomes but rather to aid the interpretation of OR estimates for common outcomes already reported in the literature.
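For the meta-analysis setting, the chain of conversions might be sketched as follows. The letter does not state which conversion it has in mind; this sketch assumes the commonly used logistic-distribution approximation ln(OR) = d × π/√3 for a standardized mean difference d, and the function names are hypothetical.

```python
import math

def d_to_or(d: float) -> float:
    """Convert a standardized mean difference d to an OR, assuming ln(OR) = d * pi / sqrt(3)."""
    return math.exp(d * math.pi / math.sqrt(3))

def d_to_approx_rr(d: float) -> float:
    """Apply the square-root transformation to the converted OR as an approximate RR."""
    return math.sqrt(d_to_or(d))

for d in (0.2, 0.5, 0.8):
    print(f"d = {d}: OR = {d_to_or(d):.2f}, sqrt(OR) = {d_to_approx_rr(d):.2f}")
```

As with the transformation itself, the result is meant only as an aid to interpretation when the outcome is common, not as a substitute for direct estimation of the RR.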

Tyler J. VanderWeele

Department of Epidemiology

Harvard T.H. Chan School of Public Health

Harvard University

Boston, MA

tvanderw@hsph.harvard.edu

## REFERENCES

1. Knol MJ, Le Cessie S, Algra A, Vandenbroucke JP, Groenwold RH. Overestimation of risk ratios by odds ratios in trials and cohort studies: alternatives to logistic regression. CMAJ. 2012;184:895–899.
2. Yelland LN, Salter AB, Ryan P. Relative risk estimation in randomized controlled trials: a comparison of methods for independent observations. Int J Biostat. 2011;7(1).
3. Rothman KJ, Greenland S, Lash TL. Modern Epidemiology. 3rd ed. Philadelphia: Lippincott; 2008.
4. VanderWeele TJ. Explanation in Causal Inference: Methods for Mediation and Interaction. New York: Oxford University Press; 2015.
5. Borenstein M, Hedges LV, Higgins JPT, Rothstein HR. Chapter 7 in Introduction to Meta-Analysis. 1st ed. Hoboken, NJ: Wiley; 2009.
6. Hasselblad V, Hedges LV. Meta-analysis of screening and diagnostic tests. Psychol Bull. 1995;117:167–178.
7. Anzures-Cabrera J, Sarpatwari A, Higgins JP. Expressing findings from meta-analyses of continuous outcomes in terms of risks. Stat Med. 2011;30:2967–2985.