To the Editor:
We were intrigued by the inconclusive results presented in the recent article by Rinaldo et al.1 The authors investigated the association between the body mass index (BMI) and the risks of delayed cerebral ischemia (DCI), delayed cerebral infarction, and poor functional outcome. They found that although increased BMI can act protectively against DCI and infarction, it has no influence on the outcome.
On closer inspection, some apparent contradictions can be resolved, but further questions arise. We also have doubts regarding the statistical methodology.
First, there is a discrepancy in the P-value concerning the influence of increased BMI on delayed infarction: in the text, a value of .013 is given, whereas Table 1 specifies .020. Based on the descriptive statistics given in the paper, the latter seems more plausible (chi-squared test without Yates's continuity correction). We would like to note that for such a small number of events (2 observed and 6.24 expected infarctions for high BMI), the chi-squared test without correction is likely to be too optimistic, underestimating the true P-value. Applying Yates's correction leads to a P-value of .04, but that is probably too conservative. Fisher's exact test might have been a better choice in this case.
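To make the comparison of the 3 tests concrete, the following is a minimal Python sketch using only the standard library. The 2 × 2 counts are hypothetical and are not taken from the article; they were chosen only so that one cell is small. The function names `chi2_p` and `fisher_p` are our own illustrative helpers, not part of any statistics package.

```python
from math import comb, erfc, sqrt


def chi2_p(a, b, c, d, yates=False):
    """P-value of the chi-squared test (1 df) for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        # Yates's continuity correction shrinks the deviation by n/2.
        diff = max(0.0, diff - n / 2)
    stat = n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of the chi-squared distribution with 1 degree of freedom.
    return erfc(sqrt(stat / 2))


def fisher_p(a, b, c, d):
    """Two-sided Fisher's exact test P-value for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, col1 = a + b, a + c
    denom = comb(n, row1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(col1, x) * comb(n - col1, row1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, row1 + col1 - n), min(row1, col1)
    # Sum all tables that are at most as probable as the observed one.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


# HYPOTHETICAL counts (not the article's data):
# rows = high BMI / other, columns = infarction / no infarction.
table = (2, 48, 30, 170)

p_plain = chi2_p(*table)
p_yates = chi2_p(*table, yates=True)
p_fisher = fisher_p(*table)
print(f"uncorrected: {p_plain:.3f}  Yates: {p_yates:.3f}  Fisher: {p_fisher:.3f}")
```

Because the correction can only reduce the test statistic, the Yates-corrected P-value is never smaller than the uncorrected one. The exact test avoids the large-sample approximation altogether.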
Considering that the BMI is the primary subject of investigation in the article, we find it surprising that the authors have chosen to dichotomize it, thus losing much information contained in the variable. Is there a particular reason why the BMI was used as a continuous-valued variable in the linear regression, but not in the logistic regression?
We would also like to point out that the authors performed 26 tests (13 potential predictors for DCI and the same 13 for infarction) in the univariate logistic regression analysis. With such a large number of tests, the probability of making at least one type I error is high (94% at a per-test significance level of .1, if the tests had been independent); thus, some kind of error rate control would have been warranted. Put differently, had the authors set the significance level for the univariate analysis to .1 for a single test and corrected it for the number of tests, the significance level would have dropped well below .01. Had a conservative correction been used, such as Holm or Bonferroni, only maximum velocity on transcranial Doppler (TCD) and intracerebral hemorrhage would have passed the screening.
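The arithmetic behind these figures can be checked with a short Python sketch. It assumes independent tests, as stated above, and the function names are our own. The P-values passed to the Holm procedure at the end are hypothetical placeholders, since the individual univariate P-values are not reproduced in this letter.

```python
def familywise_error_rate(alpha, n_tests):
    """Probability of at least one type I error among n independent tests."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level after Bonferroni correction."""
    return alpha / n_tests


def holm_reject(p_values, alpha):
    """Holm step-down procedure; returns one reject flag per input P-value."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, i in enumerate(order):
        # The k-th smallest P-value (k = rank + 1) is compared with alpha / (m - k + 1).
        if p_values[i] > alpha / (m - rank):
            break  # stop at the first non-rejection
        reject[i] = True
    return reject


fwer = familywise_error_rate(0.1, 26)    # 26 tests at a per-test level of .1
threshold = bonferroni_threshold(0.1, 26)
print(f"family-wise error rate: {fwer:.2f}")
print(f"Bonferroni per-test level: {threshold:.4f}")

# HYPOTHETICAL P-values, for illustration of the procedure only.
print(holm_reject([0.001, 0.04, 0.03], alpha=0.05))
```

Holm's procedure is uniformly at least as powerful as Bonferroni's while controlling the same family-wise error rate, which is why both are named as conservative options.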
It is known that BMI is a major risk factor for hypertension.2 In other words, the 2 variables share common information. We would therefore expect the P-value for the former (.017 and .013 for DCI and infarction, respectively) not to be so different from the P-value for the latter (.73 and .25). We think this unexpected result should have been addressed in the article.
The univariate analysis, as performed by the authors, identified 6 variables as potentially good predictors (P < .1) to be investigated further. The authors then used multivariate analysis with stepwise elimination of irrelevant variables. For DCI, the analysis identified BMI and maximum velocity on TCD as the only significant (P < .05) predictors. Here, again, no correction for multiple testing was applied. Because the P-value for BMI (.03) is relatively close to the chosen significance level, the apparent significance of BMI could have been due to statistical noise in the data. From that perspective, the lack of association between BMI and poor outcome is less surprising. For delayed cerebral infarction, the P-value for BMI (.008) seems to be robust against the effects of multiple testing, but, owing to the above-mentioned issues in the univariate analysis, it is questionable whether BMI should have reached the multivariate analysis at all. It is known that screening procedures are prone to misidentifying significant variables if no correction is applied.3
Furthermore, to allow the results to be interpreted correctly and without bias, the modeling methodology and its presentation could be more accurate.4 The present manuscript does not provide an overall performance measure. The concordance (or c) statistic is a measure of discriminative ability.5 To assess goodness of fit, a different procedure, eg, the Hosmer–Lemeshow test,6 should be used. Also, the model performance is likely overestimated because no correction for optimism was performed. In other words, the model is likely to perform worse on the population than on the sample used for its development. To obtain more stable model results, internal validation, eg, by bootstrapping, is highly recommended.7
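For readers less familiar with the c statistic, the following minimal Python sketch computes it directly from its definition: the proportion of all event/non-event pairs in which the patient with the event received the higher predicted risk, with ties counted as one half. The predicted risks and outcomes are hypothetical, and the function name `c_statistic` is our own. The sketch covers discrimination only, not calibration or optimism correction.

```python
def c_statistic(risks, outcomes):
    """Concordance statistic for predicted risks and binary outcomes (1 = event)."""
    events = [r for r, y in zip(risks, outcomes) if y == 1]
    non_events = [r for r, y in zip(risks, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("both outcome classes are required")
    score = 0.0
    for e in events:
        for ne in non_events:
            if e > ne:
                score += 1.0   # concordant pair
            elif e == ne:
                score += 0.5   # tied pair counts as one half
    return score / (len(events) * len(non_events))


# HYPOTHETICAL predicted risks and observed outcomes, for illustration only.
risks = [0.10, 0.40, 0.35, 0.80]
outcomes = [0, 0, 1, 1]
print(c_statistic(risks, outcomes))
```

A value of 0.5 corresponds to no discrimination and 1.0 to perfect discrimination. For a binary outcome, the c statistic equals the area under the receiver operating characteristic curve.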
To summarize: Based on the published data and their statistical analysis, we find that an association between BMI on the one hand and DCI and delayed cerebral infarction on the other was not sufficiently shown.
The authors have no personal, financial, or institutional interest in any of the drugs, materials, or devices described in this article.
1. Rinaldo L, Rabinstein AA, Lanzino G. Increased body mass index associated with reduced risk of delayed cerebral ischemia and subsequent infarction after aneurysmal subarachnoid hemorrhage. Neurosurgery. Published online ahead of print April 4, 2018. doi: 10.1093/neuros/nyy104.
2. Brown CD, Higgins M, Donato KA, et al. Body mass index and the prevalence of hypertension and dyslipidemia. Obes Res. 2000;8(9):605–619.
3. Freedman D. A note on screening regression equations. Am Stat. 1983;37(2):152–155.
4. Steyerberg EW. Clinical Prediction Models: A Practical Approach to Development, Validation, and Updating. New York (NY): Springer; 2009.
5. Harrell FE Jr, Lee KL, Califf RM, Pryor DB, Rosati RA. Regression modelling strategies for improved prognostic prediction. Stat Med. 1984;3(2):143–152.
6. Hosmer DW, Lemeshow S. Applied Logistic Regression. 2nd ed. New York (NY): Wiley; 2000.
7. Steyerberg EW, Harrell FE Jr, Borsboom GJJM, Eijkemans MJC, Vergouwe Y, Habbema JDF. Internal validation of predictive models. J Clin Epidemiol. 2001;54(8):774–781.