The Editors' Notepad

The goal of this blog is to help EPIDEMIOLOGY authors produce papers that clearly and effectively communicate their science.

Tuesday, August 30, 2016

How to write an effective Results section

The typical outcomes paper in epidemiology usually involves a lot of numbers – multiple exposures and measures of exposures, subgroup analyses, and alternative modeling strategies. The standard of practice when making statistical comparisons is to place an effect estimate within a confidence interval, rather than using a p-value (Epidemiology generally only allows p-values for tests of trend or heterogeneity, and even then strongly discourages comparison with a Type 1 error rate). Outcomes papers thus tend to have three or four tables of data, often with more online, each with up to a dozen columns, but organized in intuitive, digestible, easy-to-follow chunks. If figures are possible, so much the better.

Writing the text of the Results section to summarize the tables and figures may feel like an afterthought. But it is still important, in part because you, as a researcher, know your data better than anyone else, and also because not all readers absorb information the same way. So it’s worth your time to think about what you want to highlight (hint: go beyond the obvious statements along the lines of x was associated with y, z was not associated with y).

I hope you’ll agree it’s also important to make the Results section appealing and useful to read. Many Results sections fail to mention the descriptive findings at all. These, however, help to put the study into context. How many people were eligible, how many participated, how many cases were observed, and what were the patterns of missingness? These and similar questions immediately help the reader to understand who was studied and the quality of the evidence.

When transitioning to internal comparisons, one element to keep in mind is context. Even if you’ve done so in the Methods section, precede each result you give with a hint of what you were looking for in that step of your analysis. Just as important is the flow of language. Of course we don’t expect an epi Results section to read like Walt Whitman, but you’d be surprised how a strategy regarding the presentation of data can improve how well the reader engages with it.

I’ll start with an example of a sentence that, while not particularly long, is seriously hard work to get through:

Similar results were found for lung cancer, colorectal cancer, and breast cancer: lower consumption of jelly beans was associated with an estimated 4%-8% lower hazard ratio (95%CI 0.67 to 1.22, 0.76 to 1.34, and 0.92 to 1.13, respectively), although these estimates were imprecise.

Do you see how you have to go back and forth from the outcomes in the first line to the confidence intervals in the third to match them up, because of the “respectively” device? In addition, it’s hard to parse that range of percentages of lower risk: if there are only three outcomes, why not just give all three? (More about the imprecise estimates below.) To simplify, keep each outcome in the same phrase as its data:

Consumption of jelly beans was associated with a 4% lower hazard ratio (95% CI 0.67, 1.22) of lung cancer, 7% lower risk (95% CI 0.76, 1.34) of colorectal cancer, and 8% lower risk (95% CI 0.92, 1.13) of breast cancer, although the estimates were imprecise.

A second concern is the use of the percentage hazard ratio. It is too easily confused with a difference estimate of association, when in fact the associations are estimated on the ratio scale. Furthermore, it has different units than the CI, so you can’t automatically place it within the interval. An even better revision would be:

The hazard ratio associating consumption of jelly beans with lung cancer was 0.96 (95% CI 0.67, 1.22), with colorectal cancer was 0.93 (95% CI 0.76, 1.34), and with breast cancer was 0.92 (95% CI 0.92, 1.13), although the estimates were imprecise.
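If you generate Results text from your analysis output, the two points above (report on the ratio scale, and keep each outcome with its own data) can be sketched in a few lines. This is only an illustration: the helper names `percent_lower_to_ratio` and `format_estimate` are hypothetical, and the numbers are the ones from the jelly bean example.

```python
# Hypothetical helpers for illustration; the names are not from any library.

def percent_lower_to_ratio(percent_lower):
    """Convert 'x% lower' wording to the ratio scale, e.g. 4 -> 0.96."""
    return 1 - percent_lower / 100


def format_estimate(outcome, ratio, lower, upper):
    """Keep the outcome in the same phrase as its estimate and interval."""
    return f"{outcome}: HR {ratio:.2f} (95% CI {lower:.2f}, {upper:.2f})"


# Values taken from the jelly bean example in the text.
results = [
    ("lung cancer", percent_lower_to_ratio(4), 0.67, 1.22),
    ("colorectal cancer", percent_lower_to_ratio(7), 0.76, 1.34),
    ("breast cancer", percent_lower_to_ratio(8), 0.92, 1.13),
]

# One phrase per outcome, so no "respectively" is ever needed.
lines = [format_estimate(*row) for row in results]
for line in lines:
    print(line)
```

Because the estimate and the interval are now in the same units, a reader can check at a glance where the estimate falls relative to its limits.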

Next, I hope this idea is not too radical, but consider not putting data in a sentence at all: leave the numbers in the table, if possible, and describe the results in words. That way, a reader can first read your simple summary, and then turn to the tables to pick out the details for him or herself. This strategy works best for secondary findings; results pertaining to the primary aim should always be reported with data. Revising the report of these secondary findings, the edit of the sentence would be:

Consumption of jelly beans was associated with imprecisely measured decreased hazards of lung, colorectal, and breast cancer (Table 3).

Finally, what exactly do the authors mean when they say that the estimates were imprecisely measured? The intervals were actually fairly narrow. We suspect they mean that the intervals include the null, which has nothing to do with the precision. The final, zen edit of the troublesome sentence would be:

The hazard ratios associating jelly beans with the incidence of lung, colorectal, and breast cancer were all near null (Table 3).
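The distinction above, between an interval that includes the null and an interval that is imprecise, can be checked separately for each estimate. The sketch below is an assumption on my part rather than anything the example authors did: it uses the ratio of the upper to the lower confidence limit as one possible summary of precision on the ratio scale, and the function names are hypothetical.

```python
# Hypothetical helpers; the precision summary chosen here is an assumption.

def includes_null(lower, upper, null=1.0):
    """True when the null value for a ratio measure lies inside the interval."""
    return lower <= null <= upper


def confidence_limit_ratio(lower, upper):
    """Upper limit divided by lower limit; values nearer 1 mean a narrower interval."""
    return upper / lower


# Intervals taken from the jelly bean example in the text.
intervals = {
    "lung cancer": (0.67, 1.22),
    "colorectal cancer": (0.76, 1.34),
    "breast cancer": (0.92, 1.13),
}

# For each outcome: (does the interval include 1?, how wide is it?)
summary = {
    outcome: (includes_null(lo, hi), round(confidence_limit_ratio(lo, hi), 2))
    for outcome, (lo, hi) in intervals.items()
}

for outcome, (has_null, clr) in summary.items():
    print(f"{outcome}: includes null = {has_null}, limit ratio = {clr}")
```

All three intervals include 1, but that tells you nothing about their width. Whether a given width counts as precise enough is a judgment to make in context, not a threshold to test against.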

We invite you to look at a few outcomes papers and think about the above. Do you even read Results sections? If not, why not? What would you do differently? We’d be happy to discuss.

Take-home messages that will take you a long way toward a readable Results section:

  • Be sure to open the Results section with the descriptive findings.

  • As the topic sentence in each paragraph, provide a bit of context for each section of the analysis.

  • Keep the outcome with its data (avoid the dreaded “respectively”).

  • Break up long sentences containing a lot of data.

  • Be sure to use the measure of disease occurrence that you are estimating (“risk”, “rate”, “hazard”, etc).

  • For secondary findings, consider leaving effect estimates and confidence intervals out of the text altogether.

While the above recommendations are stylistic, here’s a reminder of a couple of additional requirements relevant to reporting of results in Epidemiology: Avoid causal language – verbs such as impact, affect, increase/decrease – in favor of the language of association. And avoid significance testing as follows:

  • Leave out p-values (except for tests of trend and heterogeneity, but even then do not compare with an acceptable Type 1 error rate)

  • Instead of “x was not significantly associated with y,” just say “x was not associated with y” or “x was associated with an imprecisely measured increase/decrease in y” or “the association of x with y was near null”

  • Avoid the word “significant” in non-statistical senses of the word, and instead choose from the less-loaded words “considerable,” “important,” “material,” “appreciable,” or “substantial.”

Null results are good! We have recently published an editorial seeking persuasively null results. You might even edit the result in the example one step further:

Consumption of jelly beans was not associated with decreased hazard of lung, colorectal, or breast cancer (Table 3).

Sorry, jelly beans.