Because we’ve written recently about figures, it seemed a good time to address the other aspect of data presentation: tables. If you’ve been reading this blog regularly, it won’t surprise you to know that our first priority for tables in Epidemiology is clarity. And if you’ve read from the beginning, you may remember that I once confessed to my rusty, out-of-date knowledge of methodology.
Well, I have another confession: I find it hard to look at big, cluttered tables of numbers. When I was analyzing data for my dissertation, I would have to extract data from SAS crosstab outputs that were cluttered and complicated, and I would get really anxious. I wasn’t sure which numbers I wanted, I often grabbed the wrong one, and sometimes I would switch digits. I often asked a colleague to check my work. To this day, I find it challenging to check table data versus text when the tables are not laid out clearly.
Reasonable epidemiologists can differ on their general preference for presenting data in figures versus tables. At Epidemiology, though, if a figure and a table convey the data equally effectively, we prefer you use a table to gain the advantage of the exact data. However, for the reason I mentioned just above, the visual characteristics of a table matter, and journal style and statistical requirements play into table presentation as well, so here are our recommendations:
In general, aim to have a table fit on no more than one whole vertical page; a page in Epidemiology is about 275 mm x 200 mm, minus margins. Tables can spill over onto another page, or be printed at 90 degrees to the main text, but such scenarios (for example, the short but wide table) often end up wasting space on the page. That said, a full-page table is pretty big, and we always appreciate your willingness to manage the size and number of tables by putting data in supplementary digital content (SDC). In a recently published meta-analysis, half of the vertical space in a forest plot (more of a hybrid table-figure) consisted of lettered footnotes indicating adjustment variables. The author was willing to put these footnotes in SDC.
A main concern when we edit tables is that the authors have been too comprehensive. For example, we often receive tables that show the same estimate of association from several different models (e.g., crude, adjusted for age and sex, and adjusted for a more comprehensive set of control variables). We prefer that authors decide which model is most valid and precise, and present only that result. This brevity may allow a table to be deleted altogether, with results presented only in the text. The detailed information can always be presented in the SDC for readers interested in a more comprehensive view. The main point is that authors should evaluate the information in their tables to ensure that every included result is important for the reader to comprehend the study’s results. Information extraneous to that goal should be deleted or included as SDC.
Abbreviations and footnotes
Because of the limited size of column and row headings, we can often be more flexible about allowing abbreviations that in the main text we would ask you to expand, as long as they are defined in a non-lettered footnote.
Other than the one footnote that defines abbreviations, footnotes should follow our convention of superscript letters - a,b,c - rather than numbers (which may be confused with bibliographic citations) or symbols.
Precision and interpretability
We prefer two significant figures for most numbers, except when substantial statistical power allows more; we will flag significant figures exceeding two, but we do leave the final decision to the author. When we say two significant figures, we mean independent of the decimal, so:
XX, X.X, 0.XX
In other words, significant figures are not equivalent to decimal places. This holds for percentages, descriptive statistics, and risk estimates, including those on a log scale.
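For authors who prepare tables programmatically, here is a minimal sketch of rounding to two significant figures rather than to a fixed number of decimal places. The function name `format_sig` is hypothetical, and the choice of Python's `decimal` module with half-up rounding is an assumption for illustration, not a journal requirement.

```python
from decimal import Decimal, ROUND_HALF_UP

def format_sig(value, sig=2):
    """Format a number to `sig` significant figures (hypothetical helper)."""
    d = Decimal(str(value))
    if d == 0:
        return "0"

    def quantize_at(number, exponent):
        # exponent is the power of ten of the leading digit,
        # e.g. 1 for 12.34, 0 for 1.234, -1 for 0.1234
        quantum = Decimal(1).scaleb(exponent - sig + 1)
        return number.quantize(quantum, rounding=ROUND_HALF_UP)

    rounded = quantize_at(d, d.adjusted())
    # Rounding can add a leading digit (0.998 -> 1.00); round once more
    # so the result still has only `sig` significant figures.
    if rounded.adjusted() != d.adjusted():
        rounded = quantize_at(rounded, rounded.adjusted())
    return format(rounded, "f")

for x in (12.34, 1.234, 0.1234):
    print(x, "->", format_sig(x))
```

The three example inputs come out as 12, 1.2, and 0.12, matching the XX, X.X, 0.XX patterns above.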
A special note for descriptive tables, typically Table 1. We are aiming for tables that don’t mix categorical and continuous data. The most useful statistics for describing the distribution of a variable are quantiles, usually quartiles that include a median. Mixing categorical and continuous data can also make a table confusing to read. For these reasons, we prefer authors report any means and standard deviations (SD) in the text. We understand that, when presenting descriptive statistics within subgroups, it may be unwieldy to move means and SDs to text; a possible solution is to group continuous variables separately from categorical ones, so that the respective statistics can be clearly labeled.
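As a rough illustration of describing a continuous variable by quartiles, the sketch below uses Python's standard `statistics.quantiles`. The helper name `describe_continuous`, the made-up ages, and the "inclusive" interpolation method are all assumptions; statistical packages differ in how they interpolate quantiles, so results may not match other software exactly.

```python
from statistics import quantiles

def describe_continuous(values):
    """Return (Q1, median, Q3) for a list of numbers (hypothetical helper)."""
    # n=4 asks for the three cut points that split the data into quartiles.
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return q1, median, q3

ages = [34, 41, 45, 52, 58, 60, 63, 67, 71]  # made-up example data
q1, median, q3 = describe_continuous(ages)
print(f"Age, years: median {median:.0f} (quartiles {q1:.0f}, {q3:.0f})")
```

Reporting continuous variables this way, in their own block of rows, keeps them visually separate from the counts and percentages used for categorical variables.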