The Editors' Notepad

The goal of this blog is to help EPIDEMIOLOGY authors produce papers that clearly and effectively communicate their science.

Monday, November 8, 2010

Registration of observational research: a series of lively debates!
Two recent debates have addressed the registration of observational research (discussed in the September issue of this journal [1]). One debate was at the August meeting of the International Conference on Pharmacoepidemiology (ICPE) in Brighton, UK, and the second was at the September meeting of the American College of Epidemiology (ACE) in San Francisco. [Full disclosure: I took part in both.]
The idea of registering observational research was launched at the end of 2009 in a meeting organized by a group representing the European chemical industry [2] – an industry that feels epidemiology is behaving irresponsibly. Thereafter, the registration idea was enthusiastically embraced by Lancet [3] and BMJ [4] with arguments that reveal the great confusion that prevails when observational research is discussed and pitted against RCTs.
BMJ editors consider observational research 'vulnerable to bias and selective reporting': researchers 'may … craft a paper that selectively emphasises certain results, often those that are statistically significant or provocative'. In the future, BMJ will demand 'a clear statement of whether the hypothesis arose before or after the inspection of the data' (if afterwards, the journal will demand extra explanations), and they will ask 'whether the study was registered, and if registered whether the protocol was registered before data acquisition or analysis began'. BMJ’s reason is that they are interested only in papers that have clear and immediate clinical relevance.
Are we allowed to have new ideas while exploring existing data? At ICPE, the debate was about multiplicity in pharmacoepidemiology. The argument against multiple analyses of pharmacoepidemiologic data was defended by Stan Young and Stuart Pocock, based on the same reasoning that makes subgroup analyses ‘not done’ in randomized trials (RCTs). On the other side, Ken Rothman and Sonia Hernandez-Diaz argued that multiple analyses are a hallmark of good science: good science investigates several aspects of a question and is not limited to a single prespecified question and analysis. Epidemiologists learn during data analysis, in particular in large complex databases; they behave like lab scientists who adapt their experiments and change their protocols after seeing the results of the previous experiment.
Consider real epidemiology practice. Of course, we always tell our PhD students to have prespecified research questions and a prespecified plan when ‘attacking’ a data set. The reason is not to make the results more believable. The reason is to avoid getting lost in your data analysis: to know what you are doing, why you are doing it and where you came from, just as lab scientists keep notes of their experiments in lab journals.
Almost all science starts with a preconceived idea, and a lot of science will have some protocol. Think of archeologists. They will start digging somewhere with an idea in mind – otherwise they would not get funded. Suppose that while working at the terrain, they notice that the strange shape of the next hill is also promising. After a test dig, artefacts are found. Are they 'data dredgers' whose findings should be treated with suspicion?
At ACE in San Francisco, the debate session was about registration of observational research. The 'pro' position was defended by Douglas Weed, of DLW Consulting Services [5], who largely approved of the document of the chemical industry. In Weed’s view, true transparency was an obligation to society and meant making protocols available beforehand. On the other side, Richard Rothenberg (editor of the Annals of Epidemiology) felt that for journals to require registration would promote standardization and restrict an editor's mandate to foster innovation and creativity. I also spoke against registration, based on the premise that RCTs – which seem to be guiding beacons – are, in fact, scientifically the 'odd man out'. RCTs try to avoid multiple and post hoc analyses at all costs. These safeguards are necessary for the credibility of the small number of RCTs that usually suffices for drug approval. Indeed, the whims of an investigator who sees something interesting in the data of a single trial should not bear on medical decisions that have consequences for millions of patients. Registration of RCTs was set up as a stringent measure to avoid selective reporting, and rightly so.
Recently, Mark Parascandola defined 'epistemic risk': “In drawing an inferential conclusion or accepting a hypothesis as true, one takes on an ‘epistemic risk’ – the risk of being wrong.” [6]. The RCT procedure can be seen as minimizing epistemic risk – that is, minimizing the risk of a wrong answer for the key question. However, for a given amount of data, minimizing type I error increases type II error, and hence prevents us from seeing new things. It is not clear which error (type I or II) is worse when we try to explain Nature. Much good can come from an idea that initially lacks strong support, or that seems at first ‘useless’, or that while wrong leads to new insights. Maximal avoidance of type I error is contrary to an important aim of science: to discover new explanations.
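The trade-off between the two error types can be made concrete with a small calculation. The sketch below is only an illustration under stated assumptions, none of which come from the debates themselves: a one-sided z-test with known unit variance, a hypothetical true standardized effect of 0.2, a hypothetical sample size of 100, and an illustrative function name. With study size held fixed, it shows how the type II error grows as the type I error threshold is tightened.

```python
from statistics import NormalDist


def type_ii_error(alpha, effect_size, n):
    """Type II error (beta) of a one-sided z-test with known unit variance.

    alpha, effect_size and n are illustrative inputs, not values from any study.
    """
    std_normal = NormalDist()
    # Critical value: reject the null when the z statistic exceeds this.
    critical_z = std_normal.inv_cdf(1 - alpha)
    # Under the alternative, the z statistic is N(effect_size * sqrt(n), 1),
    # so beta is the probability that it still falls below the critical value.
    return std_normal.cdf(critical_z - effect_size * n ** 0.5)


# Hypothetical scenario: true standardized effect 0.2, 100 observations.
alphas = [0.10, 0.05, 0.01, 0.001]
betas = [type_ii_error(a, effect_size=0.2, n=100) for a in alphas]

for a, b in zip(alphas, betas):
    print(f"alpha = {a:<6} beta = {b:.3f}")
```

Under these assumptions, each stricter threshold yields a larger chance of missing a real effect; only more data (a larger n) lowers both errors at once.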
What seems to be happening is that the mantra of ‘type I error avoidance’ that serves RCTs so well is now indiscriminately carried over to observational research. When the BMJ editorial is followed to the letter, any new idea that occurs during data analysis should be registered first – and even then the researcher is cheating, since the idea occurred after seeing the data.
The support of Lancet and BMJ for registration rests on the premise that all sciences should behave like RCTs. Imagine telling a theoretical physicist, an evolutionary biologist, a molecular biologist or an astronomer that she should not publish any thought or finding other than the ones she had in mind several years earlier! Science requires publication of those insights that seem to carry us forward – not the whole history of all wrong ideas, mishaps and detours. The acceptance of your paper will come from others who explore the consequences of your ideas, and who look for alternative explanations (like bias and confounding). Often, this is a long process. When alternative explanations are ruled out in a credible way, observational data may lead to action – even regulation – as much as RCTs. Whether a particular hypothesis or analysis was prespecified plays no role in that process.
The debate on registration of observational research touches on the fundamentals of how scientific progress is made. No real surprise that this will be different for different sciences. That makes these debates interesting and exciting.
More debates are forthcoming. The next one that I know of is on 14 December 2010 at the Amsterdam Medical Center in the Netherlands, where the lecturer is Kay Dickersin, Director of the US Cochrane Center at Johns Hopkins. She has published extensively about selective publications that may wreck meta-analyses of RCTs. Rumour has it that there are budding plans to bring up the topic at the 3rd North American Congress of Epidemiology in Montreal in 2011, as well.
If you would like to comment, email me directly at or submit your comment via the journal, which requires a password-protected login.
[1] These EPIDEMIOLOGY Commentaries are freely available at
[2] Workshop: Enhancement of the Scientific Process and Transparency of Observational Epidemiology Studies, 24–25 September 2009, London. Workshop Report No. 18, Brussels, November 2009, European Centre for Ecotoxicology and Toxicology of Chemicals. Available at:
[3] The editors. Should protocols for observational studies be registered? Lancet. 2010;375:348.
[4] Loder E, Groves T, MacAuley D. Registration of observational studies: The next step towards research transparency. BMJ. 2010;340:375–376.
[5] DLW Consulting Services:
[6] Parascandola M. Epistemic risk: empirical science and the fear of being wrong. Law, Probability and Risk. 2010;9:201–214. doi:10.1093/lpr/mgq005
© Jan P Vandenbroucke, 2010