The Editors of Epidemiology
The goal of this blog is to help EPIDEMIOLOGY authors produce papers that clearly and effectively communicate their science.
Monday, June 6, 2016
My inaugural post to this blog discusses abbreviations and how we treat them at EPIDEMIOLOGY: mostly, I’m afraid, we avoid them, as you’ll know if you have worked with me. But today, I am happy to explain why. Epidemiologists, we are in this together.
In my role as Deputy Editor, also known as Science Wordsmith-in-Chief, I spend more time considering and (usually) spelling out abbreviations than on any other class of edits. That’s because, in addition to scientific accuracy, a top goal is to deliver papers that are clearly written and as effortless for our target audience to read as possible.
And as someone with an epidemiology PhD whose training may have gotten a little rusty, I may be a useful test case. I’m sure, for some of you, reading a methods paper is like falling off a log. You do this stuff all the time. You can glance briefly at a formula consisting of stacks of Greek letters meaningfully embellished with bold and italics, and the concept behind a method for correcting for selection bias crystallizes in your mind in three dimensions. Similarly, a new regression model with a 10-syllable name attached to a 10-letter abbreviation sticks firmly in your mind. I know, because I trained with many of you and now I read and am impressed by your papers…which I have to read slowly. I envy you a bit, but never mind: mainly, I want to learn what you have to offer.
But because I don’t get to spend most of my days immersed in methods and biostatistics, it’s helpful to have an unfamiliar abbreviation spelled out each time it’s used. Our readers and I sometimes have to work to decipher and internalize the concept behind the method. Our work is easier when we can avoid thinking ‘Wait, what does that stand for?’ and having to scroll up, find, and re-read the definition…and usually lose the train of thought.
Overall, spelling out abbreviations helps forward our goal of publishing epidemiology papers that read like English, not like jargon. Therefore, please think of your wider community of colleagues and spell it out—our rule of thumb is whether the paper would be understandable to someone outside your subspecialty. If you don’t, I will; rather than use search-and-replace, I do it individually each time, looking for ways to avoid the wordiness and awkward phrasing that sometimes arise. It does take time, though, and I suspect you can do it more smoothly and accurately than I can, if you do it as you write.
We understand there are other reasons you might want to use abbreviations. For example:
* To popularize a new method. We sympathize. But if the name of a method is really unwieldy when spelled out, an acronym will naturally evolve, and there may be workarounds (see below). Meanwhile, as above, giving broadly trained epidemiologists conceptual access to the method, sparing them the repeated scrolling up to a definition, can itself help popularize it.
* It’s the shorthand you use within your research team.
* To meet the word limit. Sorry, but you’re busted, and my colleagues who write a lot assure me there is always a way to shorten a paper that does not compromise clarity.
* To avoid typing. Really? OK, never mind, I can’t believe you would do this.
Meanwhile, there are additional reasons to spell out:
* To avoid ambiguity. As an example, MSM abbreviates “men who have sex with men” to one community of epidemiologists and “marginal structural modeling” to a second community. For a reader who is not an enshrined member of either community, the abbreviation is ambiguous without context to help.
* To make sentences flow better. Many abbreviations are more awkward to read and pronounce than their spelled-out forms.
* To avoid bureaucracy-speak, which is not a recognized dialect of English. Those who work for large government agencies should be particularly able to relate to this.
So, when will we allow an abbreviation?
* When it is likely to be familiar and unambiguous to most epidemiologists - I understand this is a judgment call, and in some cases my thinking has evolved.
* When it is impossibly unwieldy to read when spelled out.
* When it is used as a variable name in an adjacent equation (in which case it will also be italicized).
* In tables and figures, to help save space, but it must also be defined in a legend or caption.
* For study names and similar proper nouns.
If spelling out is moderately wordy or unwieldy, I will look for a workaround, such as a partial spelling out, a pronoun, or a phrase like ‘hereafter referred to as…’. And finally, I often don’t make these decisions unilaterally; I check with the other editors.
Sunday, November 27, 2011
The recent publication in EPIDEMIOLOGY of a graph about semen quality over time – data that were somehow buried in a governmental report in Denmark – again raises the much-debated point of public access to data [2, 3, 4].
The mere fact of questioning a policy of public access to data seems like being ‘against motherhood and world peace’. Isn’t it true that “Science is about debates on findings,” “Science serves people, and people (taxpayers) paid for it,” and “Expensive research data should become available to others”? Yet the issues are more complex than the simple idea that ultimately we will all benefit from open access to data.
Firstly, what is meant by ‘data’? The original unprocessed MRI scans, blood, tissue, questionnaires? Or the processed data – determinations on blood, coded questionnaires? The cleaned data – with the possibility that the authors already have ‘massaged’ inconveniences? The analysis files – in which the authors have extensively repartitioned and recoded the data (another round of subjective choices)? Data should be without personal identifiers – of course – but in our digital age people can be identified by combinations of seemingly innocent bits of information. And, finally, should all discarded analyses, or discarded data, also become publicly available – to check what the authors ‘threw away’ and whether their action was ‘legitimate’?
Secondly, to what extent is the public as the taxpayer, or any organization that pays for the research, really the full owner of the data? Data exist because of ideas about how to collect and organize them. There is intellectual content, not just by the researchers, but also by their research surroundings, their departments, universities, and governmental organizations that make research intellectually possible. Data in themselves are not science. Giving your data to someone else is not an act of scientific communication. Science exists in reducing data according to a vision - some of which may develop during data analysis. Should researchers not have a grace period for the data they collected, or perhaps two: first a period in which they are the sole analysts, and then a period in which they share data only on conditions?
Thirdly, how protective can a researcher remain about her data? Should a researcher have the right to deny access to her data to particular other parties? Richard Smith, the former editor of the BMJ, stated in his blog that denying access is the wrong strategy – why fear open debate, when it can only lead to better analyses? In his opinion, one should not deny data access even to the tobacco industry.
Reality is different: researchers know that when a party with huge financial interests wants access to data, there are three scenarios.
Scenario 1: they search and find some error somewhere in the data. This is always possible – no data are error-proof. The financially interested party will start a huge spin-doctoring campaign, proclaiming loudly in the media that the data are terrible. Remember the discussions on the climate reports?
Scenario 2: another analyst is hired by the interested party, and comes to the opposite conclusion. This is published with a lot of brouhaha. The original researcher writes a polite letter to the editor, explaining why the reanalysis was wrong. The hired analyst retorts by stating that it is the original analysis which was in error. Soon, only the handful of people who really know the data can still follow the argument. That is the signal for a new wave of spin-doctoring, in which medical doctors give industry-paid lectures stating that “even the experts do not know any more; we poor consumers should use common sense; most likely, nothing is the matter”. I witnessed this scenario in a controversy on adverse effects of oral contraceptives. A class action suit was deemed unacceptable by a UK court because, in a meta-analysis in which two competing analyses of the same data were entered (!!), the relative risk was 1.7. This number fell short of the magical 2.0, which is wrongly held by many courts as proof that there is ‘more than 50% chance’ that the product caused the adverse effect. Without studies and reanalyses directly sponsored by the industry, the overall relative risk was well over 2.0. This was money well spent by the companies!
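The courts’ magical 2.0 can be made concrete with the standard attributable-fraction arithmetic: under the usual (and contestable) assumptions, the probability that the exposure caused an exposed case is (RR − 1)/RR, which crosses 50% exactly at RR = 2. A minimal sketch, using only the relative risks quoted above (the function name is mine, for illustration, not from any court or paper):

```python
def probability_of_causation(rr):
    """Attributable fraction among the exposed, (RR - 1) / RR,
    which courts often read as the chance that the exposure
    caused an individual case."""
    return (rr - 1.0) / rr

# RR = 2.0 sits exactly on the 50% threshold courts rely on
print(probability_of_causation(2.0))              # 0.5
# The diluted pooled RR of 1.7 falls below the threshold
print(round(probability_of_causation(1.7), 2))    # 0.41
```

This is why nudging a pooled estimate from above 2.0 down to 1.7 was, legally speaking, money well spent.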
Scenarios 1 and 2 have a name: “Doubt is our product,” as originally coined by the tobacco industry. It is not necessary to prove that the research incriminating your product is wrong – nor that the company is right – it suffices to sow doubt.
Scenario 3 is that the financially interested party subpoenas the researcher to testify in court about every allegedly questionable aspect of the data. Detail upon detail is demanded. The researchers lose months (if not years) of research, and of their personal lives. That scenario was played out against epidemiologists who did not find particular adverse effects of silicone breast implants. It is now feared again as the next strategy of the tobacco industry in the UK.
Advocates of making data publicly available seem to live in an ideal dream world, in which for every Professor A whose PhD students always publish A, there exists a Professor B whose PhD students publish B. Such schools of thought combat each other scientifically with more or less equal weapons. Other scientists watch this contest and make up their minds as to who has the strongest arguments and data. This type of ‘normal science’ disappears when strong financial incentives exist. Then the weapons are no longer scientific publications, but public relations agents and lawyers. Of course, in ‘normal science’ too, there are rivalries, and they can be strong. It happens that researchers do not want to share their complete data, or will share only part of the data, under conditions. Often this is for the very simple reason that some sources of data, like blood samples, are finite.
Calls for making data publicly available need to take these scenarios into account. Some people hope that open information will, in the long run, yield the ‘real’ truth. But on a shorter timescale, open information may also allow mischief by special interests with plentiful resources that are ruthless in their attempts to shape public policy. It seems difficult to ‘experiment’, i.e., to try open access to data for some time and then reverse it when the drawbacks prove too great.
An intermediate solution might be much easier to implement. Tim Lash and I, following the ideas of others, have proposed public registries of existing data. Such a registry would make it possible to start negotiating with the owners of the data about possible re-use, and might also facilitate uses of data that were not originally planned. If controversy and distrust complicate the picture, trusted third parties can be sought to organize a reanalysis, with public input possible – a strategy recently proposed by a medical device maker.
In short, public access to data is much more complex than the proclamation of some principles that look so wonderfully scientific that nobody can argue against them.
Commentaries on this topic are most welcome. They can be published as a full guest blog of about 450 words maximum. Please email Epidemiologyblog@gmail.com.
Note: an earlier version of this blog was published as an opinion piece in the Dutch-language newspaper NRC-Handelsblad in the Netherlands on 12 October 2011.
© Jan P Vandenbroucke, 2011
Wednesday, September 21, 2011
Scientists often portray themselves as the noble but hapless victims of sensationalism and exaggeration in the popular media. But are scientists in fact sometimes complicit in these abuses, hyping their work in media interviews, making claims that would not survive peer review in the published articles? If so, this constitutes an important ethical violation that deserves further scrutiny, since communication with the public is at least as socially consequential as communication between scientists. Public opinion plays a long-term role in funding levels for competing research programs, for example, which makes exaggeration in news stories a serious abuse of the power granted to the scientist by a credulous and trusting media and public.
Here’s one example I came across recently which may fit this description. Nature Genetics published a meta-analysis by Dara Torgerson and colleagues in their September issue. The authors pooled North American genome-wide association studies of asthma comprising over five thousand cases, including individuals of European, African and Latino ancestry. They reported a number of susceptibility loci, most of which showed similar associations across ethnic populations and had been previously described. But one variant was novel, and the association was described as being specific to individuals of African descent. Table 2 of the paper reported a SNP near the gene PYHIN1 on chromosome 1 with an odds ratio (OR) among African Americans and Afro-Caribbeans of 1.34 (95% CI: 1.19-1.49). In a replication data set, this association remained substantial (OR=1.23), although at a slightly different locus. For European Americans, the corresponding association for this SNP was reported as “NA”, which a footnote defined as “not available (the SNP was not polymorphic).” As noted by the authors, this finding is potentially interesting and important because of the substantial racial/ethnic disparity in asthma prevalence in the US (7.7% in European Americans versus 12.5% in African Americans).
Although the main text of the paper reports only the odds ratios and their confidence intervals, Table 1 on page 18 of the electronic supplement details the allele frequencies by group. Surprisingly, it is the minor allele, which was not observed in European Americans, that is associated with lower risk. The major allele had reported prevalences of 77.0% and 71.9% in African-origin cases and controls, respectively. There is no association in European Americans because 100% have the major allele. If this SNP is taken to be causal, therefore, the pattern for this variant would be the opposite of the observed disease phenotype prevalences, with 100% of European Americans having the high-risk variant. Under the more likely interpretation that the SNP is a marker in linkage disequilibrium with a causal variant in the gene PYHIN1, however, the data have nothing at all to say about PYHIN1 and asthma in European Americans. The authors would have a basis to consider the unknown variation in PYHIN1 as explaining some cases of asthma within the African-origin population, but no claim to this being relevant in any way to racial/ethnic disparities. European Americans might have more or less of the high-risk version of this gene; the data are completely silent on this issue.
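As a back-of-the-envelope check on the direction of this association, one can turn the supplement’s major-allele frequencies (77.0% in African-origin cases, 71.9% in controls) into a crude allele-based odds ratio. This ignores the covariate adjustment behind the published 1.34, so treat it only as a sanity check on sign and rough magnitude (the function is mine, for illustration):

```python
def crude_allele_odds_ratio(p_case, p_control):
    """Crude odds ratio for carrying a given allele, from its
    frequency in cases versus its frequency in controls."""
    return (p_case / (1.0 - p_case)) / (p_control / (1.0 - p_control))

# Major-allele frequencies reported in the electronic supplement
or_major = crude_allele_odds_ratio(0.770, 0.719)
print(round(or_major, 2))   # 1.31: the major allele is the risk-raising one,
                            # so the minor allele (absent in European Americans)
                            # is the protective one
```

The crude value of roughly 1.3 is close to the paper’s adjusted 1.34 and, more importantly, confirms the point in the text: the allele that Europeans lack is the low-risk one.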
It came as a surprise, therefore, to see the news reporting on this publication. For example, the Reuters story published on July 31st began "U.S. researchers have discovered a genetic mutation unique to African Americans that could help explain why blacks are so susceptible to asthma."  The story seemed to portray the SNP as the causal variant itself:
"But because the study was so large and ethnically diverse...it enabled the researchers to find this new gene variant that exists only in African Americans and African Caribbeans. This new variant, located in a gene called PYHIN1, is part of a family of genes linked with the body's response to viral infections, Ober said. "We were very excited when we realized it doesn't exist in Europe," she said."
How can one make sense of this text in relation to the published paper? If the reported SNP is by some great stroke of luck the causal variant itself, then it cannot explain the observed racial/ethnic disparity since it would lower risk in some blacks in relation to whites. If, on the other hand, the SNP is merely a marker for a causal variant somewhere nearby, presumably in PYHIN1, then it is nonsense to say of this unknown variant that it “doesn’t exist in Europe.” The data reveal nothing at all about the distribution of this variant in European Americans since no marker for this gene was found in that population. Either way, therefore, the news story did not seem to reflect the data that were reported in the article.
Thinking that this was an example of the press being irresponsibly sensationalistic, and misrepresenting the peer-reviewed article, I sent a letter on August 2nd to the Reuters science reporter and editor, signed by myself and about a dozen colleagues. We also sent a copy of the letter to the corresponding author of the article, the University of Chicago statistical geneticist Dan Nicolae.
The Reuters editor sent a detailed response without delay. She reviewed the statistical significance of the association measure and the proposed biological mechanism for how the PYHIN1 gene might affect asthma risk, and noted that the science reporter’s text was supported by interviews with two of the researchers as well as from a contact at the National Institutes of Health. To document this, she attached an e-mail from two of the authors, Dan Nicolae and Carole Ober, in which they affirmed their approval of the coverage their work had received. “First let us say that we think that the article is very well written and we have no major issues with it. We do not understand the issues raised in Dr. Kaufman's letter,” they wrote. They went on to note that perhaps the Reuters title might “slightly overstate the conclusion of our study”, but that it was a “subtle distinction” at best. “We thank you for helping us promote our science,” they concluded.
I then wrote to Dan Nicolae directly, asking him how the Reuters text could be construed to be consistent with the information in the paper. “I understand that race is a sensitive issue, subject to many debates,” he responded. “My research is on understanding molecular mechanisms of complex diseases, with the hope that this will lead to better treatments. It has nothing to do with this debate. On the Reuters news item, let me state that there are several scenarios where our data would fit with that headline. I will not discuss these scenarios here because I am convinced they will produce other discussion, and I prefer to use my time on my research projects.”
Apparently, Dr. Nicolae was comfortable that the Reuters reporting did not reflect the content of the paper because he believed that there were theories, not explored in the published article, which could make the news story valid. On the basis of his reply, I came to believe that this incident was not the result of a science reporter misunderstanding the published paper. Rather, it seemed to be the case of the scientist providing a speculative interpretation that was not vetted by the reviewers or the editors of the journal. Dr. Nicolae offered that my confusion may have arisen from ignorance, and recommended that I read up on tag SNPs, differences in linkage disequilibrium patterns between Europeans and Africans, and association signals produced by interactions. “These will lead you to these scenarios I am referring to,” he concluded.
While it is possible for a risk factor to operate in different directions across two populations, this entirely sidesteps my concern, which is that the reporting strayed from what could be said based on the content of the published article. There could be no evidence of effect measure modification presented for this variant, since there was no exposure variation in the European Americans, and therefore no association measure could be estimated in that group. Dr. Nicolae did not appear to disagree with me on this point, but seemed to view the media interview as an opportunity for presenting his research program as relevant to racial disparities in a way that could not be directly derived from the published data. This is surely a fine line, because journalists often want scientists to give their expert opinions on the broader interpretation of the published work. But how far should authors go in describing what they might speculate to be true, rather than what they actually found? The impetus for the news story was the publication of an article in a respected scientific journal. Are there really no constraints on how far authors can extend their interpretation while claiming to be referring to the article? Should they clearly indicate that they are speculating - and should they also present at the same time the potential contrary or skeptical view? With so much attention and funding riding on efforts to understand and reduce minority excess burden of disease, the authors’ speculation risks the appearance of being self-serving. If scientists sometimes disparage science reporters as the source of popular misinformation, the fair reply might therefore be “Cura te ipsum!”
If you would like to comment, email me – or in this case Dr Kaufman – directly at firstname.lastname@example.org, or submit your comment via the journal, which requires a password-protected login. Unfortunately, published comments are limited to 1000 characters.
Torgerson DG, et al. Meta-analysis of genome-wide association studies of asthma in ethnically diverse North American populations. Nat Genet. 2011 Jul 31;43(9):887-92. doi: 10.1038/ng.888.
Tuesday, June 28, 2011
Judging from the recent spate of articles about false-positive results in epidemiology, initiated by Ioannidis’s thought-provoking piece in our July issue, something seems wrong with our field. Still, the mainstay of epidemiologic reasoning is making comparisons. Hence the question: does our field have more problems than other fields of medical science?
Let’s start with randomized trials. And, just as critics of epidemiology focus on specific fields (say, nutrition and lifestyle), let’s also focus on a specific example: antidepressants. John Ioannidis produced another thought-provoking paper, entitled “Effectiveness of antidepressants: an evidence myth constructed from a thousand randomized trials?” Right, that seems to settle the score with epidemiology.
Would this problem exist for antidepressant randomized trials only? The people who first signalled the selectiveness of the evidence construction on antidepressants were employees of the Swedish drug licensing authority. They stated that in their experience the problem is not confined to this class of drugs. Just read the extremely funny paper about a new class of antipsychotics, which showed that in head-to-head randomized trials Drug A was better than B, B better than C, and in turn C better than A – all for the same indications! The outcome depended simply on the sponsor. All in all, would we be mistaken in concluding that the field of randomized trials has as many credibility problems as epidemiology?
Let’s continue with genome-wide association studies (GWAS). A paper in The Economist called the whole genomic scene into question with the title “The looming crisis in human genetics” and the subheading “They have simply not been delivering the goods.” The crucial sentence: “Even when they do replicate, they never explain more than a tiny fraction of any interesting trait.”
Of course, we epidemiologists knew this all along. Let’s focus on cancer. In a 1988 (!) Scandinavian study of adopted children and their biological and adoptive parents, the relative risk of developing any malignancy among children whose biological parent had developed a malignancy before age 50, but who had been raised in adoptive families, was 1.2 – while the relative risk was about 5-fold if the adoptive parent had developed cancer. A later editorial made clear that only rather weak concordance of cancer was found in twin studies. Mere logic should have alerted us that it is impossible to find strong explanations of complex diseases in single somatic genes: these diseases come into being because multiple pathways go wrong, and each pathway can go wrong in multiple ways. Of course, there are rare families in which cancer is heritable – these have been detected, and that was it.
We pass over the fact that almost none of those great GWAS discoveries has yet resulted in anything meaningful clinically or for public health – in great contrast to the record of epidemiology. And, as The Economist wrote, the yield of classic genetics has been much higher than that of GWAS, and clinically much more important. My only (but rather successful) brush with genetics concerned a mutation that was quite prevalent and carried a high relative risk, and that was discovered by first elucidating the biochemical abnormality and then reasoning backwards to the gene. GWAS would never have found it. If Ioannidis calls antidepressant randomized trials a well-constructed myth, should we not call GWAS the same?
For almost any field of medical science, we can easily point out that its published record must contain a massive amount of irrelevance and error. This is nothing to worry about. It is normal science – it has always been like that and it will continue to be so. Again, John Ioannidis comes to the rescue and proves the point. A personal parenthesis first: for years, during lectures, I told audiences that if you want to understand how science evolves, you should go to the library and look at The Lancet, the BMJ, the NEJM, or JAMA of 50 years ago – or better, 100 years ago: most of the papers you can no longer understand, and the rest are either irrelevant or plainly wrong. I only did the thought experiment and never published it, but John Ioannidis and his collaborators gathered real data and confirmed my prejudices. In a paper entitled “Fifty-year fate and impact of general medical journals,” they write: “Only 226 of the 5,223 papers published in 1959 were cited at least once in 2009 and only 13 of them received at least 5 citations in 2009.” Those were mostly clinical papers, describing syndromes.
Perhaps, with the latter publication, Ioannidis is biting his own tail. I am thinking of his 2005 “Why most published research findings are false.” In that paper he seemed to single out observational epidemiology as the main culprit. Judging from his later judgment about myth-making by ‘a thousand RCTs’ and his recent judgment about the transiency of all science, the ultimate question becomes whether all scientific processes – not just epidemiology – can be improved so as to be less wasteful and to yield truth more often.
Having just returned from the 3rd North American Congress of Epidemiology in Montreal, I started wondering whether it is epidemiologists who are most acutely aware of the tentativeness of any scientific finding. At that congress you could go from one session to the other hearing about problems of data, analysis and inference and listen to plenary lectures about wrong turns in our science. Would the same happen at, say, a congress of cardiologists? Or geneticists?
In 1906 Sir William Osler delivered a Harveian oration on “The growth of truth,” in which he wrote about the vagaries of truth and the many detours and false alleys of scientific research: “Truth may suffer all the hazards incident to generation and gestation…”. His views were echoed almost 100 years later in a Millennium essay by Stephen Jay Gould, who described how science progresses “in a fitful and meandering way.” Perhaps the wastefulness of science is inevitable and might be compared to the zillions of meaningless mutations that happen in biological systems – very few of which carry any survival advantage.
One day, when discussing a paper with our PhD students, one asked in exasperation: “How can you ever be certain that a paper is true?” My spontaneous answer was: “Grow 25 years older – and even then…”.
If you would like to comment, email me directly at email@example.com or submit your comment via the journal, which requires a password-protected login. Unfortunately, comments are limited to 1000 characters.
Ioannidis JP, Tarone R, McLaughlin JK. The False-positive to False-negative Ratio in Epidemiologic Studies. Epidemiology. 2011;22:450-6.
Ioannidis JP. Effectiveness of antidepressants: an evidence myth constructed from a thousand randomized trials? Philos Ethics Humanit Med. 2008;3:14.
Melander H, Ahlqvist-Rastad J, Meijer G, Beermann B. Evidence b(i)ased medicine--selective reporting from studies sponsored by pharmaceutical industry: review of studies in new drug applications. BMJ. 2003;326:1171-3.
Heres S, Davis J, Maino K, et al. Why olanzapine beats risperidone, risperidone beats quetiapine, and quetiapine beats olanzapine: an exploratory analysis of head-to-head comparison studies of second-generation antipsychotics. Am J Psychiatry. 2006;163:185-94.
Miller G. The looming crisis in human genetics. In: The World in 2010. The Economist, November 13, 2009:150-51.
Sørensen TI, Nielsen GG, Andersen PK, Teasdale TW. Genetic and environmental influences on premature death in adult adoptees. N Engl J Med. 1988;318(12):727-32.
Hoover RN. Cancer--nature, nurture, or both. N Engl J Med. 2000;343:135-6.
Vandenbroucke JP, Rosendaal FR, Bertina RM. Factor V Leiden, oral contraceptives and deep vein thrombosis. In: Khoury JM, Little J, Burke W. Human Genome Epidemiology. New York: Oxford University Press; 2004:322-332.
Ioannidis JP, Belbasis L, Evangelou E. Fifty-year fate and impact of general medical journals. PLoS One. 2010;5(9):e12531.
Ioannidis JP. Why most published research findings are false. PLoS Med. 2005;2(8):e124.
Osler W. Harveian oration. The growth of truth, as illustrated in the discovery of the circulation of the blood. BMJ 1906;ii:1077-1084.
Gould SJ. Pathways of discovery: deconstructing the “science wars” by reconstructing an old mold. Science 2000;287:253-261.
© Jan P Vandenbroucke, 2011
Sunday, April 10, 2011
A health care initiative calling for comparative effectiveness research (CER), with US$1.1 billion in initial funding, was one of the most noted early actions of the newly elected Barack Obama in 2009. The measure has now come into law, and epidemiologists and methodologists are jumping on the bandwagon – eager to contribute to a new era in health care in which decisions on the worth of treatments are based rationally on numerical evaluations – and perhaps also with an eye on research funding. The series of papers in the May 2011 issue of EPIDEMIOLOGY attempts to jump-start a discussion about CER. The new ideals are a reincarnation of France’s 1830s movement of “Médecine d’Observation” – but all the more worth striving for enthusiastically in the early 21st century.
‘Haven’t we all always been CER researchers?’ – that is the gist of Miguel Hernán’s commentary. Yes, we have – but up to now epidemiologists have mostly covered the easy part: the adverse effects of medical treatments. In adverse-effects research, confounding by indication is mostly absent because adverse effects are usually different diseases (with different risk factors) from the one being treated – and quite often unpredictable. Confounding by contra-indication, if present, can often be described in a few prescribing rules that may lead to successive restrictions during data analysis. Thus, in adverse-effects research, restrictions and a careful choice of comparators and (where necessary) “new users” lead to quite credible “expected exchangeability” of patient groups. Such research has the added advantage of being more generalizable than randomized trials, which are limited to selected populations.
Classic papers by methodologists as diverse as Rubin and Miettinen have outspoken messages: "confounding by indication" in medical research on the intended effects of treatments is tractable only by randomization. The whole Evidence-Based Medicine movement, as well as the Cochrane Collaboration, is built on this very idea; both tried to revolutionize medicine at the end of the last century. If randomization is the only solution to confounding by indication, then the prospects of CER are severely crippled: CER would be limited to adverse-effect pharmacoepidemiology, which is indeed what we have always done.
However, the main aim of CER, as explicitly announced by Obama himself, is to compare the effectiveness of drugs in daily practice. So it is no surprise that, in an earnest effort to join forces to change health care (and to bring the US closer to what is happening in Europe, e.g., in NICE [2, 11]), people from all sides are enthusiastically trying to nibble away at these classic notions. Admittedly, when confounders are few and easily measured precisely (as in the example of sequential CD4 counts and HIV treatment), the classic papers have been proven wrong. However, in other instances, when judgments about the prognosis of patients are complex and may include hard-to-quantify characteristics like "degree of oedema" or "impression of frailty," it has been shown repeatedly that confounding by indication remains "a most stubborn bias" [14, 15].
Should we give up in advance, or should we see how far we can get in attempting what was judged impossible: to evaluate the beneficial effects of treatments by non-randomized studies? I have strong sympathies with people who make the attempt. Epidemiology is an evolving discipline that makes progress. Think about our insights into confounding, and about case-control studies, which were revolutionized in the late 1970s and early 1980s, and then again over the last decade. Still, it is likely that in most instances mere statistical adjustment for confounding will not suffice to replace randomization. We should explore techniques that promise to address unmeasured confounding by indication, such as instrumental variables or severe restrictions, which can help in particular circumstances that remain to be defined. However, severe restrictions may wreck another ideal of CER: to show what works in daily practice for a wide array of patients. So we should explore how far we can push observational epidemiology, and we should seek to develop new methods, but we should keep an open mind about the possibility of failure. Whatever one's hopes or enthusiasms, the classic papers may still be right. Clinical trialists have already predicted that CER will lead to a lowering of standards of evidence because of "data mining." If, on the other hand, CER succeeds, Obama's presidential legacy will include a change in epidemiologic theory.
If you would like to comment, email me directly at firstname.lastname@example.org or submit your comment via the journal, which requires a password-protected login. Unfortunately, comments are limited to 1000 characters.
 Vandenbroucke JP. Evidence-based medicine and "médecine d'observation". J Clin Epidemiol 1996;49:1335-8.
 Hernán MA. With great data comes great responsibility: publishing comparative effectiveness research in Epidemiology. Epidemiology 2011;22:290-291.
 Feenstra H, Grobbee RE, in't Veld BA, Stricker BH. Confounding by contraindication in a nationwide cohort study of risk for death in patients taking ibopamine. Ann Intern Med 2001;134:569-572.
 Schneeweiss S, Patrick AR, Stürmer T, Brookhart MA, Avorn J, Maclure M, Rothman KJ, Glynn RJ. Increasing levels of restriction in pharmacoepidemiologic database studies of elderly and comparison with randomized trial results. Med Care 2007;45(10 Suppl 2):S131-S142.
 Ray WA. Evaluating medication effects outside of clinical trials: new-user designs. Am J Epidemiol 2003;158:915-20.
 Vandenbroucke JP, Psaty BM. Benefits and risks of drug treatments: how to combine the best evidence on benefits with the best data about adverse effects. JAMA 2008;300:2417-9.
 Rubin DB. Bayesian inference for causal effects: the role of randomization. Ann Statistics 1978;6:34-58.
 Miettinen OS. The need for randomization in the study of intended effects. Stat Med 1983;2:267-71.
 Rawlins M. De testimonio: on the evidence for decisions about the use of therapeutic interventions. Lancet 2008;372:2152-61.
 Sterne JA, Hernán MA, Ledergerber B, Tilling K, Weber R, Sendi P, Rickenbach M, Robins JM, Egger M; Swiss HIV Cohort Study. Long-term effectiveness of potent antiretroviral therapy in preventing AIDS and death: a prospective cohort study. Lancet 2005;366:378-384.
 Stürmer T, Jonsson Funk M, Poole C, Brookhart MA. Nonexperimental comparative effectiveness research using linked healthcare databases. Epidemiology 2011;22:298-301.
 Bosco JL, Silliman RA, Thwin SS, Geiger AM, Buist DS, Prout MN, Yood MU, Haque R, Wei F, Lash TL. A most stubborn bias: no adjustment method fully resolves confounding by indication in observational studies. J Clin Epidemiol 2010;63:64-74.
 Stukel TA, Fisher ES, Wennberg DE, Alter DA, Gottlieb DJ, Vermeulen MJ. Analysis of observational studies in the presence of treatment selection bias: effects of invasive cardiac management on AMI survival using propensity score and instrumental variable methods. JAMA 2007;297:278-285.
 Djulbegovic M, Djulbegovic B. Implications of the principle of question propagation for comparative-effectiveness and "data mining" research. JAMA 2011;305:298-9.
© Jan P Vandenbroucke, 2011