When I was approached by the Editor to write this commentary, I had mixed feelings. On the one hand, grappling with the many complex issues that surround the role of “big” and “very big” studies in epidemiology (B/VBE) has increasingly become important given the number of national and international initiatives, both underway and proposed, to conduct such studies. On the other hand, the debate on these issues raises disquieting images reminiscent of both Tom Hanks' movie Big and Jonathan Swift's Gulliver's Travels. As you may recall from the movie Big, the protagonist—Josh, a 13-year-old boy—is transformed by a coin-operated fortune teller named Zoltar into a grown man. Innocent and inexperienced as he is, Josh finds that, although the experience of being grown-up does have some advantages, he longs to return to being a kid. From Swift we recall the perils and rewards of being too big in Lilliput and too small in Brobdingnag and hope that we will not become epidemiologic “Yahoos.” Without the aid of Zoltar, the fortune teller, we have no way of predicting the consequences of epidemiology becoming “big,” but Swift's satirical insight does forewarn us that one size definitely does not fit all and that size may impact our vision.
Over the last few years, I have attended or participated in numerous conferences and symposia on the topic of “big epidemiology,” have discussed the prospects and limitations of such studies with many colleagues, have participated in some of the planning for a “big” prospective national study, and have followed the considerations of groups such as the Secretary's Advisory Committee on Genetics, Health, and Society and its Task Force on Large Population Studies. Although I do not want to assert that I am an expert on the subject, I have developed considerable appreciation for all sides of the story and have become more and more skeptical about the need, feasibility, and desirability of such studies. I am also concerned that there may be potential for serious unintended consequences of very large epidemiologic studies at this time.
I will not try to address all of these issues in this commentary and I admit from the outset that simple answers to complex issues are always somewhat suspect. Of immediate concern is what we mean by “big” or “very big” epidemiology. For the purposes of my comments, I am going to focus on studies with 100,000 plus participants. However, even that does not narrow the field enough, as we can imagine—and there already are—examples of case–control studies, cohort studies, population-based registries, collections of biologic samples on a grand scale as in BIOBANK and deCode, and other “big” designs. Thus, one cannot really assess the promises and problems associated with “big” epidemiology without being very clear about what kinds of studies we are discussing. For much of what follows I mostly focus on “big” and “very big” longitudinal studies, because they present perhaps the biggest challenges.
“Big” epidemiology should not be confused with interdisciplinarity or multidisciplinarity. Paralleling the trends in many other disciplines, many of us have found that problems of great epidemiologic and public health interest require the input and tools of multiple disciplines, regardless of study size.
I think there are some overriding problems that are raised by “big” epidemiology, and I will briefly take them up one at a time, although they are often in reality tied together.
The Motivation for Big/Very Big Epidemiology
Study size always reflects issues of statistical power related to the frequency of exposures and outcomes and effect sizes, and it thus comes as no surprise that many of the calls for B/VBE are driven by power issues. There can be no disputing the fact that the dominant motivation for most of the current proposals for B/VBE studies in the United States (perhaps excluding the National Children's Study) and internationally is a desire to build on the advances associated with the completion of the Human Genome Project and the HapMap Project. A careful study of the design issues commissioned by the National Human Genome Research Institute and other Institutes at the National Institutes of Health concluded that a study of 200,000 to 1,000,000 participants followed over 5 years would be required to detect modest genetic main effects for common diseases and some G × G and G × E interactions. Other estimates, for example, those reported by Paul Burton and Anna Hansell at a recent INSERM meeting, indicate that in the case of the BIOBANK sample of 500,000 Britons, age 40 to 69 years, with analyses of binary disease outcomes, binary genotypes, and a binary environmental exposure, it will take as long as 40 years to accumulate the numbers of cases for diseases such as breast cancer necessary to provide reliable effect estimates of gene–environment interaction under reasonable assumptions. It would appear that there is considerable uncertainty in many of these calculations. This is not an unusual state of affairs for power calculations, but it is one that is disquieting given the resources involved.
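How sensitive such projections are to their inputs can be illustrated with a rough sketch of case accrual in a cohort. It assumes a constant annual incidence rate, no loss to follow-up, and no competing risks, which real cohorts will not satisfy. The function names, the incidence rate of 1 per 1,000 per year, and the target of 10,000 cases are hypothetical placeholders chosen for illustration; they are not taken from the National Human Genome Research Institute report or from the Burton and Hansell estimates.

```python
import math


def expected_cases(n_participants, annual_incidence, years):
    """Expected incident cases over `years` of follow-up.

    Simplifying assumptions (illustrative only): constant annual incidence,
    no loss to follow-up, no competing risks.
    """
    if not 0.0 < annual_incidence < 1.0:
        raise ValueError("annual_incidence must be between 0 and 1")
    # Probability that one participant becomes a case within the period.
    cumulative_risk = 1.0 - (1.0 - annual_incidence) ** years
    return n_participants * cumulative_risk


def years_to_accrue(n_participants, annual_incidence, target_cases):
    """Smallest whole number of years at which expected cases reach the target."""
    if not 0.0 < annual_incidence < 1.0:
        raise ValueError("annual_incidence must be between 0 and 1")
    if not 0 < target_cases < n_participants:
        raise ValueError("target_cases must be positive and below cohort size")
    required_risk = target_cases / n_participants
    # Solve 1 - (1 - p)^t >= required_risk for t, then round up.
    years = math.log(1.0 - required_risk) / math.log(1.0 - annual_incidence)
    return math.ceil(years)


if __name__ == "__main__":
    cohort = 500_000   # cohort size mentioned in the text
    rate = 0.001       # hypothetical: 1 case per 1,000 per year
    target = 10_000    # hypothetical number of cases needed
    print(round(expected_cases(cohort, rate, 5)))
    print(years_to_accrue(cohort, rate, target))
```

Under these made-up inputs, halving the assumed incidence or doubling the number of cases required roughly doubles the follow-up time, which is one reason published estimates of the needed study size and duration can differ so widely.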
Limitations on What Can Be Studied Over Short Periods
Even more important is the fact that even if B/VBE studies conducted over short periods of time accrue enough outcome events, the short periods of follow-up mean that it will be only a brief portion of the disease process that is being studied. In the case of chronic degenerative diseases that take decades to evolve, what happens pathobiologically during these late stages before frank events or diagnosis is only a small part of the disease process and is potentially less important in informing preventive efforts than what happens earlier.
Although not for a moment diminishing the technical advances and intrinsic value of the new “-omics” (genomics, proteomics, phenomics, and so on), I believe an impartial observer would have to conclude that the great promissory note for their impact on population health is currently largely unfulfilled. Furthermore, they may contribute little to our understanding of major public health issues such as the obesity epidemic or the continuing existence of, and in some cases increases in, important socioeconomic and racial/ethnic disparities in health.
Increased Potential for Lowest-Common-Denominator Science
Many of us have participated in the design of studies by committee. Regrettably, the final decisions are often based on the opinions of whoever hangs in through the arduous and often lengthy committee process or of those with the loudest voice or greatest authority; or, if the process goes on long enough, decisions often represent a consensus based on what can be agreed on by most of those at the table. In the case of B/VBE studies with enormous expenditure of resources, consensus is important but may end up reflecting the lowest common denominator of agreement. Thus, the creativity, exploration, and risk-taking that are more possible in smaller studies may be lost.
Trying to Study Too Much May Mean You Learn Too Little
Because of the need to justify B/VBE studies and because of the coalition-building necessary to generate the political and scientific support necessary for their funding, these studies often become extraordinarily complex with measurements proposed in many, many domains. It is fitting that studies that involve the commitment of exceptional resources represent a broad array of interests. Indeed, the National Children's Study web site (http://nationalchildrensstudy.gov/index.cfm) lists over 2400 people (by my count) who have been involved in the planning of that study (the fate of which is now uncertain). Many of these contributors are experts in their fields and, based on their knowledge and experience, they have strong opinions about what should be measured. However, resource and logistic issues invariably mean that not all of these measures can be obtained or that shortcuts need to be taken in their measurement. This means that the composition of the panels that become the arbiters of the final measurement decisions becomes extremely important, and one wonders how to assure the best decision-making in the face of disciplinary and political pressures as well as pressures from funding agencies and special interests. As an aside, from the perspective of a social epidemiologist, I have seen many rich opportunities for the measurement of behavioral, social, psychosocial, and socioeconomic information lost in large national studies either through the eventual elimination of such measures from studies or through their modification in ways that remove considerable information. The bottom line is that smaller and less costly studies often have the opportunity to collect more comprehensive, more precisely measured, and more state-of-the-art information, whether it be social, biologic, or environmental.
Significant Hidden Costs
It comes as no surprise that recruiting, following, and retaining the participants in B/VBE studies is difficult and expensive. These are technical issues that may be solved with resources, commitment, and ingenuity. However, are there other costs? At a time in which the National Institutes of Health budget has decreased in real terms, funding percentiles are approaching single digits at some Institutes of the National Institutes of Health, funded investigators are having grants reduced in dollar amounts, and the average age of the receipt of a first R01 is around 40 years of age, it seems unlikely that funding of B/VBE studies will not worsen the situation. With the growth of the national debt to an unprecedented level, cutbacks in many domestic programs, and the cost of the war in Iraq well over $300 billion (more than 10 times the annual National Institutes of Health budget)—and projected by the Nobel Prize-winning economist Joseph Stiglitz to go into the trillions of dollars—it seems disingenuous to think the billions of dollars projected for some B/VBE studies are likely to be appropriated, or if appropriated, that they will not result in a reduction in the funding of smaller, more focused, investigator-initiated research. These are the financial costs, but there are other potential costs as well related to “putting all the eggs in one basket.” The cornerstone of a successful national research enterprise is the ability to promote a research portfolio with diversity of interests, approaches, opportunities, and opinions. The issue of whether or not B/VBE studies reduce this diversity must be addressed.
I am not a Luddite, against the application of new technology, knowledge, or approaches in our attempts to understand the causes of the distribution of disease in populations and the application of that knowledge to improve health. However, based on the arguments presented here, and others that there is not space to elaborate, I have increasingly come to believe that considerable caution must be exercised in pursuing B/VBE studies. We may decide, like Josh in Big, that being small is just fine.
ABOUT THE AUTHOR
GEORGE A. KAPLAN is the Thomas Francis Collegiate Professor of Public Health and Professor of Epidemiology at the University of Michigan where he is also the Director of the Center for Social Epidemiology and Population Health. Professor Kaplan is a social epidemiologist whose work for the last 25 years has focused on the role of behavioral, psychosocial, and socioeconomic factors in health and health inequalities. Focusing on the role of both upstream and downstream factors, Professor Kaplan's contributions have been widely cited and he has been honored by election to the Institute of Medicine and the National Institute for Social Insurance.
The author appreciates the fruitful conversations with colleagues Mary Haan, John Lynch, Ana Diez Roux, John Frank, Michael Wolfson, Lou Kuller, and Sandro Galea and many of the attendees at the May 2006 INSERM conference on Prospects and Limitations Of Very Large Cohort Studies organized by Carole Dufouil and Annick Alperovitch. Of course, this commentary represents the author's opinions and not theirs.