May 2001 - Volume 12 - Issue 3

Low P-Values or Narrow Confidence Intervals: Which Are More Durable?

Poole, Charles

Author Information

Department of Epidemiology (CB 7400), University of North Carolina School of Public Health, Chapel Hill, NC 27599-7400.

Address correspondence to: Charles Poole, Department of Epidemiology (CB 7400), University of North Carolina School of Public Health, Chapel Hill, NC 27599-7400.

Submitted and accepted January 19, 2001.

What should be the role of P-values and confidence intervals in the interpretation of scientific results? This question is not new 1 and our field of epidemiology is far from alone in struggling with it. 2,3 I have four suggestions for authors and readers. The first is quite broad, so I offer that one before describing current practices. I then turn to the other three. My remarks are confined to settings in which P-values and confidence intervals accompany estimates of effect measures, such as the relative risk.

Briefly, here are my suggestions. One, we should work harder than ever to avoid strict or exact interpretations of P-values and confidence intervals in observational research, where these statistics lack a theoretical basis. Two, we should stop interpreting P-values and confidence intervals as though they measure the probability of hypotheses. Three, when we want to know the probability of hypotheses, we should use Bayesian methods, which are designed expressly for that purpose. Four, we should get serious about precision and look for narrow confidence intervals instead of low P-values to identify results that are least influenced by random error.

Real Life Is Not Randomized

When treatment or exposure is randomized, we have a solid theoretical basis, testable in simulations, for the probability models from which P-values, confidence intervals, and likelihoods are deduced. In observational research, all we can do is hope that the social, behavioral, and physical processes by which people become exposed to risk factors in the unrandomized real world do not differ too greatly from randomization. 4 Unfortunately, each time we find that risk factors are associated with each other in observational studies, we find evidence against that hope. We cannot remind ourselves too often of this fundamental problem. At the very least, it should cause us to avoid hairsplitting interpretations of probabilistic statistics in observational research, where they are intrinsically fuzzy.

Contemporary Uses of P-Values and Confidence Intervals

Significance testing unquestionably dominates epidemiology today. In attempting to refrain from this practice over the past 17 years, 5 I have often been expected, assumed, encouraged, and sometimes even forced to engage in it by editors, reviewers, colleagues, professors, students, funding sources, regulators, attorneys, and journalists. It is not easy to be a non-tester in a testing world.

After Rothman’s highly influential 1978 essay, “A Show of Confidence,” 6 an immense and easily documented shift in reporting style took place. 7 Whereas P-values or “S” (significant) and “NS” (not significant) once were reported exclusively, the reporting of confidence intervals has now become accepted practice, with or without P-value accompaniment. Confidence intervals have a survival advantage for the tiny non-testing minority to which I belong. They enable us to gauge the precision of estimates easily, but without depriving the established majority of its beloved tests.

Epidemiologists who see no purpose to a confidence interval other than its use in significance testing sometimes wonder why this shift in reporting practice has occurred. The P-value provides the information they desire more efficiently and exactly. Some are vaguely aware that confidence intervals supposedly convey information that P-values do not, but are unsure what that extra information is and even less sure how it might be useful. The word “precision” seems to be used with increasing regularity nowadays, and confidence intervals are occasionally described as “wide,” but “wide” and “imprecise” often seem nothing more than code words for “includes the null value” and hence for “not statistically significant.”

Improbable Observations Do Not Imply Improbable Hypotheses

When we estimate a parameter such as the relative risk, each possible value of that parameter is the expected value under some hypothesis, and each hypothesis has a P-value. 8,9 What we call “the” P-value is the P-value for the null hypothesis. Approximately, each P-value is the probability of obtaining an estimate at least as far from a specified value as the estimate we have obtained, if that specified value were the true value. It follows that no P-value, for the null hypothesis or any other, is the probability that the specified hypothesis is true. As an obvious example, the hypothesis corresponding to the point estimate has a (two-sided) P-value of 1.0. However, we do not treat our point estimates as absolutely certain to be true. Neither is the point estimate, in general, the most probable value.

For a given estimate, the 95% confidence interval is the set of all parameter values for which P ≥ 0.05. For the value at each limit of a 95% confidence interval, P = 0.05 (two-sided). Thus, if either of the 95% confidence limits for a relative risk estimate equals 1.0 (the null value of this parameter), we can infer that the null P-value is 0.05. From this link between confidence intervals and P-values, it follows that a 95% confidence interval is not a range of values within which the unknown true value lies with 95% probability.
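The link between confidence limits and P-values can be checked numerically. The following is a minimal sketch, assuming a Wald (normal) approximation on the log relative-risk scale; the estimate (RR = 2.0) and its standard error (0.25) are hypothetical values chosen only for illustration, and the function names are not from the article.

```python
import math

def two_sided_p(rr_hat, se_log_rr, rr_hypothesized):
    """Approximate two-sided P-value for a hypothesized relative risk.

    Assumes ln(RR estimate) is roughly normal with standard error se_log_rr.
    """
    z = (math.log(rr_hat) - math.log(rr_hypothesized)) / se_log_rr
    # Two-sided standard normal tail area: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    return math.erfc(abs(z) / math.sqrt(2))

def wald_limits(rr_hat, se_log_rr, z_crit=1.959964):
    """95% Wald confidence limits for a relative risk (z_crit is the 97.5th normal percentile)."""
    half_width = z_crit * se_log_rr
    return rr_hat * math.exp(-half_width), rr_hat * math.exp(half_width)

# Hypothetical estimate and standard error of ln(RR); not taken from the article.
rr_hat, se = 2.0, 0.25
lower, upper = wald_limits(rr_hat, se)

# Each hypothesized value has its own P-value: the null, both limits, and the point estimate.
for rr in (1.0, lower, rr_hat, upper):
    print(f"hypothesized RR = {rr:.2f}: P = {two_sided_p(rr_hat, se, rr):.3f}")
```

Under these assumptions the P-value at each 95% limit is 0.05 and the P-value at the point estimate is 1.0, which is the relationship the paragraph describes. None of these numbers is the probability that the corresponding hypothesis is true.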

The well-known “coverage probability” of confidence intervals pertains to a parameter value that is known to be true and the probability that an as yet unknown confidence interval will contain it. Coverage probability does not pertain to a known confidence interval and an unknown true value. To interpret a given 95% confidence interval as having a 95% probability of including the unknown true value is to mistake a frequentist confidence interval for a Bayesian probability interval. 10 This error is merely an extension of the logical fallacy of mistaking the null P-value for the probability that the null hypothesis is true.

Why do we turn probability logic on its head in this way? We very much want to know the probabilities of hypotheses, which require Bayesian methods to determine, but our biostatistical teachers give us the P-values and confidence intervals of frequentist statistics. We are thus led into a basic fallacy, by which the probability of A given B is mistaken for the probability of B given A. 10 A P-value of 0.04 tells us that, if the null hypothesis were true, an association at least as strong as the one we observed would occur with a probability of 4%. We find it quite natural to reverse the terms, and conclude mistakenly that the probability of the null hypothesis is 4%, given the association we observed.

The null hypothesis or any other hypothesis can be highly probable even though its P-value is less than 0.05. The null hypothesis or any other hypothesis can have a low probability even though its P-value is greater than 0.05. A relative risk can be highly probable even though it lies outside a 95% confidence interval. A relative risk can be highly improbable even though it lies inside a 95% confidence interval. The indispensable role of hypotheses in the computation of P-values and confidence intervals, with each hypothesis assigning a probability to each estimate we might possibly obtain, means that these measures are not the descriptive statistics they are sometimes said to be. 11 P-values and confidence intervals are inferential statistics, but the flow of the inference is a deductive flow, in which hypotheses confer probability “down” to estimates. 12,13 Inductive statistical inference, in which the direction of the probability flow is from estimates back “up” to hypotheses, properly takes place only when prior probabilities are updated with new data, by means of Bayes’s theorem, to form posterior probabilities. 10,13

The only way we can determine the probability of the null hypothesis, or a range of values within which the true value lies with a given level of probability, is by using Bayesian methods. 10,13–15 Bayesian methods cannot be employed without the specification of prior probabilities for the hypothetical values of interest (eg, all possible values of relative risk, from zero to infinity). Since we do not specify prior probability distributions when we compute conventional (frequentist) confidence intervals, those intervals have no generally valid interpretation as Bayesian probability intervals.
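To make the dependence on a prior concrete, here is a rough sketch of one simple Bayesian calculation: a normal prior on ln(RR) combined with a normal approximation to the likelihood. This is only one of many possible Bayesian analyses, not a method given in the article. The prior (mean 0, standard deviation 0.5 on the log scale) and the data (RR = 2.0, standard error 0.25) are hypothetical, and the function names are illustrative.

```python
import math

def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2))

def posterior_log_rr(prior_mean, prior_sd, est_log_rr, se_log_rr):
    """Precision-weighted (conjugate normal) update on the ln(RR) scale."""
    w_prior = 1.0 / prior_sd ** 2
    w_data = 1.0 / se_log_rr ** 2
    post_var = 1.0 / (w_prior + w_data)
    post_mean = post_var * (w_prior * prior_mean + w_data * est_log_rr)
    return post_mean, math.sqrt(post_var)

# Hypothetical prior and data; a different prior gives a different posterior.
post_mean, post_sd = posterior_log_rr(0.0, 0.5, math.log(2.0), 0.25)

z = 1.959964  # 97.5th percentile of the standard normal
interval = (math.exp(post_mean - z * post_sd), math.exp(post_mean + z * post_sd))
prob_rr_at_most_1 = normal_cdf((0.0 - post_mean) / post_sd)

print(f"95% posterior interval for RR: {interval[0]:.2f} to {interval[1]:.2f}")
print(f"posterior probability that RR <= 1: {prob_rr_at_most_1:.4f}")
```

Only an interval produced this way, with its prior stated, can be read as a range within which the true value lies with 95% probability. Changing the assumed prior changes the interval, which is why a frequentist interval computed without any prior has no such general interpretation.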

Many familiar expressions - some employing probabilistic language, others avoiding it - have the effect of leading us into this misinterpretation. It has been said that being located inside a 95% confidence interval makes values plausible, probable, likely, reasonably included by the data, or even possible. Values exterior to 95% confidence intervals have been said to be implausible, improbable, unlikely, reasonably excluded by the data, or even ruled out. None of these variations on a rhetorical theme can change a simple fact of statistical life: If we want to know which values are more and less likely, more and less plausible, etc., we must specify prior probabilities for those values and use Bayes’s theorem to update those probabilities when new data are in hand.

It has become increasingly clear that the null P-value (hereafter called “the” P-value) does not do a very good job of the task for which it was originally intended: to quantify the statistical evidence against the null hypothesis. The reason is simple. The familiar Type I and Type II error rates upon which Neyman and Pearson taught us to focus 16,17 beg vitally important questions.

One minus the Type I error rate is the specificity of a significance test: the probability of not declaring “significance” when the null hypothesis is true. One minus the Type II error rate is the test’s power or sensitivity: the probability of declaring “significance” when the alternative hypothesis is true. No informed patient would be satisfied with a diagnostic test result knowing only the test’s specificity and sensitivity. That patient would want to know the test’s predictive value (positive or negative, depending on the result).

Significance tests are no different. In the same frequency terms that Neyman and Pearson used, 16,17 the researcher who wishes to be fully informed should be interested in questions such as the following: How often is the null hypothesis true when we fail to reject it? When we do reject the null hypothesis, how often is the alternative hypothesis true? These are the probabilities of ultimate concern in significance testing – the predictive values of “NS” and “S.” There is no way to determine them without postulating (stated again in frequency terms) how often the null and alternative hypotheses are true. The interest many epidemiologists express in how low the P-value is, if it is lower than 0.05, 18 raises still other questions. How much evidence against the null hypothesis do we have when P = 0.04, or when P = 0.001? To answer these questions, we need to consider the probabilities under the null and alternative hypotheses of obtaining these particular P-values, not just the probabilities of obtaining P < 0.05.
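The diagnostic-test analogy can be worked through with Bayes's theorem. The sketch below assumes a simplified setting with one null and one alternative hypothesis, and it takes as an input the postulated frequency with which the null is true. The specific numbers (null true 90% of the time, alpha of 0.05, power of 0.80) are hypothetical.

```python
def predictive_values(prior_null, alpha, power):
    """Predictive values of "S" and "NS" for a significance test.

    prior_null: postulated frequency with which the null hypothesis is true
    alpha:      Type I error rate (1 - specificity)
    power:      1 - Type II error rate (sensitivity)
    """
    prior_alt = 1.0 - prior_null
    prob_s = alpha * prior_null + power * prior_alt   # overall frequency of "S"
    prob_ns = 1.0 - prob_s                            # overall frequency of "NS"
    alt_given_s = power * prior_alt / prob_s          # how often the alternative is true after "S"
    null_given_ns = (1.0 - alpha) * prior_null / prob_ns  # how often the null is true after "NS"
    return alt_given_s, null_given_ns

# Hypothetical inputs, chosen only to illustrate the calculation.
alt_given_s, null_given_ns = predictive_values(prior_null=0.9, alpha=0.05, power=0.8)
print(f"P(alternative true | S)  = {alt_given_s:.3f}")
print(f"P(null true | NS)        = {null_given_ns:.3f}")
```

With these inputs, the alternative is true in only about 64% of "significant" results, even though the test has 95% specificity and 80% sensitivity. The result moves with the postulated frequency of true nulls, which is exactly the quantity the error rates alone leave unspecified.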

Statisticians who have examined these questions in detail 19–26 have found, under widely ranging conditions, that P-values on the order of 0.05, 0.01, and even lower provide much less evidence against the null hypothesis than they appear to provide at face value. As a general matter, P-values in the vicinity of 0.05 provide almost no evidence against the null hypothesis at all. P = 0.04, for instance, is typically found to be almost equally probable under the null and alternative hypotheses.

One upshot of this work has been a statistical research program devoted to calibrating, standardizing, conditioning, or adjusting low P-values to make them higher, so that they reflect more realistically the limited statistical evidence they provide against the null hypothesis. 19–26 Now that Bayesian methods are computationally feasible, one wonders whether these efforts to patch up P-values will ultimately be viewed as a transitional stopgap.
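As an illustration of what such a calibration looks like, the sketch below uses the bound -e · p · ln(p), which is, to the best of my understanding, the calibration proposed in reference 24 for P-values below 1/e. The formula is brought in from that literature and is not stated in the text above. The equal prior odds used by default are an assumption for illustration.

```python
import math

def min_bayes_factor(p):
    """Lower bound on the Bayes factor for the null versus the alternative: -e * p * ln(p).

    Valid only for 0 < p < 1/e; smaller values mean more evidence against the null.
    """
    if not 0.0 < p < 1.0 / math.e:
        raise ValueError("the bound applies only for 0 < p < 1/e")
    return -math.e * p * math.log(p)

def min_posterior_null(p, prior_null=0.5):
    """Smallest posterior probability of the null implied by the bound, for a given prior."""
    prior_odds = prior_null / (1.0 - prior_null)
    posterior_odds = prior_odds * min_bayes_factor(p)
    return posterior_odds / (1.0 + posterior_odds)

for p in (0.05, 0.01, 0.001):
    print(f"P = {p}: Bayes factor >= {min_bayes_factor(p):.3f}, "
          f"posterior P(null) >= {min_posterior_null(p):.3f}")
```

Under this bound and equal prior odds, P = 0.05 leaves the null hypothesis with a posterior probability of at least about 0.29, far higher than the 0.05 a naive reading would suggest.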

Taking Precision Seriously

Transitional stopgaps should not be dismissed lightly, especially when the transitions in question take decades to unfold. Stopgaps can be particularly valuable when it seems that the only alternative is to cry in the (frequentist) wilderness for a (Bayesian) revolution. In epidemiology, the advent of confidence intervals creates an opportunity to take another small step toward more widespread use of Bayesian methods, while at the same time improving overall interpretation. This step is merely to take precision seriously.

Epidemiologists have many reasons to emphasize certain results over others. Some results may pertain to particularly topical research questions. Some may be more valid than others. And some may be less influenced by random error. This last consideration seems to be an important one to many epidemiologists, who regularly use P-values to determine the degree to which chance influences their results. They believe that the lower the P-value, the less the influence of chance. Unfortunately, this extremely common use of the P-value is a misuse and an abuse of that statistic. The estimates least influenced by chance are not those with low P-values, but those with narrow confidence intervals.

Consider the four hypothetical relative risk estimates in Table 1. The ratio of the upper to lower 95% confidence limits (CLR) is a handy measure of confidence interval width, and thus of precision. (For a difference measure such as the risk difference, the difference between the upper and lower confidence limits would serve the same purpose.) The example was devised to dramatize four clear-cut combinations of statistical “significance” and precision.

Table 1

To the extent that the role of chance would be taken into account in deciding which of these results to emphasize, the conventional choices would be the statistically “significant” estimates B and C. These would be the “associations unlikely to be due to chance alone.” But one of them, estimate C, is very unstable. That estimate is influenced much more by random error, and from that standpoint is much less dependable, than estimate B.

Of equal importance, when C is compared with D, estimate C is influenced much more by chance and in that regard is much less trustworthy, even though estimate C is statistically “significant” and estimate D is not. Estimates B and D – not B and C – are this study’s most precise estimates. Estimates B and D stand the best chance of holding up, conditional on their validity, in the context of existing and future research. Estimates B and D would weigh more heavily into meta-analyses and would exert stronger influences on probability distributions in properly conducted Bayesian analyses. Estimates B and D are the results that should be put forth for emphasis as the most statistically stable results this study has to offer.
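The contrast between ranking by P-value and ranking by precision can be sketched in a few lines. The four estimates below are invented stand-ins and are not the values in Table 1. They are named for the combination of "significance" and precision each represents. The null P-value is recovered from the confidence limits under an assumed Wald approximation on the log scale.

```python
import math

Z_CRIT = 1.959964  # 97.5th percentile of the standard normal

def clr(lower, upper):
    """Confidence limit ratio: upper 95% limit divided by lower 95% limit."""
    return upper / lower

def null_p_from_limits(rr, lower, upper):
    """Approximate two-sided null P-value, with the standard error inferred from the limits."""
    se_log_rr = math.log(upper / lower) / (2 * Z_CRIT)
    z = math.log(rr) / se_log_rr
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical (RR, lower limit, upper limit); NOT the values from Table 1.
estimates = {
    "significant_precise":      (1.20, 1.10, 1.30),
    "significant_imprecise":    (6.00, 1.20, 30.0),
    "nonsignificant_precise":   (1.05, 0.95, 1.16),
    "nonsignificant_imprecise": (2.00, 0.40, 10.0),
}

by_p_value = sorted(estimates, key=lambda k: null_p_from_limits(*estimates[k]))
by_precision = sorted(estimates, key=lambda k: clr(estimates[k][1], estimates[k][2]))

for name, (rr, lo, hi) in estimates.items():
    print(f"{name}: RR = {rr}, CLR = {clr(lo, hi):.2f}, P = {null_p_from_limits(rr, lo, hi):.3f}")
print("lowest P-values first:", by_p_value[:2])
print("narrowest intervals first:", by_precision[:2])
```

Sorting by P-value puts the two "significant" estimates first, including the one whose limits differ by a factor of 25. Sorting by CLR puts the two precise estimates first regardless of "significance".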

It is sometimes said that confidence intervals are especially valuable, and that increases in sample size and statistical efficiency are particularly needed, when statistical “significance” has not been attained. To the contrary, an estimate that has a wide confidence interval is imprecise and unstable no matter how low its P-value. Based solely on the results in Table 1, larger sample sizes, special study populations and statistically more efficient designs would be particularly desirable for A and C, regardless of the fact that one of these estimates is statistically “significant” and the other is not.

Some epidemiologists wonder what all the fuss over P-values and confidence intervals is about. This hypothetical example shows how an emphasis on precision rather than statistical “significance” can affect which results we may choose to highlight. I invite the reader to examine published research reports in which the estimates with the lowest P-values have been singled out for emphasis, and to imagine how differently those papers would read if the estimates with the narrowest confidence intervals had been highlighted instead.



Our results that deserve the greatest reliance are those that are most stable and trustworthy. With regard to random error, a very poor way of identifying dependable results is to select associations with impressively low P-values. Inference and decision-making would be far better served by choosing estimates with narrow confidence intervals, which are least vulnerable to the play of chance. These are the results for which, by virtue of intentional or accidental features of our research methods, our studies provide the most evidence (as distinguished from the most valid evidence).

By taking precision seriously, we can easily identify those research questions on which our studies provide the greatest quantity of statistical evidence, and those questions for which larger and more statistically efficient studies are needed. In terms of resistance to random error, our most durable results are our most precise estimates - however unspectacular, unsensational, and “non-significant” many of those estimates might be.

References


1. Berkson J. Some difficulties of interpretation encountered in the application of the chi-squared test. J Am Stat Assoc 1938; 33: 526–536.

2. Anderson DR, Burnham KP, Thompson WL. Null hypothesis testing: problems, prevalence, and an alternative. J Wildlife Management 2000; 64: 912–923.

3. Walter SD. Methods of reporting statistical results from medical research studies. Am J Epidemiol 1995; 141: 896–906.

4. Greenland S. Randomization, statistics, and causal inference. Epidemiology 1990; 1: 421–429.

5. Poole C, Lanes SF, Rothman KJ. Analyzing data from ordered categories (letter). N Engl J Med 1984; 311: 1382.

6. Rothman KJ. A show of confidence. N Engl J Med 1978; 299: 1362–1363.

7. Savitz D, Tolo K-A, Poole C. Statistical significance testing in the American Journal of Epidemiology 1970 to 1990. Am J Epidemiol 1994; 139: 1047–1052.

8. Poole C. Beyond the confidence interval. Am J Public Health 1987; 77: 195–199.

9. Poole C. Confidence intervals exclude nothing. Am J Public Health 1987; 77: 492–493.

10. Lindley DV. The philosophy of statistics (with discussion). The Statistician 2000; 49: 293–337.

11. Savitz DA, Olshan AF. Describing data requires no adjustment for multiple comparisons: a reply from Savitz and Olshan. Am J Epidemiol 1998; 147: 813–814.

12. Poole C. Induction does not exist in epidemiology, either. In: Rothman KJ (ed), Causal Inference. Chestnut Hill, MA: Epidemiology Resources Inc., 1988: 153–162.

13. Greenland S. Probability logic and probabilistic induction. Epidemiology 1998; 9: 322–332.

14. Gelman AB, Carlin JS, Stern HS, Rubin DB. Bayesian data analysis. Boca Raton, FL: Chapman & Hall/CRC, 1995; 42–45.

15. Berry DA. Statistics: a Bayesian perspective. Belmont, CA: Duxbury Press, 1996; 147–161.

16. Neyman J, Pearson E. On the use and interpretation of certain test criteria for purposes of statistical inference. Biometrika 1928; 20: 175–240.

17. Neyman J, Pearson E. On the problem of the most efficient tests of statistical hypotheses. Philos Trans R Soc Lond A 1933; 231: 289–337.

18. Goodman SN. p values, hypothesis tests, and likelihood: implications for epidemiology of a neglected historical debate (with discussion). Am J Epidemiol 1993; 137: 485–501.

19. Casella G, Berger R. Reconciling Bayesian and frequentist evidence in the one-sided testing problem (with discussion). J Am Stat Assoc 1987; 82: 106–111.

20. Berger JO, Sellke T. Testing a point null hypothesis: the irreconcilability of P-values and evidence. J Am Stat Assoc 1987; 82: 112–122.

21. Berger JO, Delampady M. Testing precise hypotheses (with discussion). Stat Sci 1987; 3: 317–352.

22. Delampady M, Berger JO. Lower bounds on Bayes factors for the multinomial distribution, with application to chi-squared tests of fit. Ann Stat 1990; 18: 1295–1316.

23. Berger J, Boukai B, Wang Y. Unified frequentist and Bayesian testing of a precise hypothesis (with discussion). Stat Sci 1997; 12: 133–160.

24. Sellke T, Bayarri MJ, Berger J. Calibration of P-values for testing precise null hypotheses. Am Stat 2001; 55: 62–71.

25. Goodman SN. Towards evidence-based medical statistics. I. The P-value fallacy. Ann Intern Med 1999; 130: 995–1004.

26. Goodman SN. Towards evidence-based medical statistics. II. The Bayes factor. Ann Intern Med 1999; 130: 1005–1013.

Cited By:

This article has been cited 76 time(s).

Journal De Gynecologie Obstetrique Et Biologie De La Reproduction
Impact of chemical and physical environmental factors on the course and outcome of pregnancy
Slama, R; Cordier, S
Journal De Gynecologie Obstetrique Et Biologie De La Reproduction, 42(5): 413-444.
American Scientist
Do We Really Need the S-word?
Higgs, MD
American Scientist, 101(1): 6-9.

Accident Analysis and Prevention
Graduated driver licensing program component calibrations and their association with fatal crash involvement
Masten, SV; Foss, RD; Marshall, SW
Accident Analysis and Prevention, 57(): 105-113.
Journal of Business Research
From significant difference to significant sameness: Proposing a paradigm shift in business research
Hubbard, R; Lindsay, RM
Journal of Business Research, 66(9): 1377-1388.
New England Journal of Medicine
A randomized trial comparing radical prostatectomy with watchful waiting in early prostate cancer
Holmberg, L; Bill-Axelson, A; Helgesen, F; Salo, JO; Folmerz, P; Haggman, M; Andersson, S; Spangberg, A; Busch, C; Nordling, S; Palmgren, J; Adami, HO; Johansson, J; Norlen, BJ
New England Journal of Medicine, 347(): 781-789.

Nutrition and Cancer-An International Journal
Does physical activity modify the association between body mass index and colorectal adenomas?
Guilera, M; Connelly-Frost, A; Keku, TO; Martin, CF; Galanko, J; Sandier, RS
Nutrition and Cancer-An International Journal, 51(2): 140-145.

Archives of Neurology
Apolipoprotein e and dementia in Parkinson disease - A meta-analysis
Huang, XM; Chen, P; Kaufer, DI; Troster, AI; Poole, C
Archives of Neurology, 63(2): 189-193.

Plos Medicine
Selection in reported epidemiological risks: An empirical assessment
Kavvoura, FK; Liberopoulos, G; Ioannidis, JPA
Plos Medicine, 4(3): 456-465.
ARTN e79
American Journal of Industrial Medicine
Ergonomic Risk Factors for Low Back Pain in North Carolina Crab Pot and Gill Net Commercial Fishermen
Kucera, KL; Loomis, D; Lipscomb, HJ; Marshall, SW; Mirka, GA; Daniels, JL
American Journal of Industrial Medicine, 52(4): 311-321.
Journal of Health Communication
Knowledge of Human Papillomavirus: Differences by Self-Reported Treatment for Genital Warts and Sociodemographic Characteristics
Koshiol, J; Rutten, LF; Moser, RP; Hesse, N
Journal of Health Communication, 14(4): 331-345.
British Journal of Clinical Pharmacology
Confidence intervals, misclassification of exposure and risk of suicide in users of beta-adrenoceptor blockers: a reply
Sorensen, HT; Mellemkjaer, L; Olsen, JH
British Journal of Clinical Pharmacology, 53(4): 407-408.

Fertility and Sterility
The randomized world is not without its imperfections: reflections on the Women's Health Initiative Study
McDonough, PG
Fertility and Sterility, 78(5): 951-956.
PII S0015-0282(02)04403-5
International Journal of Epidemiology
Commentary: This study failed?
Poole, C; Peters, U; Il'yasova, D; Arab, L
International Journal of Epidemiology, 32(4): 534-535.
Epidemiology and Infection
Analysis of the FoodNet case-control study of sporadic Salmonella serotype Enteritidis infections using persons infected with other Salmonella serotypes as the comparison group
Voetsch, AC; Poole, C; Hedberg, CW; Hoekstra, RM; Ryder, RW; Weber, DJ; Angulo, FJ
Epidemiology and Infection, 137(3): 408-416.
Regulatory Toxicology and Pharmacology
Statistical power in the analyses of brain weight measures in pesticide neurotoxicity testing and the relationship between brain and body weight
Weichenthal, S; Hancock, S; Raffaele, K
Regulatory Toxicology and Pharmacology, 57(): 235-240.
Marine Ecology-Progress Series
Assessing impacts of dredge spoil disposal using equivalence tests: implications of a precautionary (proof of safety) approach
Cole, R; McBride, G
Marine Ecology-Progress Series, 279(): 63-72.

International Journal of Epidemiology
A method to automate probabilistic sensitivity analyses of misclassified binary variables
Fox, MP; Lash, TL; Greenland, S
International Journal of Epidemiology, 34(6): 1370-1376.
American Journal of Gastroenterology
Irritable Bowel syndrome: A co-twin control analysis
Wojczynski, MK; North, KE; Pedersen, NL; Sullivan, PF
American Journal of Gastroenterology, 102(): 2220-2229.
Jama-Journal of the American Medical Association
Genetic susceptibility to cancer - The role of polymorphisms in candidate genes
Dong, LM; Potter, JD; White, E; Ulrich, CM; Cardon, LR; Peters, U
Jama-Journal of the American Medical Association, 299(): 2423-2436.

Association Between p53 codon 72 Genetic Polymorphism and Tobacco Use and Lung Cancer Risk
Caceres, DD; Quinones, LA; Schroeder, JC; Gil, LD; Irarrazabal, CE
Lung, 187(2): 110-115.
Otjr-Occupation Participation and Health
Confidence Levels Can Broaden the Application of Clinical Research Findings and Promote Evidence-Based Practice
Graham, JE; Reistetter, TA; Mallinson, TR; Ottenbacher, KJ
Otjr-Occupation Participation and Health, 29(3): 99-104.
International Journal of Epidemiology
Response: Bayesian perspectives for epidemiological research
Greenland, S
International Journal of Epidemiology, 35(3): 777-778.
Archives of Pediatrics & Adolescent Medicine
Reporting statistical information in medical journal articles
Cummings, P; Rivara, FP
Archives of Pediatrics & Adolescent Medicine, 157(4): 321-324.

Journal of Cataract and Refractive Surgery
Beyond the P - I: Problems with probability
Wilhelmus, KR
Journal of Cataract and Refractive Surgery, 30(9): 2005-2006.
American Journal of Epidemiology
Effect of deployment on the occurrence of child maltreatment in military and nonmilitary families
Rentz, ED; Marshall, SW; Loomis, D; Casteel, C; Martin, SL; Gibbs, DA
American Journal of Epidemiology, 165(): 1199-1206.
Journal of Athletic Training
Concentric and Eccentric Torque of the Hip Musculature in Individuals With and Without Patellofemoral Pain
Boling, MC; Padua, DA; Creighton, RA
Journal of Athletic Training, 44(1): 7-13.

Annals of Emergency Medicine
Reporting research results: Recommendations for improving communication
Cooper, RJ; Wears, RL; Schriger, DL
Annals of Emergency Medicine, 41(4): 561-564.
Occupational and Environmental Medicine
Literature review of cancer mortality and incidence among dentists
Simning, A; van Wijngaarden, E
Occupational and Environmental Medicine, 64(7): 432-438.
Injury Prevention
Hospitalized head injuries among older people in Australia, 1998/1999 to 2004/2005
Jamieson, LM; Roberts-Thomson, KF
Injury Prevention, 13(4): 243-247.
Oral Diseases
Design and statistical analysis of oral medicine studies: common pitfalls
Baccaglini, L; Shuster, JJ; Cheng, J; Theriaque, DW; Schoenbach, VJ; Tomar, SL; Poole, C
Oral Diseases, 16(3): 233-241.
Cancer Causes & Control
Occupation and the risk of adult glioma in the United States
De Roos, AJ; Stewart, PA; Linet, MS; Heineman, EF; Dosemeci, M; Wilcosky, T; Shapiro, WR; Selker, RG; Fine, HA; Black, PM; Inskip, PD
Cancer Causes & Control, 14(2): 139-150.

Proceedings of the Royal Society of London Series B-Biological Sciences
In support of null hypothesis significance testing
Mogie, M
Proceedings of the Royal Society of London Series B-Biological Sciences, 271(): S82-S84.
Cancer Causes & Control
Choice of exposure scores for categorical regression in meta-analysis: a case study of a common problem
Il'yasova, D; Hertz-Picciotto, I; Peters, U; Berlin, JA; Poole, C
Cancer Causes & Control, 16(4): 383-388.
Journal of Science and Medicine in Sport
Testing with confidence: The use (and misuse) of confidence intervals in biomedical research
Marshall, SW
Journal of Science and Medicine in Sport, 7(2): 135-137.

Oral Surgery Oral Medicine Oral Pathology Oral Radiology and Endodontics
MTA and IRM as root-end filling materials in endodontic surgery
Dietrich, T
Oral Surgery Oral Medicine Oral Pathology Oral Radiology and Endodontics, 102(2): 143-144.
Occupational and Environmental Medicine
A case crossover study of triggers for hand injuries in commercial fishing
Kucera, KL; Loomis, D; Marshall, SW
Occupational and Environmental Medicine, 65(5): 336-341.
Emerging Infectious Diseases
Sulfa use, dihydropteroate synthase mutations, and Pneumocystis jirovecii pneumonia
Stein, CR; Poole, C; Kazanjian, P; Meshnick, SR
Emerging Infectious Diseases, 10(): 1760-1765.

British Journal of Sports Medicine
Injury history as a risk factor for incident injury in youth soccer
Kucera, KL; Marshall, SW; Kirkendall, DT; Marchak, PM; Garrett, WE
British Journal of Sports Medicine, 39(7): 462-466.
Journal of Science and Medicine in Sport
Locke, S
Journal of Science and Medicine in Sport, 9(): 190-191.
Environmental Health Perspectives
Evaluation of serum immunoglobulins among individuals living near six superfund sites
Williamson, DM; White, MC; Poole, C; Kleinbaum, D; Vogt, R; North, K
Environmental Health Perspectives, 114(7): 1065-1071.
Journal of Athletic Training
Athlete characteristics and outcome scores for computerized neuropsychological assessment: A preliminary analysis
Brown, CN; Guskiewicz, KM; Bleiberg, J
Journal of Athletic Training, 42(4): 515-523.

Natural Areas Journal
Assessing the reliability of ecological monitoring data: Power analysis and alternative approaches
Morrison, LW
Natural Areas Journal, 27(1): 83-91.

Theoretical Medicine and Bioethics
A problem for achieving informed choice
La Caze, A
Theoretical Medicine and Bioethics, 29(4): 255-265.
Environmental Pollution
Cumulative effects and threshold levels in air pollution mortality: Data analysis of nine large US cities using the NMMAPS dataset
Stylianou, M; Nicolich, MJ
Environmental Pollution, 157(): 2216-2223.
Statistical significance tests: Equivalence and reverse tests should reduce misinterpretation
Parkhurst, DF
Bioscience, 51(): 1051-1057.

Journal of Cataract and Refractive Surgery
Beyond the P IV: Gain confidence in confidence intervals
Wilhelmus, KR
Journal of Cataract and Refractive Surgery, 30(): 2618-2619.
International Journal of Epidemiology
Commentary: Vitamin supplement use and confounding by lifestyle
Hoggatt, KJ
International Journal of Epidemiology, 32(4): 553-555.
American Journal of Industrial Medicine
Upper-Extremity Musculoskeletal Symptoms and Physical Health Related Quality of Life Among Women Employed in Poultry Processing and Other Low-Wage jobs in Northeastern North Carolina
McPhee, CS; Lipscomb, HJ
American Journal of Industrial Medicine, 52(4): 331-340.
European Journal of Epidemiology
The ongoing tyranny of statistical significance testing in biomedical research
Stang, A; Poole, C; Kuss, O
European Journal of Epidemiology, 25(4): 225-230.
New England Journal of Medicine
Deployment and the Use of Mental Health Services among US Army Wives
Mansfield, AJ; Kaufman, JS; Marshall, SW; Gaynes, BN; Morrissey, JP; Engel, CC
New England Journal of Medicine, 362(2): 101-109.

American Journal of Industrial Medicine
Predictors of Delayed Return to Work After Back Injury: A Case-Control Analysis of Union Carpenters in Washington State
Kucera, KL; Lipscomb, HJ; Silverstein, B; Cameron, W
American Journal of Industrial Medicine, 52: 821-830.
Environmental Health Perspectives
Estimating Error in Using Residential Outdoor PM2.5 Concentrations as Proxies for Personal Exposures: A Meta-analysis
Avery, CL; Mills, KT; Williams, R; McGraw, KA; Poole, C; Smith, RL; Whitsel, EA
Environmental Health Perspectives, 118(5): 673-678.
American Journal of Gastroenterology
Constipation, laxative use, and colon cancer in a North Carolina population
Roberts, MC; Millikan, RC; Galanko, JA; Martin, C; Sandler, RS
American Journal of Gastroenterology, 98(4): 857-864.
Otolaryngology-Head and Neck Surgery
How to review journal manuscripts
Rosenfeld, RM
Otolaryngology-Head and Neck Surgery, 142(4): 472-486.
Luteinizing hormone beta polymorphism and risk of familial and sporadic prostate cancer
Elkins, DA; Yokomizo, A; Thibodeau, SN; Schaid, DJ; Cunningham, JM; Marks, A; Christensen, E; McDonnell, SK; Slager, S; Peterson, BJ; Jacobsen, SJ; Cerhan, JR; Blute, ML; Tindall, DJ; Liu, WG
Prostate, 56(1): 30-36.
Archives of Pediatrics & Adolescent Medicine
Writing informative abstracts for journal articles
Cummings, P; Rivara, FP; Koepsell, TD
Archives of Pediatrics & Adolescent Medicine, 158: 1086-1088.

International Journal of Epidemiology
Interval estimation by simulation as an alternative to and extension of confidence intervals
Greenland, S
International Journal of Epidemiology, 33(6): 1389-1397.
American Journal of Community Psychology
Area-Based Socioeconomic Characteristics of Industries at High Risk for Violence in the Workplace
Ta, ML; Marshall, SW; Kaufman, JS; Loomis, D; Casteel, C; Land, KC
American Journal of Community Psychology, 44: 249-260.
Occupational and Environmental Medicine
Age at exposure to ionising radiation and cancer mortality among Hanford workers: follow up through 1994
Wing, S; Richardson, DB
Occupational and Environmental Medicine, 62(7): 465-472.
Sports Medicine
Mouthguards in sport activities: history, physical properties and injury prevention effectiveness
Knapik, JJ; Marshall, SW; Lee, RB; Darakjy, SS; Jones, SB; Mitchener, TA; Delacruz, GG; Jones, BH
Sports Medicine, 37(2): 117-144.

Basic & Clinical Pharmacology & Toxicology
Establishing evidence for early action: The prevention of reproductive and developmental harm
Gee, D
Basic & Clinical Pharmacology & Toxicology, 102(2): 257-266.
American Journal of Sports Medicine
A Prospective Investigation of Biomechanical Risk Factors for Patellofemoral Pain Syndrome: The Joint Undertaking to Monitor and Prevent ACL Injury (JUMP-ACL) Cohort
Boling, MC; Padua, DA; Marshall, SW; Guskiewicz, K; Pyne, S; Beutler, A
American Journal of Sports Medicine, 37: 2108-2116.
American Journal of Tropical Medicine and Hygiene
PFMDR1 and in vivo resistance to artesunate-mefloquine in falciparum malaria on the Cambodian-Thai border
Alker, AP; Lim, P; Sem, R; Shah, NK; Yi, P; Bouth, DM; Tsuyuoka, R; Maguire, JD; Fandeur, T; Ariey, F; Wongsrichanalai, C; Meshnick, SR
American Journal of Tropical Medicine and Hygiene, 76(4): 641-647.

Journal of Toxicology and Environmental Health-Part B-Critical Reviews
Pesticide Exposure and Neurodevelopmental Outcomes: Review of the Epidemiologic and Animal Studies
Burns, CJ; McIntosh, LJ; Mink, PJ; Jurek, AM; Li, AA
Journal of Toxicology and Environmental Health-Part B-Critical Reviews, 16: 127-283.
BMC Health Services Research
Understanding and benchmarking health service achievement of policy goals for chronic disease
Bell, E; Seidel, B
BMC Health Services Research, 12: article 343.
Statistical Methods in Medical Research
Performance of a semi-automated approach for risk estimation using a common data model for longitudinal healthcare databases
Le, HV; Beach, KJ; Powell, G; Pattishall, E; Ryan, P; Mera, RM
Statistical Methods in Medical Research, 22(1): 97-112.
Journal of Experimental & Theoretical Artificial Intelligence
Significance tests or confidence intervals: which are preferable for the comparison of classifiers?
Berrar, D; Lozano, JA
Journal of Experimental & Theoretical Artificial Intelligence, 25(2): 189-206.
BMC Public Health
The evidence-policy divide: a 'critical computational linguistics' approach to the language of 18 health agency CEOs from 9 countries
Bell, E; Seidel, BM
BMC Public Health, 12: article 932.
Clinical Journal of Sport Medicine
Risk Factors and Risk Statistics for Sports Injuries
Hopkins, WG; Marshall, SW; Quarrie, KL; Hume, PA
Clinical Journal of Sport Medicine, 17(3): 208-210.
Intake of Vitamin C and Zinc and Risk of Common Cold: A Cohort Study
Takkouche, B; Regueira-Méndez, C; García-Closas, R; Figueiras, A; Gestal-Otero, JJ
Epidemiology, 13(1): 38-44.

Heuristic Thinking and Inference From Observational Epidemiology
Lash, TL
Epidemiology, 18(1): 67-72.
The Value of P
The Editors
Epidemiology, 12(3): 286.

Semi-Automated Sensitivity Analysis to Assess Systematic Errors in Observational Data
Lash, TL; Fink, AK
Epidemiology, 14(4): 451-458.
The Value of Risk-Factor (“Black-Box”) Epidemiology
Greenland, S; Gago-Dominguez, M; Castelao, JE
Epidemiology, 15(5): 529-535.
Some Guidelines on Guidelines: They Should Come With Expiration Dates
Rothman, KJ; Poole, C
Epidemiology, 18(6): 794-796.
Genetics in Medicine
Angiotensin II type 1 receptor polymorphisms and susceptibility to hypertension: A HuGE review
Mottl, AK; Shoham, DA; North, KE
Genetics in Medicine, 10(8): 560-574.

© 2001 Lippincott Williams & Wilkins, Inc.
