Put the Kibosh On Bias

Hartung, John PhD

Journal of Neurosurgical Anesthesiology: October 2019 - Volume 31 - Issue 4 - p 359–360
doi: 10.1097/ANA.0000000000000634

Department of Anesthesiology, State University of New York Downstate Health Sciences University, Brooklyn, NY

The author has no funding or conflicts of interest to disclose.

This is an open-access article distributed under the terms of the Creative Commons Attribution-Non Commercial-No Derivatives License 4.0 (CC BY-NC-ND), where it is permissible to download and share the work provided it is properly cited. The work cannot be changed in any way or used commercially without permission from the journal.

The Bureau of Labor Statistics counted 30,590 anesthesiologists working in the United States during 2017,1 and 38,600 surgeons2 (data current until the mid-2020 publication of results for 2018). The Institute for Scientific Information reported 31 journals categorized under Anesthesiology in 2017, with a median Impact Factor of 2.56, compared with 200 journals categorized under Surgery, with a median Impact Factor of 1.81.3 Adjusted for the number of practitioners, surgeons had five times as many journals as anesthesiologists, with a median Impact Factor diluted by roughly 30%.
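The per-practitioner adjustment can be reproduced directly from the BLS and ISI figures quoted above (a quick arithmetic check, not part of the original analysis):

```python
# Journals per practitioner, from the figures quoted above.
anesthesiologists, anesth_journals, anesth_if = 30_590, 31, 2.56
surgeons, surg_journals, surg_if = 38_600, 200, 1.81

anesth_ratio = anesth_journals / anesthesiologists  # journals per anesthesiologist
surg_ratio = surg_journals / surgeons               # journals per surgeon

print(round(surg_ratio / anesth_ratio, 1))  # → 5.1: five times as many journals
print(round(1 - surg_if / anesth_if, 2))    # → 0.29: Impact Factor diluted ~30%
```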

In 1988, I asked Mary Rogers, then Acquisitions Editor for Raven Press (JNA’s original publisher) and subsequently Chief Executive of Wolters Kluwer Healthcare Publishers (JNA’s current publisher), why scientists submit the proceeds of their labor—their manuscripts—to medical journals for free (unlike “trade” journal authors, who receive payment). Mary looked down for a moment before raising her head to give a one-word answer: “Ego.” Surgeons are not known for having underdeveloped egos, so I take the statistics above as support for Mary Rogers’ insight.

Mary knew what had taken me years to learn as my department’s in-house statistician. Yes, research is driven by ambition for status and money, but it is also driven by colleagues who are inspired by an answer, often their own answer, to a pressing question—an answer that they believe with conviction, if not vehemence, prior to designing a research protocol. Real scientists devise ways to stress-test their hypotheses, but pseudo-scientists use their considerable aptitude to “prove” their hypotheses. In consequence, much of what passes for empiricism is driven by answers instead of being driven by questions.

John Ioannidis, perhaps the world’s most insightful analyst of statistical analyses, has estimated that up to 90% of medical research is seriously flawed because bias in favor of preferred results “is unavoidable and people should take that for granted when they … read other scientists’ work.”4–6 So, if bias is an intractable aspect of human nature, but Charles Darwin was right about another aspect of human nature—that “… false views, if supported by some evidence, do little harm, for everyone takes a salutary pleasure in proving their falseness”7—why is the peer-review process so ineffective?8,9

Twenty-nine years as the Associate Editor of JNA (1989-2017) taught me that the review process is, at its base, vastly more effective than it appears to be. The integrity and commitment of reviewers who provide expert commentary is well known to editors and authors.10 However, reviewers’ assessments are based on the manuscripts presented to them, and they accept an abundance of flawed research for publication because so many manuscripts are clever revisions of prior versions that were rejected by their authors’ first-choice journal before being accepted by their authors’ second- or third-choice journal … or, if the ratio of authors to subject-appropriate journals is high enough, by their fourth- or fifth-choice journal. This flaw in the peer-review process could be remedied by a mechanical fix.

The opportunity to finesse methodological shortcomings, eliminate secondary results that invite questions about primary outcomes, develop new primary outcomes, and change results that raised reviewers’ eyebrows in the first submission of a manuscript, would be nearly impossible if every manuscript were forwarded to every journal through a central clearing house—a clearing house that would forward each submission’s tracked history, including all prior versions, all reviews of all prior versions, and all prior editorial correspondence with each submission of each manuscript—all without making any evaluation of manuscripts’ quality.

That could be accomplished if the clearing house retained a digital copy of every submission and used modified plagiarism software to compare each manuscript with every other manuscript that had been submitted through the clearing house. The comparison would determine whether a manuscript is what its authors purport it to be—a first submission of a new manuscript, a revision of a manuscript being sent to the same journal, or a revision of a manuscript being sent to a different journal—as distinct from an attempt to submit a revision of a rejected manuscript to a different journal without including its history. If editors and reviewers received every submission’s complete history, they could distinguish between a manuscript that should be rejected for the same reasons it was rejected before and a manuscript that should be considered because it received a boilerplate rejection letter from a higher-status journal—suggesting that the editor of that journal felt the manuscript would not receive enough citations to support that journal’s higher Impact Factor.

The key here would be encouraging journals to require authors to submit manuscripts through the clearing house and to agree to send reviews and editorial correspondence back to authors through the clearing house. Major journals (BMJ, NEJM, Lancet, JAMA, etc.) would likely agree readily, which would increase pressure on second-tier journals, and so on down the line. A few years after initiation, an incentive could be added regarding the speed of indexing at PubMed, with articles published in clearing house journals receiving priority. After a substantial majority of journals joined the clearing house procedure, the Institute for Scientific Information could notify journals that bibliometrics, like Impact Factors, would only be calculated for clearing house journals. Eventually, NIH/NSF funding could require authors to submit manuscripts that report funded results exclusively to journals that require submission through the clearing house. Ultimately, journals that operate outside of the clearing house would not be indexed in PubMed.

The immediate effect on investigators would be to make them invest more effort and thought in research design, because covering up design inadequacies in “doctored” revisions would become exceedingly difficult. The effect on publishers would be shrinkage of their journal portfolios, because authors’ practice of starting near the top and working their way down the Impact Factor scale, such that some version of almost every manuscript eventually gets published, would be greatly curtailed. Indeed, instead of 5,500-plus indexed medical journals, we might end up with fewer than 2,000. That would be a big payoff for science, not just because fewer journals would make literature searches more efficient, but because the literature searched would contain far less fake science.

Unlike current schemes for improving the quality of scientific literature (eg, trial registration, preregistration,11 preprints, open peer review), the power of the clearing house would derive from what it does not do and what it does not depend upon. It would not depend upon the integrity of investigators, authors, editors, reviewers, or publishers, and it would not pass judgment on the scientific merit of the manuscripts examined. It would be an essentially mechanical process that would, like plagiarism software, count the amount of overlap (identical sentences, identical paragraphs, identical authors) between previous submissions and new submissions. The clearing house, perhaps best operated by the National Library of Medicine in conjunction with the Thomson Reuters Institute for Scientific Information, would require human intervention only when a computer-flagged manuscript contained a conspicuous level of overlap. In an ever-diminishing number of cases, manuscripts flagged for human evaluation of overlap would need to be sent back to authors for explanations before being forwarded to their intended journal for editorial consideration.
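As a rough illustration of that mechanical process, the sketch below counts identical sentences and shared authors between two submissions and flags conspicuous matches for human review. The sentence-level comparison, the submission format, and the flagging threshold are all assumptions for illustration, not a description of any existing system.

```python
import re

def split_sentences(text: str) -> set[str]:
    """Crude sentence splitter; normalizes case and whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return {" ".join(p.lower().split()) for p in parts if p}

def overlap_report(new_sub: dict, prior_sub: dict, threshold: float = 0.5) -> dict:
    """Count overlap between two submissions and flag conspicuous matches.

    `threshold` (fraction of identical sentences) is an arbitrary
    illustrative value; a real system would tune it empirically.
    """
    new_sents = split_sentences(new_sub["text"])
    shared_sents = new_sents & split_sentences(prior_sub["text"])
    shared_authors = set(new_sub["authors"]) & set(prior_sub["authors"])
    fraction = len(shared_sents) / max(len(new_sents), 1)
    return {
        "shared_sentences": len(shared_sents),
        "shared_authors": sorted(shared_authors),
        "flag_for_review": fraction >= threshold and bool(shared_authors),
    }

# Hypothetical example: a lightly revised resubmission with one shared author.
new = {"text": "Propofol reduced ICP. Outcomes improved. A new framing here.",
       "authors": ["Smith", "Jones"]}
old = {"text": "Propofol reduced ICP. Outcomes improved. The old framing here.",
       "authors": ["Jones"]}
print(overlap_report(new, old))  # 2 shared sentences, author "Jones" → flagged
```

Everything upstream of the flag is mechanical string comparison, which is the point: no judgment of scientific merit enters until a human examines a flagged match.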

Science progresses at an impressive rate despite the adage “Odds Are, It’s Wrong.”12 Imagine the rate of progress that would follow from reversing the ratio of good/bad science from 25/75 to 75/25!

John Hartung, PhD

Department of Anesthesiology, State University of New York Downstate Health Sciences University, Brooklyn, NY



1. Bureau of Labor Statistics. Available at: Accessed June 12, 2019.
2. Bureau of Labor Statistics. Available at: Accessed June 12, 2019.
3. Clarivate Analytics: InCites Journal Citation Reports. Available at: Accessed June 12, 2019.
4. Ioannidis JP. Contradicted and initially stronger effects in highly cited clinical research. JAMA. 2005;294:218–228.
5. Ioannidis JP. Why most published research findings are false. PLoS Med. 2005;2:e124.
6. Ioannidis JP. Most research is flawed: let’s fix it. Medscape. 2019. Available at: Accessed June 14, 2019.
7. Darwin C. The Descent of Man. London: John Murray; 1879.
8. Murphy BD. Why scientific peer review is a sham. Waking Times. 2018. Available at: Accessed June 14, 2019.
9. Krumholz HM. The end of journals. Circ Cardiovasc Qual Outcomes. 2015;8:533–534.
10. Smith M. JNA Editorial—July 2018. J Neurosurg Anesthesiol. 2018;30:199.
11. Adam D. Psychology’s reproducibility solution fails first test. Science. 2019;364:813.
12. Siegfried T. Odds are, it’s wrong. ScienceNews. 2010. Available at: Accessed June 18, 2019.
Copyright © 2019 Wolters Kluwer Health, Inc. All rights reserved