Bernstam, Elmer V. MD, MSE; Hersh, William R. MD; Johnson, Stephen B. PhD; Chute, Christopher G. MD, DrPH; Nguyen, Hien MD, MAS; Sim, Ida MD, PhD; Nahm, Meredith MS; Weiner, Mark G. MD; Miller, Perry MD, PhD; DiLaura, Robert P. DBA, MBA; Overcash, Marc; Lehmann, Harold P. MD, PhD; Eichmann, David PhD; Athey, Brian D. PhD; Scheuermann, Richard H. PhD; Anderson, Nick PhD; Starren, Justin MD, PhD; Harris, Paul A. PhD; Smith, Jack W. MD, PhD; Barbour, Ed MS; Silverstein, Jonathan C. MD, MS; Krusch, David A. MD; Nagarajan, Rakesh MD, PhD; Becich, Michael J. MD, PhD; on behalf of the CTSA Biomedical Informatics Key Function Committee
Dr. Bernstam is associate professor of health information sciences and internal medicine, University of Texas Health Science Center at Houston, Houston, Texas.
Dr. Hersh is professor and chair, Department of Medical Informatics & Clinical Epidemiology, Oregon Health & Science University, Portland, Oregon.
Dr. Johnson is associate professor of biomedical informatics, Columbia University, New York, New York.
Dr. Chute is chair, Division of Biomedical Informatics, Department of Health Sciences Research, Mayo Clinic, Rochester, Minnesota.
Dr. Nguyen is assistant professor of infectious diseases, Department of Internal Medicine, University of California, Davis, Davis, California.
Dr. Sim is associate professor of general internal medicine, University of California, San Francisco, San Francisco, California.
Ms. Nahm is associate director, Biomedical Clinical Research Informatics Core, Duke Translational Medicine Institute, Durham, North Carolina.
Dr. Weiner is associate professor of medicine, University of Pennsylvania, Philadelphia, Pennsylvania.
Dr. Miller is professor, Center for Medical Informatics, Yale University School of Medicine, New Haven, Connecticut.
Dr. DiLaura is head, Section of Research Informatics, Department of Quantitative Health Sciences, Cleveland Clinic, Cleveland, Ohio.
Mr. Overcash is chief information officer and director of health sciences and research, Emory University, Atlanta, Georgia.
Dr. Lehmann is associate professor of health sciences informatics, Johns Hopkins University, Baltimore, Maryland.
Dr. Eichmann is associate professor of library and information science, University of Iowa, Iowa City, Iowa.
Dr. Athey is professor of psychiatry, University of Michigan Medical School, Ann Arbor, Michigan.
Dr. Scheuermann is professor of pathology and chief of biomedical informatics, University of Texas Southwestern Medical Center at Dallas, Dallas, Texas.
Dr. Anderson is acting assistant professor of biomedical and health informatics, University of Washington, Seattle, Washington.
Dr. Starren is director, Biomedical Informatics Research Center, Marshfield Clinic, Marshfield, Wisconsin.
Dr. Harris is research associate professor of biomedical informatics and biomedical engineering, Vanderbilt University, Nashville, Tennessee.
Dr. Smith is professor and dean, School of Health Information Sciences, University of Texas Health Science Center at Houston, Houston, Texas.
Mr. Barbour is a manager, Hospital Informatics Core, Rockefeller University, New York, New York.
Dr. Silverstein is associate professor of surgery and radiology, Computation Institute, University of Chicago, Chicago, Illinois.
Dr. Krusch is associate professor of medical informatics, University of Rochester School of Medicine and Dentistry, Rochester, New York.
Dr. Nagarajan is assistant professor of clinical pathology, Washington University School of Medicine, St. Louis, Missouri.
Dr. Becich is professor and chair, Department of Biomedical Informatics, University of Pittsburgh Medical School, Pittsburgh, Pennsylvania.
Editor’s Note: A commentary on this article appears on page 818.
Please see the end of this article for information about the authors.
Correspondence should be addressed to Dr. Bernstam, School of Health Information Sciences, The University of Texas Health Science Center at Houston, 7000 Fannin Street, Suite 600, Houston, TX 77030; e-mail: (Elmer.V.Bernstam@uth.tmc.edu).
Increasingly, researchers spend less time in their “wet labs” gathering data and more time on computation. As a consequence, more researchers find themselves working in teams to harness the new technologies…. Digital methodologies—not just digital technology—are the hallmark of tomorrow’s biomedicine. (The Biomedical Information Science and Technology Initiative, NIH, 1999)
Biomedical research increasingly depends on information technology (IT).1–4 Managing, communicating, and analyzing large quantities of data are critical research functions. Thus, many laboratories now host more computers than human beings. However, working in today’s biomedical research environment requires more than simply placing a computer on the researcher’s desktop or even digitizing all of the data.
Broadly speaking, biomedical research faces two related but distinct sets of computational challenges. The first relates to IT, including its selection, procurement, implementation, maintenance, and user support. The second concerns data, information, and knowledge rather than technology. Specifically, there is a growing recognition of the challenges that arise when biomedical information is digitized and manipulated by computers. This has led to the inherently interdisciplinary field of biomedical informatics that combines quantitative disciplines, such as computer science and statistics, with social sciences, such as communications and psychology, and application domains like biology and clinical medicine.5
Recognizing that biomedical informatics is critical to its overall goals, the National Institutes of Health (NIH) required an informatics component within each Clinical and Translational Science Award (CTSA).6 Still, “in many circles [biomedical] ‘informatics’ is coming to mean ‘anything one does with a computer.’”7 Both IT and informatics are critical to modern biomedical research. However, failure to appreciate the differences between them can create frustration for biomedical researchers as well as for IT and informatics professionals.8 More important, confusion regarding the proper roles of computationally oriented groups in biomedical research can lead to delays in productivity and even failure of projects that rely on the inappropriate group for critical tasks.
To address such confusion, we examine the distinction between biomedical informaticians, computer scientists, and IT professionals as well as the synergies that must be developed among these computationally oriented groups within academic health centers (AHCs). The issues we address have implications for students planning their careers (What constitutes a career in informatics?), researchers seeking collaborators and applying for grants (Who should be my collaborators?), principal investigators managing research programs (Who do I ask to do what?), and administrators and funding agencies (Where do I allocate scarce resources? What programs should I build/enhance?). Although we intend this article to be generally applicable, we write from the viewpoint of the CTSA program to clarify the role of biomedical informatics cores and to inform the transformation of clinical and translational research expected from the CTSA program.9
This consensus statement represents the combined effort of 24 CTSA grantee institutions (2006 and 2007 grantees) as well as the NIH. The writing committee (E.V.B., J.W.S., and M.J.B.) drafted a statement on behalf of the biomedical informatics steering committee. A single representative from each institution and the NIH collected and synthesized feedback on behalf of his or her organization. Although all coauthors agreed on the importance of the topic and the need for clarification, we recognize that no statement can address all relevant issues, represent all points of view, or satisfy all critics.
Three distinct and complementary computing groups collaborate with biomedical researchers10: IT, computer science, and biomedical informatics. We further distinguish operational IT from research IT support groups. Operational IT groups focus on supporting generic capabilities, such as desktop computers, networks, and office software. On the other hand, research IT supports the IT needs of biomedical researchers. These needs may include support of research-specific hardware (e.g., computer that controls a DNA sequencing machine) and software (e.g., for microarray data analysis). Thus, in contrast to operational IT professionals, research IT professionals may need to understand specific biomedical research issues.
Although there is overlap among them, we separately describe each of the three groups’ roles and each group’s relationship to the other groups. We use the example of a clinical data warehouse (CDW) to illustrate the contributions of each group. A CDW is a shared database that collects and integrates patient data from a variety of sources. Unlike electronic medical records, CDWs allow queries about groups (e.g., average age of patients with diabetes) rather than individuals (e.g., John Smith’s age) and are thus important clinical and translational research resources.11
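The difference between a group query and an individual lookup can be made concrete with a minimal sketch. The table names, column names, and patient rows below are hypothetical toy data, not drawn from any real CDW, and an in-memory SQLite database stands in for the warehouse.

```python
import sqlite3

# Hypothetical toy schema and rows; a real CDW integrates many source systems.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patients (patient_id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.execute("CREATE TABLE diagnoses (patient_id INTEGER, diagnosis TEXT)")
conn.executemany(
    "INSERT INTO patients VALUES (?, ?, ?)",
    [(1, "John Smith", 60), (2, "Jane Doe", 50), (3, "Ann Lee", 40)],
)
conn.executemany(
    "INSERT INTO diagnoses VALUES (?, ?)",
    [(1, "diabetes"), (2, "diabetes"), (3, "asthma")],
)


def average_age(connection, diagnosis):
    """Group-level query: mean age of all patients with a given diagnosis."""
    row = connection.execute(
        # The subquery avoids counting a patient twice if a diagnosis is repeated.
        "SELECT AVG(age) FROM patients "
        "WHERE patient_id IN (SELECT patient_id FROM diagnoses WHERE diagnosis = ?)",
        (diagnosis,),
    ).fetchone()
    return row[0]  # None when no patient matches


def age_of(connection, name):
    """Individual-level lookup, the kind of question a medical record answers."""
    row = connection.execute(
        "SELECT age FROM patients WHERE name = ?", (name,)
    ).fetchone()
    return row[0] if row else None


print(average_age(conn, "diabetes"))  # group query
print(age_of(conn, "John Smith"))     # individual lookup
```

In this sketch the group query never returns an identifiable record, only an aggregate, which is one reason CDWs lend themselves to research use.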
Operational IT support
Operational IT groups implement and maintain e-mail and database servers, networks, online storage, and backup systems; support personal computers; and ensure IT security and compliance with institutional policies. IT support professionals may have vocational (“on the job”) training, certification in specific technologies (e.g., Microsoft Certified Professional12), or a formal degree in computer science, management information systems, or another field. Some may consider IT and computer science to be part of the same continuum, with IT representing applied computer science. However, it is important to note that in contrast to academic (PhD-level) computer scientists, IT professionals are not required to have training in the conduct of science. Further, there is a fundamental distinction between IT (application of computing technology) and computer science (research focused on computing). In contrast to informaticians, IT professionals are not required to have training in core informatics areas, such as decision support, knowledge representation, or human-computer interaction.
We cannot overemphasize the importance of effective and efficient IT operations. E-mail is one important example. Investigators use e-mail to share ideas, working documents, and data sets. Similarly, identity management, networking, server management, and backup operations are fundamental to any modern complex industry and are essential to biomedical research.
A CDW generally resides on a centralized server, but users access the CDW with personal computers, perhaps via a Web interface. IT support professionals are responsible for selecting, purchasing, maintaining, and supporting these personal computers and the infrastructure on which a CDW is built. This infrastructure includes the server(s) on which the CDW runs, security, backup, and disaster recovery systems, and the networks connecting the CDW to its data sources, such as clinical, laboratory, and radiology systems.
Research IT support
Research IT and operational IT face different demands from different user communities. Some institutions have a central research IT group for common research needs (e.g., high-capacity storage with backup). However, compared with operational IT teams, research IT groups are typically “local” to departments or laboratories and support users who collect data via specialized equipment or analyze complex data sets via a variety of special-purpose software packages that change frequently depending on specific researcher needs or preferences.1 For example, researchers may write custom software to analyze microarray data or modify existing software to meet their needs. Thus, each workstation may be unique, and its configuration may change frequently.
If a problem arises, the IT professional cannot simply reinstall the system from generic backup images. As a result, research IT must be able to tolerate changes and disruptions that would cause havoc in a large operational IT group responsible for mission-critical applications. Further, increasing regulation of biomedical research has numerous implications for information management (e.g., requirements for HIPAA-compliant storage of protected health information). Thus, research IT professionals must be familiar with research-specific processes and regulations.13,14 In contrast, operational IT is often centralized within institutions and is accustomed to handling large-scale projects that serve many individuals.3
Researchers may be computationally and/or scientifically sophisticated but still require help with advanced functions or with unusual tasks. The hardware and software needs of research, especially research involving very large data sets or computations, are also far greater than those of administrative computing. As a result, research IT groups must allow users greater autonomy and must manage a more heterogeneous hardware/software environment. Different skills may be required for research IT, and therefore a division of a given AHC’s overall IT organization into “research IT” and “operational IT” may be warranted.
Research IT budgets should reflect the greater resource requirements per client compared with operational IT. Increasingly, researchers recognize that IT should be included on grant budgets because research IT is rarely fully supported by the AHC. There are multiple options for funding research IT including charging “user fees” of funded projects or “taxing” laboratories a flat fee. The most appropriate option depends on the institution, but it is important to recognize that research IT requires dedicated and highly skilled resources.
IT support is becoming even more important as research data migrate from personal computers to institutional servers. Operational IT groups are well equipped to provide user support for general-purpose office automation tools and to ensure smooth operation of data centers that house servers. In contrast, research IT groups can support specialized laboratory software, high-performance computing (e.g., Linux clusters), and workstations increasingly used by clinical and translational scientists.
As biomedical research becomes more data-intensive, traditional data storage and analysis approaches fail.1 For example, large-scale efforts within CTSA programs such as CDWs must accommodate terabytes to petabytes of data on thousands of subjects (1 petabyte = 1,000 terabytes = 10^15 bytes). General-purpose office automation tools, such as Microsoft Excel, were not designed to handle such large data sets. Instead, centralized computing resources ranging from servers, to networked “Grid” clusters, to shared-access supercomputers running specialized software, are needed to extract useful knowledge from such huge data sets.
Computer science
As computers become increasingly important in biomedicine, biomedical researchers are starting to collaborate with computer scientists. Like IT professionals, computer scientists concentrate on technology, including computing systems composed of hardware and software as well as the algorithms implemented in such systems. In contrast to both operational and research IT, academic (PhD-level) computer scientists are trained as researchers. They may work in academia or industry, but they are expected to generate new computer science knowledge. Some, but not all, computer science activities advance IT. For example, computer scientists develop algorithms to search or sort data more efficiently and design faster memory or storage architectures and more reliable computer software that is less prone to “crash.”
Though often motivated by specific applications, computer scientists typically develop general-purpose approaches to classes of problems (a characteristic shared with academic biomedical informaticians, as discussed below). For example, a computer scientist may design a memory architecture that works well for storage and retrieval of large data sets in a CDW. The computer science contribution is the development of a better memory architecture for large data sets; although the memory architecture is not a direct improvement of the CDW per se, it is nonetheless critical to its advancement.
Biomedical informatics research and service
Biomedical informaticians focus on the storage, retrieval, and optimum use of data, information, and knowledge for problem solving and decision making in biomedicine.15 To an informatician, computers are tools for manipulating information. Indeed, there are many other useful information tools, such as pen, paper, and reminder cards. There are significant advantages to manipulating digitized data, including the ability to display the same data in a variety of ways and to communicate with remote collaborators. From an informatics perspective, however, one should choose the optimal tool for the information task—often, but not always, this tool is computer based.
Similar to the distinction between computer science (an academic discipline that generates new knowledge) and IT (an applied or engineering discipline that uses computer science to solve real-world problems), there is a continuum from academic to applied informatics (Table 1). Like other researchers, academic informaticians and students pursuing PhD degrees in informatics are expected to ask scientific questions, obtain research funding, assess and identify the generalizability of results, and publish in the scientific literature. In contrast, applied informaticians employ or adapt existing tools. Applied informaticians may work in industry or in academia. They are especially indispensable to organizations wishing to implement large enterprise-wide applications, such as electronic health records.16
Academic and applied informaticians come from a wide variety of backgrounds, including computer science, biology, and/or clinical disciplines. Because biomedical informatics requires interdisciplinary expertise, most informaticians have graduate or postdoctoral training, increasingly in biomedical informatics itself. Informaticians should be computer savvy, but, unlike IT professionals, informaticians are not explicitly trained in specific hardware or software and, therefore, are not well suited to provide researchers with operational IT support. In contrast to computer scientists, informaticians are concerned with application domains, such as biology (bioinformatics), clinical care (clinical informatics), research processes (research informatics), or public health (public health informatics), although the new methods motivated by those domains may have applicability much more broadly—even outside biomedicine.
There are currently 20 biomedical informatics training programs funded by the National Library of Medicine (NLM, the NIH component traditionally involved in fundamental informatics research).17 In addition, there are non-NLM-funded programs and competent informaticians without formal training. The American Medical Informatics Association (AMIA) currently has more than 3,800 members.18 Recognizing the need to develop an informatics workforce rapidly, AMIA launched the “10 × 10” program that aims to train 10,000 people in applied informatics by 2010.19
The necessary and sufficient competencies for a trained biomedical informatician remain controversial.5 For example, should informaticians be able to write computer programs? Some argue that informaticians must have programming experience to effectively supervise software development. Others counter that the task of supervising programmers does not necessarily require programming experience and that precious training time should be spent on other topics. Similarly, the depth to which individual topics are covered differs between programs. Some emphasize cognitive or human factors; others emphasize technology or other quantitative disciplines. Most informatics training programs require some exposure to both quantitative sciences (e.g., computer science, decision science, and statistics) and application domains. In addition, informaticians are trained in core informatics methods, including concept and knowledge representation.
Returning to our example of a CDW, informaticians can help determine how to represent the information to be stored. For example, selecting and properly applying a standard terminology such as the Systematized Nomenclature of Medicine (SNOMED)20 can facilitate interoperability with other systems. If we represent data in two different systems using SNOMED codes, such as “D2-0007F (Pneumonia),” then we can issue a query for all patients with pneumonia the same way for both systems and meaningfully aggregate results. However, there are multiple alternatives, and choosing the best terminology is not always straightforward. Whereas an applied informatician can make the best choice among existing terminology systems, the research informatician has the skills to design new and better terminology systems. For example, research informaticians developed the structure and maintenance procedures for SNOMED. Applied informaticians know how to apply SNOMED to clinical data. In contrast, neither IT professionals nor computer scientists are trained to develop or apply terminologies to clinical and research data.
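The interoperability benefit of a shared terminology can be sketched in a few lines. This is an illustration only: the two source systems, their record layouts, the patient identifiers, and the placeholder codes "OTHER-1" and "OTHER-2" are all hypothetical; the only value taken from the text is the pneumonia code "D2-0007F".

```python
# Code quoted in the text above; everything else below is made-up toy data.
PNEUMONIA = "D2-0007F"

# Hypothetical system A: one diagnosis code per record, stored as dicts.
system_a = [
    {"mrn": "A1", "dx_code": "D2-0007F"},
    {"mrn": "A2", "dx_code": "OTHER-1"},  # placeholder, not a real code
]

# Hypothetical system B: a list of codes per patient, stored as tuples.
system_b = [
    ("B7", ["D2-0007F", "OTHER-2"]),
    ("B8", ["OTHER-2"]),                  # placeholder, not a real code
    ("B9", ["D2-0007F"]),
]


def patients_in_a(records, code):
    """Patients in system A whose diagnosis code matches."""
    return {r["mrn"] for r in records if r["dx_code"] == code}


def patients_in_b(records, code):
    """Patients in system B with the code anywhere in their code list."""
    return {patient_id for patient_id, codes in records if code in codes}


def aggregate(code):
    """Issue the same coded query to both systems and combine the counts."""
    count_a = len(patients_in_a(system_a, code))
    count_b = len(patients_in_b(system_b, code))
    return {"system_a": count_a, "system_b": count_b, "total": count_a + count_b}


print(aggregate(PNEUMONIA))
```

The storage layouts differ, so each system needs its own small adapter, but the query term itself is identical. Without a shared code, each system would also need its own mapping from local labels to the concept being counted.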
Increasing use of informatics in biomedical research
We are now able to more robustly represent complex biomedical concepts, such as eligibility requirements for clinical trials and clinical syndromes (e.g., congestive heart failure).21 Thus, informatics is beginning to deliver on its potential, and informaticians are increasingly useful to biomedical researchers. Examples of informatics successes important to biomedical researchers include the MEDLINE database of biomedical literature created and maintained by the NLM but available via multiple interfaces (e.g., Ovid, PubMed), large biological databases such as GenBank, which contains an annotated collection of all publicly available genetic sequences,22 as well as tools to access biological databases (e.g., BLAST),23 and contributions to the Human Genome Project.4 Similarly, clinicians have benefited from MEDLINE and from a variety of informatics innovations such as electronic health records and order-entry systems.
Biomedical researchers may look to an applied (bio)informatician, but probably not an IT professional, to help them access genetic databases using existing tools, such as BLAST. However, much research remains to be done to realize the full potential of informatics in clinical and translational research. Therefore, in addition to supporting biomedical researchers, academic informaticians should collaborate with traditional biomedical researchers and conduct independent research focused on informatics. Research challenges in informatics include formulating models for acquisition, representation, processing, display, and transmission of biomedical information (e.g., into a CDW), developing innovative systems based on these models that deliver information to users, implementing such systems within established organizations, and studying their effects on research and health care.24
Relationships among IT, computer science, and biomedical informatics
As the CDW example illustrates, multiple complementary computational disciplines are necessary for clinical and translational research.
Table 1 contrasts the focus and scope of IT, computer science, and biomedical informatics. Meaningful but relatively distinct scientific research can be conducted in computer science and in biomedical informatics, and both can be useful to biomedical researchers. For example, management of very large databases (≫petabyte size) is currently very challenging. Database methods and high-performance computing (“supercomputing”) research are well-established areas of computer science. Therefore, “IT research” (i.e., research to advance IT, not support for biomedical research) often falls within the domains of computer science, management information systems, and operations research, not informatics. Research into knowledge representation for biomedical concepts, however, is clearly within the scope of biomedical informatics.
Implications for AHCs
Because both IT support and informatics are required to conduct biomedical research, both should be reflected in the administrative or academic structure of AHCs. Specifically, a chief information officer (CIO) should lead the IT organization with appropriate emphasis on research and operational IT, preferably as separate subunits.
CIOs at non-AHCs have the ability to focus solely on the operational and clinical mission of the organization. Success in this setting can be measured in server and network up-time and in the responsiveness of the IT infrastructure. The additional priority of AHCs to advance the science of medicine and support education25 requires leadership that is knowledgeable about the special IT requirements of the biomedical research community and that is appropriately incentivized to be responsive to research needs. The CIO should have an independently negotiated budget with dedicated staff and should advise senior administration on the strategic use of information systems.
Close cooperation between operational IT, research IT, and biomedical informatics is critical. Neither IT nor informatics alone can support the increasingly complex computing needs of biomedical research. Without IT, there is no infrastructure. Without informaticians, poorly specified or even harmful computer systems can be installed.26 These groups must collaborate closely to avoid expensive investments in redundant or incompatible systems. Although recent surveys did not differentiate between informatics and IT, they showed that AHCs were not investing sufficient resources into IT, especially IT support for research activities.1,3,8,27 Consequently, requests for IT support (e.g., server setup and configuration in an IT-controlled data center) are often directed to informaticians who are neither funded nor (necessarily) qualified to satisfy these requests. Such requests rarely go to computer science faculty in general university settings, perhaps because, unlike biomedical informaticians, they reside outside hospitals or medical schools. For example, computer science departments rarely operate university computing centers or network infrastructures.
Informatics units with a designated leader are required to provide a professional and/or academic home for informaticians, just as distinct units are required for other investigators and practitioners within AHCs (e.g., statisticians, oncologists, pathologists). Multiple models have been successful, ranging from sections within a clinical department (e.g., Stanford University), to departments within a medical school (e.g., University of Pittsburgh, Columbia University, Vanderbilt, Oregon Health & Science University), to institutes or schools (e.g., University of Texas Health Science Center at Houston). Regardless of the informatics unit type, the leader should be a credible role model who understands technology well enough to provide strategic leadership and vision for the institution. The leader should be empowered and held accountable by the institution to represent the unique needs and abilities of informatics within the larger organization.
Faculty informaticians must be supported with respect to promotion and tenure. They should be encouraged to lead independent research programs and to support traditional biomedical research. Informatics has its own culture that reflects connections to multiple fields including biomedicine as well as computer and information sciences. Grants and publications are recognized metrics of scientific success, but the specifics vary across disciplines. For example, conference proceedings are relatively undervalued in biomedicine, but they may be very competitive in computer science or informatics (e.g., <10% acceptance rate, comparable with competitive clinical journals). The informatician with publications in competitive conference proceedings should not be penalized when it comes time to review his or her scholarly record for promotion and tenure.
Successful informatics research programs interact with other academic disciplines, such as computer and/or information sciences. Indeed, it is difficult to find an NLM-funded informatics program without access to other appropriate academic units. A distinct informatics unit with a strong leader can facilitate such collaborative interactions, even across schools within a university, and occasionally among multiple universities. The CTSA program is an example of such collaboration. Informatics component leaders interact with computer scientists, biostatisticians, biomedical researchers, and others as they strive to transform clinical and translational research within their institutions and across the nation.
Informaticians who are not in academic faculty positions, either because they play an operational role or work in a nonacademic institution, must also be supported. As in any profession, there should be commonly accepted competencies, a society that supports both academics and professionals (such as AMIA), and a means for professional growth and advancement.28
In addition, informatics units educate the next generation of informaticians and teach informatics skills to biomedical researchers and clinicians. The CTSA informatics national steering committee formed a project group on education to address the informatics training needs of researchers. Similarly, some professional schools and societies encourage or even require their students or members to demonstrate informatics competencies.29–32 For example, the Association of American Medical Colleges Medical Student Objectives Project lists “the ability to retrieve (from electronic databases and other resources), manage, and utilize biomedical information for solving problems and making decisions that are relevant to the care of individuals and populations” as a core competency.33 Similarly, the American Association of Colleges of Nursing requires doctor of nursing practice graduates to demonstrate informatics competencies, such as knowledge of standards relevant to health information systems.34
We emphasize that IT and informatics are distinct, but both are necessary for a robust clinical and translational research effort, and they must coexist within AHCs.8 Biomedical researchers have domain-specific computational needs (e.g., create and maintain a cardiology outcomes database). Thus, it may be practical for a large research unit to have a formal or informal subunit with domain-specific informatics expertise (e.g., experience managing cardiology data). This unit would interact with domain-independent biomedical informaticians who would focus on core informatics methods, such as decision analysis or machine learning. Regardless of the model adopted, a single point of contact for computing needs can help ensure that biomedical researchers are aware of available computational resources.8
Biomedical informatics is increasingly visible within the larger research community. AHCs should develop and maintain IT units, headed by a CIO reporting to central administration, as well as distinct biomedical informatics units with capable leaders. In addition to collaborative support for traditional biomedical research efforts, informatics units should develop faculty with independent research agendas that address the informatics challenges of modern biomedical research. Within CTSAs, informatics components complement, but do not replace, IT organizations.
The authors are indebted to Drs. Curtis Cole (Weill Medical College of Cornell University) and Milton Corn (National Library of Medicine) for their support and guidance.
Supported by the CTSA consortium including National Center for Research Resources grants 1UL1RR024148 (UT Houston), 1UL1RR024146 (UC Davis), 1UL1RR024975 (Vanderbilt), 1UL1RR024128 (Duke), 1UL1RR024986 (University of Michigan), 1UL1RR024143 (Rockefeller), 1UL1RR024989 (Case Western Reserve University/Cleveland Clinic), 1UL1RR025014 (University of Washington), 1UL1RR025011 (University of Wisconsin-Madison), 1UL1RR024134 (University of Pennsylvania), 1UL1RR025005 (Johns Hopkins), 1UL1RR024156 (Columbia), 1UL1RR024160 (University of Rochester), 1UL1RR024979 (University of Iowa), 1UL1RR024996 (Cornell), 1UL1RR024131 (UC-San Francisco), 1UL1RR024140 (Oregon Health & Science University), 1UL1RR024982 (UT-Southwestern), 1UL1RR024992 (Washington University), 1UL1RR024153 (University of Pittsburgh), 1UL1RR024150 (Mayo), 1UL1RR024139 (Yale), 1UL1RR024999 (University of Chicago), and 1UL1RR025008 (Emory).