The increasing digitization of modern life is likely to have a profound impact on society in general and health care in particular, in ways that are not intuitively understood. Documents, mail, photos, music, video, and maps are all artifacts that now exist predominantly as digital objects, which, together with the Internet traffic that moves them around, create very large sets of data. In this context, “very large” is an understatement. It was estimated that in 2016 we would collectively transmit 1.3 zettabytes of digital information (about 250 billion DVDs). At this rate, we will soon run out of prefixes in the metric system to describe the scale of the data that are generated.1 In the health system, the exponential growth in patient-related information means that contemporary health care will in part be characterized by a move from producing massive amounts of patient data to analyzing and interpreting them to make better inferences and predictions concerning clinical management and outcomes. But the complexity and scale of modern medical data analysis will soon be beyond the limits of human cognition and will require the use of cognitive augmentation in the form of collaboration with artificial intelligence (AI) systems to enhance our ability to make sense of the patient condition.2,3
AI-based technologies have demonstrated successes in many domains of clinical practice, including decision-support systems, diagnosis and prediction, image recognition, and natural language processing.4 There is little reason to doubt that AI-based technology is already transforming many aspects of clinical practice. However, while software developers are often guided by the mantra to move fast and break things, clinicians begin with first, do no harm. As AI makes increasingly impressive contributions in a variety of clinical domains, clinicians will need a deeper understanding of how these systems are designed, implemented, and evaluated if they wish to stay in control of their own professional futures.3 If not, the agenda for health care will be set by the entrepreneurs, investors, and software developers who are building the AI-based systems that will influence future clinical decision making. No matter how benevolent their intentions, they do not have the background and expertise to integrate clinical care into software and will therefore need guidance from those who work at the bedside.5 The regulation of AI in the clinical context must follow democratic principles in the sense that no single, for-profit company should have sole responsibility for deciding how AI-based systems are implemented in patient care, an outcome that becomes more likely as large corporations in the United States and China increasingly dominate the clinical AI landscape. To this end, clinicians, residents, students, and patients should all have some say about the kind of health system they want in the future, especially if they hope to remain relevant and effective in this emerging health context. However, for all stakeholders to participate in the discussion around the development and ethical implementation of clinical AI, they will first need to understand the language being spoken so that they can critically evaluate the claims being made by AI researchers.
The aim of this article is therefore to provide a nontechnical introduction to AI and machine learning (ML) in the context of health care, the challenges that arise, and the resulting implications for clinicians.
AI in Context
Intelligence is the mental ability of an agent to reason, plan, solve problems, think abstractly, comprehend complex ideas, and learn from experience.6 This definition says nothing of the agent’s capacity for self-awareness, consciousness, emotional response, or moral reasoning, nor does it require that the intelligence replicate human reasoning. In fact, there is nothing essentially human about intelligence and nothing that prevents it from being instantiated in computer code. This matters because human beings tend to—erroneously—associate AI with other human characteristics like empathy, emotion, or morality. There is also no good reason for intelligent machines to imitate biological processes to produce useful results, in the same way that airplane flight is not modeled on the flapping of a bird’s wings. Computer scientists use algorithms to specify the steps necessary to solve a problem by beginning from an initial known state and known input, and then following a series of instructions that ends by producing an output. This kind of algorithmic computation can perform useful functions regardless of how accurately it describes the human cognitive function of problem solving. Therefore, it is important to understand that when AI researchers talk about “learning,” they really mean the process of maximizing the reward function of an algorithm and not the more nebulous concept of learning as it relates to human beings. When we anthropomorphize algorithms, we risk expecting them to display other human characteristics when, in fact, they are being optimized for very specific, narrowly defined cognitive functions.
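The idea of an algorithm in this narrow sense can be made concrete with a trivial sketch: a known input, a fixed series of instructions, and an output. The thresholds below are invented for illustration and do not correspond to any validated clinical score.

```python
def simple_risk_score(age: int, systolic_bp: int) -> str:
    """A deterministic algorithm: known inputs, fixed steps, one output.

    The thresholds are invented for illustration only; this is not a
    validated clinical score.
    """
    score = 0
    if age >= 65:          # step 1: check age
        score += 1
    if systolic_bp < 90:   # step 2: check blood pressure
        score += 2
    return "high" if score >= 2 else "low"  # step 3: map score to a label
```

The program follows its steps identically every time; nothing about it requires, or resembles, human reasoning.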
AI is a diverse research field that includes a wide variety of subdomains, including expert systems, knowledge representation, robotics, natural language processing, intelligent agents, computer vision, navigation, predictive analytics, and planning.7 These are all very different and specialized areas of research but are often referred to under the umbrella term of AI. When “AI” is used to describe such a wide range of technologies and research fields, it risks becoming meaningless. In addition, different authors use a variety of terms interchangeably when describing AI, including cognitive computing, machine intelligence, and neural networks, which only adds to the confusion. Clinicians must be aware that research in AI is influenced by a variety of intellectual, commercial, and philosophical issues that are implicated in the development of AI-based systems, and that there are many different agendas being served, all of which influence the words used to describe the field.8 The economic impact of AI is significant because most of the value of the emerging field of medical AI is still to be realized, creating the conditions in which claims of progress are often overstated to drive interest and secure further research funding. The ability to make sense of AI research in the context of clinical practice will be essential if the implementation of medical AI is to be driven by those who are closest to the patient, and who will ensure that it is the patient’s interests that are best served in this emerging field of clinical research. Clinicians will need to be informed not only of the research that is happening in this area but also of who and what is influencing the claims being made.
In the early days of AI research, software developers used logical programming techniques to explicitly define all possible pathways through a solution space to achieve an objective. This made early AI programs brittle and unable to adapt to even small changes in carefully controlled scenarios because programmers could not define all of the possible interactions in advance.7 And because clinical reasoning is a cognitive process of probabilistic decision making under conditions of uncertainty, it has generally been considered to be nonreplicable by algorithms because they could not resolve the incomplete and uncertain information that typically describes real-world clinical problems.9 However, modern AI approaches have demonstrated higher levels of success when modeling uncertain outcomes. Many of these advances are tied to progress in a subdomain of AI research known as machine learning (ML), which describes the ability of an algorithm to “learn” (i.e., to maximize the reward function) by finding patterns in large datasets. In other words, the “answers” produced by ML algorithms are inferences made from statistical analysis of very large datasets, expressed as the likelihood of a relationship between variables, and as such they are less like the “correct” answer and more like a good guess at what the answer might be.7
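The sense in which an ML algorithm “learns” can be illustrated with a minimal sketch: the program below repeatedly adjusts a single parameter to reduce its prediction error on a toy dataset. The data, learning rate, and number of steps are invented for illustration.

```python
def learn_weight(xs, ys, steps=200, lr=0.01):
    """'Learning' here is just optimization: repeatedly adjust the
    parameter w to reduce the squared error between the predictions
    w * x and the observed outcomes y."""
    w = 0.0
    for _ in range(steps):
        # gradient of the mean squared error with respect to w
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= lr * grad  # step downhill to reduce the error
    return w

# Toy data generated by the rule y = 3x; the algorithm "learns" w close to 3
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 6.0, 9.0, 12.0]
w = learn_weight(xs, ys)
```

The “learning” is nothing more than the iterative optimization of a numerical objective; the algorithm has no understanding of what the data mean.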
The success of ML is largely a result of 4 characteristics of digital and technological advances made in the past decade.1 The first is the move to cloud-based computing infrastructure, which has seen almost all enterprise and commercial data migrating from personal devices to commercial data centers. Together with the ubiquity of smartphones and the rise of networked devices—including wearable and, soon, ingestible computers—there has been an explosion of user-generated data that exceeds in scale anything that has been seen before.10,11 The second characteristic is the emergence of cheap, powerful computation in the form of custom-built processors that are explicitly designed to run ML algorithms. This means that computers are able to process massive datasets at speeds that were inconceivable only a few years ago, accelerating the pace at which these systems can process information. The third characteristic driving the progress of modern AI research is the development of new classes of algorithms that are better able to adapt to changing conditions and are thus less likely to break when processing the unstructured data found in real-world conditions.7 The final characteristic driving the success of ML is the availability of open-source software that can process data, as well as the relative openness with which researchers are sharing their work. As companies release their software under open-source licenses, they create opportunities for anyone to run the code for AI-based systems, provided they have access to the necessary data. In a certain sense, this has democratized the process of developing AI-based tools and services, allowing small teams to compete with billion-dollar companies. This may partly explain the emergence of so many health-related AI-based start-ups in the clinical space.
This combination of inexpensive hardware running more advanced, open-source algorithms that process data more quickly than ever before is one of the fundamental drivers of modern AI research and is what makes this generation of AI-based systems qualitatively different from what has come before.
Challenges With ML
In the 3-step process of selecting a dataset, creating an appropriate predictive model, and evaluating and refining the model, there is nothing more critical than the data.12 However, there are significant challenges that arise with respect to data, all of which should influence any decision related to the implementation of AI and ML in health care. While developments in ML have led to improvements in the ability of AI-based systems to augment, and in many cases surpass, human decision making in a variety of fields, several important issues emerge.
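A minimal sketch of these 3 steps, using a synthetic dataset and a deliberately simple threshold model rather than any real clinical algorithm, might look as follows:

```python
import random

random.seed(0)

# Step 1: select a dataset (synthetic here: one feature, binary outcome)
data = [(x, 1 if x > 5 else 0) for x in (random.uniform(0, 10) for _ in range(100))]
random.shuffle(data)
train, test = data[:80], data[80:]

def accuracy(threshold, rows):
    """Fraction of rows where the threshold rule matches the label."""
    return sum((x > threshold) == bool(y) for x, y in rows) / len(rows)

# Step 2: create a predictive model by choosing the threshold
# that best separates the training data
best_threshold = max((t / 10 for t in range(101)), key=lambda t: accuracy(t, train))

# Step 3: evaluate the model on data it has never seen, then refine as needed
test_acc = accuracy(best_threshold, test)
```

Real systems replace the toy model with far more complex algorithms, but the dependence on the quality and representativeness of the data in step 1 remains the same.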
Data volume and availability
The amount of data generated from and by patients is increasing exponentially and is becoming more difficult to manage in the time frames necessary to be clinically useful. Patient data are being created at a rate that exceeds our ability to gather them, let alone understand them.11 However, what may be more problematic is the fact that, no matter how many medical data are generated, almost all of them are unavailable for analysis. Medical imaging systems are proprietary and prevent interoperability with other systems, making the information contained within them relatively inert and unable to be used for ML.
Class imbalance
Class imbalance happens when the training dataset includes many more examples of some categories than others. For example, you may want an algorithm to predict the likelihood of a rare disease in a defined population. If you train the algorithm on a dataset that includes almost no examples of the disease (because it is very rare), it may achieve a near-perfect score on the testing dataset simply by predicting zero instances of the condition. In such cases, you might place a great deal of confidence in the predictive ability of the algorithm even though its apparent accuracy is really an artifact of the imbalanced data.7 Examples like this demonstrate the need for clinicians to understand how ML works so that they can better interpret the findings of clinical research that uses AI for data analysis.
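The problem can be demonstrated in a few lines; the prevalence figure below is invented for illustration:

```python
# A test set with 1% disease prevalence: 990 negatives, 10 positives
labels = [0] * 990 + [1] * 10

# A useless "model" that predicts no disease for every patient
predictions = [0 for _ in labels]

accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)

# Sensitivity: the fraction of true cases the model actually detects
sensitivity = sum(p == 1 for p, y in zip(predictions, labels) if y == 1) / sum(labels)
```

The model is clinically useless (its sensitivity is zero, so it detects no true cases), yet its 99% accuracy would look impressive if reported uncritically.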
Solving the wrong problems
Because they may not be explicitly told what problem to solve, ML algorithms may arrive at solutions that were unforeseen by the researchers. For example, a neural network that was designed to diagnose skin cancer ended up detecting the presence of rulers. Because many of the photos of malignant tumors in the training data included rulers used to measure the size of the tumor, the algorithm simply learned that the deciding characteristic of malignancy was the presence of a ruler.13 Recently, a small group of students developed an algorithm that ran on publicly available hardware and outperformed researchers at Google on a standardized benchmarking task. One of the reasons they did so well was that they verified that images in their training dataset were appropriately cropped to ensure that the algorithm was analyzing the correct part of the image.14 Again, it is important for clinicians to understand the principles of ML so that they can provide informed input into discussions related to the influence of AI-based systems in clinical practice.
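A toy simulation of this kind of shortcut (all values invented) shows how a spurious feature that perfectly predicts the label in the training data is preferred by a naive learner, and how the resulting model fails as soon as that feature no longer tracks the outcome:

```python
# Each record: (lesion_irregularity, ruler_present, malignant)
# In this invented training set, a ruler appears in every malignant image,
# so "ruler_present" predicts the label perfectly.
train = [
    (0.9, 1, 1), (0.8, 1, 1), (0.7, 1, 1),
    (0.3, 0, 0), (0.2, 0, 0), (0.6, 0, 0),
]

def feature_accuracy(index, rows):
    """Accuracy of predicting 'malignant' from a single thresholded feature."""
    return sum((row[index] > 0.5) == bool(row[2]) for row in rows) / len(rows)

# A naive learner picks whichever feature best fits the training data:
# the ruler (feature 1) fits perfectly, lesion irregularity (feature 0) does not
best_feature = max([0, 1], key=lambda i: feature_accuracy(i, train))

# In deployment, rulers are absent, so the learned shortcut fails
clinic = [(0.9, 0, 1), (0.2, 0, 0)]
clinic_acc = feature_accuracy(best_feature, clinic)
```

The algorithm did exactly what it was asked to do: it found the pattern that best explained the training data, which happened to be the wrong pattern.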
Correlation and causation
ML focuses on the strength of the correlation between variables rather than the direction of causality. In the clinical context, this means that ML algorithms are able to determine with high accuracy the relationship between a set of symptoms and a corresponding condition. For example, an algorithm can correlate the presence of night sweats, hemoptysis, and sudden weight loss with the condition known as tuberculosis (TB), but it cannot say that TB is the cause of those symptoms. Without being able to determine causation, there are hard limits to what can be achieved in health care using only ML.15
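The distinction can be illustrated with a small simulation (the variables and numbers are invented for illustration): two findings driven by a hidden common cause are strongly correlated even though neither causes the other.

```python
import random

random.seed(1)

# A hidden confounder (e.g., an underlying disease process) drives both
# findings; neither finding causes the other, yet they are strongly correlated.
confounder = [random.random() for _ in range(1000)]
night_sweats = [c + 0.1 * random.random() for c in confounder]
weight_loss = [c + 0.1 * random.random() for c in confounder]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    vx = sum((x - mx) ** 2 for x in xs) / n
    vy = sum((y - my) ** 2 for y in ys) / n
    return cov / (vx * vy) ** 0.5

r = pearson(night_sweats, weight_loss)
# r is close to 1, yet intervening on one finding would not change the other
```

A correlation-based model sees only the strong association; discovering the confounder requires causal knowledge that lies outside the data.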
Algorithmic bias
Because the data used to train ML algorithms are generated by human beings and their online interactions, algorithms cannot help but embed the same biases in their outputs. In a real sense, these algorithms reflect not only what people think about the world but also what they think about each other. Bias can be introduced into the system through the selection of training datasets, algorithm design and integration, and the interaction of users.7 Algorithmic bias is a complex problem that has led to a call for frameworks to test for bias in AI-based systems. These frameworks will not only help to reduce algorithmic bias but may also highlight the entrenched human biases that lead to social inequality and disparity in access to health care.4,5,16
The black box problem
AI-based technology is becoming more powerful at the expense of transparency and human understanding. The combination of massive datasets and the complexity of modern ML algorithms makes it impossible for a human being to validate the output.11 This “black box” nature of algorithmic decision making is justifiably raised as a concern in the high-stakes environment of clinical practice. This is especially troubling given that the current legal status of AI algorithms defines them as decision-support tools, meaning that clinicians are morally and legally liable for poor patient outcomes, even if those decisions were influenced by AI-based systems.4 In response, the National Health Service in the United Kingdom has recommended that companies developing AI in health care should be accountable for system failures, and others have called for the development of explainable AI systems that use plain language to link their decisions directly to the data used in the analysis.17
Implications for Clinicians
In June 2018, the American Medical Association published a set of policy guidelines on augmented intelligence, highlighting the fact that clinicians’ perspectives should inform the development of health care AI.18 Clinicians, students, residents, and attending physicians should aim to be involved in developing guidelines for how AI-based systems are designed, tested, implemented, and evaluated in the clinical context.12 The recent advances in medical AI are largely the result of advances in the subdomain of ML, and as such, they are dependent on the quality of the data being used to train and validate the algorithms. Therefore, an important role for all stakeholders in an era of algorithmic clinical decision making is to ensure that health-related patient data are comprehensive, accurate (valid and reliable), diverse, and well structured. When it comes to biased conclusions, it is not the size of the dataset that matters, nor what particular algorithm is used, but how diverse and inclusive the dataset used for training is. This forces us to ask questions about where the data come from, what inferences are drawn from them, and how relevant those inferences are to the present situation. Much as we ask if clinical findings are generalizable across contexts, so we must ask if algorithmic decisions are generalizable outside of the datasets on which they were trained.
Clinicians, and all other staff involved in patient care, will need to ensure that diverse patients’ voices and perspectives are represented in ML training data, while at the same time advocating for the protection of patients’ rights and privacy with respect to how their medical data are collected and used.4 The recent implementation of the European Union General Data Protection Regulation (GDPR; 2016/679) on data protection and privacy for all individuals within the European Union and the European Economic Area may have a significant impact on how private companies and governments are able to use medical data. The implementation of the GDPR will require those responsible for developing ML algorithms and AI-based systems to work closely with both patients (who will have more control over how their personal data are used) and clinicians (who are responsible for managing much of the interpretation of those data). In addition, medical education in general will need to be cognizant of the fact that AI-based systems are already being used in practice and that curricula will therefore need to adapt in response. Medical students, residents, and attending physicians will need to become familiar with ML, both as part of the undergraduate curriculum and in continuing education programs.
Clinicians will also need to engage with software developers to ensure that systems for capturing patient–clinician interactions enable the collection of data that are accurate and comprehensive.19 There is evidence that poorly designed systems create opportunities for new types of errors to be introduced into clinical decision making, and this is even more true for AI-based systems.4 A clinical decision-support system designed with a high tolerance for risk may privilege algorithm performance over patient safety.20 Thus, clinical and research regulation remains an essential component of AI-based system design and an area of engagement for which clinicians and clinical researchers are well prepared.
It may soon be impossible for human clinicians to compete with the computation and reasoning ability of smart machines, but this is beside the point. No one is worried that pocket calculators are “smarter” than humans when it comes to calculation, or that computer hard drives “remember” more than they do. It is useful to note that as effective as ML algorithms are at prediction and classification, they cannot parse a paragraph from a child’s picture book and understand what it means. Just as clinicians use X-ray and MRI scanners to augment their poor ability to see through objects, so should they use smart algorithms to enhance their intelligence and reduce the impact that cognitive biases exert on their reasoning. Being better than AI is not the point; it is sufficient that human beings are different and that we use AI to become better at being different. The challenge that clinicians are facing is to bring together computers and people in ways that enhance human well-being, augment human ability, and expand human capacity. For this transformation to be driven by forces from within the health professions, clinicians need to be engaged and contribute to the emergence of a new discipline of AI in health care.
1. Brynjolfsson E, McAfee A. The Second Machine Age: Work, Progress, and Prosperity in a Time of Brilliant Technologies. New York, NY: WW Norton & Company; 2016.
2. Wartman SA, Combs CD. Medical education must move from the Information Age to the age of artificial intelligence. Acad Med. 2018;93:1107–1109.
3. Obermeyer Z, Lee TH. Lost in thought—The limits of the human mind and the future of medicine. N Engl J Med. 2017;377:1209–1211.
4. Harwich E, Laycock K. Thinking on Its Own: AI in the NHS. London, UK: Reform; 2018. https://reform.uk/research/thinking-its-own-ai-nhs. Accessed June 3, 2019.
5. Char DS, Shah NH, Magnus D. Implementing machine learning in health care—Addressing ethical challenges. N Engl J Med. 2018;378:981–983.
6. Ritchie S. Intelligence: All That Matters. London, UK: Hodder & Stoughton; 2016.
7. Frankish K, Ramsey WM. The Cambridge Handbook of Artificial Intelligence. Cambridge, UK: Cambridge University Press; 2017.
8. Jordan M. Artificial intelligence: The revolution hasn’t happened yet. Medium. April 18, 2018. https://medium.com/@mijordan3/artificial-intelligence-the-revolution-hasnt-happened-yet-5e1d5812e1e7. Accessed April 28, 2019.
9. Kaplan RM, Frosch DL. Decision making in medicine and health care. Annu Rev Clin Psychol. 2005;1:525–556.
10. Susskind R, Susskind D. The Future of the Professions: How Technology Will Transform the Work of Human Experts. Oxford, UK: Oxford University Press; 2015.
11. Topol E. The Patient Will See You Now: The Future of Medicine Is in Your Hands. New York, NY: Basic Books; 2015.
12. Verghese A, Shah NH, Harrington RA. What this computer needs is a physician: Humanism and artificial intelligence. JAMA. 2018;319:19–20.
13. Patel NV. Why doctors aren’t afraid of better, more efficient AI diagnosing cancer. Daily Beast. December 11, 2017. https://amp.thedailybeast.com/why-doctors-arent-afraid-of-better-more-efficient-ai-diagnosing-cancer. Accessed April 28, 2019.
14. Howard J. Now anyone can train Imagenet in 18 minutes. http://www.fast.ai/2018/08/10/fastai-diu-imagenet. Published August 10, 2018. Accessed April 28, 2019.
15. Pearl J, Mackenzie D. The Book of Why: The New Science of Cause and Effect. New York, NY: Basic Books; 2018.
16. Mittelstadt BD, Allo P, Taddeo M, Wachter S, Floridi L. The ethics of algorithms: Mapping the debate. Big Data Soc. 2016;3:1–21. doi:10.1177/2053951716679679
17. Oakden-Rayner L. Explain yourself, machine. Producing simple text descriptions for AI interpretability. https://lukeoakdenrayner.wordpress.com/2018/06/05/explain-yourself-machine-producing-simple-text-descriptions-for-ai-interpretability. Published June 5, 2018. Accessed April 28, 2019.
18. American Medical Association. AMA passes first policy recommendations on augmented intelligence. https://www.ama-assn.org/ama-passes-first-policy-recommendations-augmented-intelligence. Published June 14, 2018. Accessed April 28, 2019.
19. Wachter RM, Howell MD. Resolving the productivity paradox of health information technology: A time for optimism. JAMA. 2018;320:25–26. doi:10.1001/jama.2018.5605
20. Oakden-Rayner L. Medical AI safety: We have a problem. https://lukeoakdenrayner.wordpress.com/2018/07/11/medical-ai-safety-we-have-a-problem. Published July 11, 2018. Accessed April 28, 2019.