
Editorial

Decoding ChatGPT’s ‘impact’ on the future of healthcare

Pearce, Hammond; Roop, Partha

Cancer Research, Statistics, and Treatment 6(1):91–93, Jan–Mar 2023. | DOI: 10.4103/crst.crst_84_23

Neural networks with many computational layers, known as deep learning models, have been used to solve real-world problems.[1] Composed of networks of artificial neurons, these models are trained on existing datasets to learn patterns in applications ranging from image processing and speech recognition to the human-like navigation of autonomous cars.[2] One area that has seen immense progress is natural language processing, which uses a specific class of deep learning model known as the transformer,[3] an architecture that scales well with respect to both model size and training data. These language models have outperformed architectures such as recurrent neural networks,[1] which were traditionally used for natural language processing tasks.[4] More recently, language models have formed the backbone of generative artificial intelligence (AI) tools, which are trained over huge, often publicly available, datasets to generate human-like responses on diverse topics. Two prominent examples are ChatGPT, which generates human-like responses to text prompts, and Stable Diffusion, which generates images from text prompts.
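To make the transformer concrete, the following is a minimal sketch of scaled dot-product self-attention, the core operation of the transformer architecture;[3] the dimensions and random inputs are purely illustrative, and real language models stack many such layers with learned weights and multiple attention heads.

```python
# Minimal sketch of scaled dot-product self-attention, the core
# operation of the transformer [3]. Dimensions are illustrative;
# production models use learned weights and many stacked layers.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))  # numerically stable
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """X: (seq_len, d_model) token embeddings; Wq/Wk/Wv: projections."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])  # pairwise token affinities
    weights = softmax(scores, axis=-1)       # each row sums to 1
    return weights @ V                       # weighted mixture of values

rng = np.random.default_rng(0)
seq_len, d_model = 4, 8
X = rng.normal(size=(seq_len, d_model))
Wq, Wk, Wv = (rng.normal(size=(d_model, d_model)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)  # (4, 8)
```

Each output token is thus a weighted mixture of all input tokens, which is what allows the architecture to scale so effectively across long contexts and large training corpora.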

ChatGPT is a large language model developed by OpenAI, designed to understand natural language and communicate with humans via a text-based conversational interface.[5] Derived from OpenAI's GPT-3.5 model, it was trained on a tremendous corpus of textual data extracted from the Internet, including books, articles, websites, and software repositories, and then refined using a technique known as reinforcement learning from human feedback, whereby human supervisors encouraged the model to follow instructions and be more conversational.

Presumably because of the broad training corpus and the billions of parameters making up the underlying GPT models, ChatGPT has demonstrated significant capabilities across a diverse range of tasks. Thanks to the strength of its prose, it can pass tests and examination questions from the law, medical, and programming domains,[6–8] as well as in creative writing and literature. Applications may even be built atop it, for instance by using its outputs to control robots.[9,10]

These capabilities have made ChatGPT an extremely popular tool: since its release in November 2022, ChatGPT is estimated to have reached more than 100 million unique users,[11] making it possibly the fastest-growing consumer application of the last 20 years. Such statistics cannot be ignored; the rise of generative AI such as ChatGPT has been meteoric.

This naturally poses the question: what does this mean for professionals, such as those who work in the healthcare sector? Parikh et al.[12] explored this question by surveying 210 professionals (157, or 74.8%, from healthcare), circulating a short questionnaire of nine multiple-choice questions to gauge their opinions on the matter.

They found that within this group, there was already significant awareness of ChatGPT and its potential (approx. 63%), and a sizeable minority (approx. 42%) had already tried asking it questions. When asked how much ChatGPT would revolutionize their fields, the majority in both groups expected the change to be less than 50%. Characterizing this change further, medical professionals were more likely to believe that any impact of ChatGPT would be positive (approx. 52%) than were other professionals (approx. 38%). Few participants planned to make any significant changes to their career plans in 2023 based on the impact of ChatGPT-like generative AI.

We may start by pointing out some positives of this study. First, it shows that there is considerable awareness of generative AI among healthcare professionals. Second, an early publication on this topic in a medical journal will help raise awareness among the readers of this journal, who are most likely healthcare professionals. However, the major challenges in such a study lie in the careful consideration of inclusion and exclusion criteria, the identification of suitable multiple-choice questions and the rationale for their selection, and ethics approval. All of these were lacking in the study by Parikh et al., and hence we are concerned about the validity of the findings. We elaborate on these concerns in the following paragraphs.

The first limitation of this study concerns the relatively small, self-selected sample. Given that ChatGPT already has more than 100 million users, a group of N = 210 recruited via "a survey link… shared among healthcare and other professionals who had previously participated in our academic activities" could distort the results.[13] Additionally, as there were no well-thought-out exclusion criteria, the study questions appeared somewhat ad hoc. For example, if the study had been advertised widely, a lack of familiarity with ChatGPT could have served as an exclusion criterion; questions 2–4 could then have been eliminated in favor of questions more relevant to the study design. Question 9 also seems arbitrary and irrelevant to the conclusions drawn: how many participants were actually familiar with the work of Stephen Hawking to which this question referred? Instead, questions could have been crafted about the effectiveness of ChatGPT in helping individuals in their professions, or about participants' awareness that ChatGPT can provide biased, inaccurate, and misleading answers.

The second limitation is that, with any opinion-based study like this, it is very difficult to quantify what the impact might be, especially given the broad range of responsibilities across professionals in the medical field. How should someone quantify an impact, especially as an arbitrary percentage? For instance, imagine a hypothetical scenario in which ChatGPT takes over the initial triage of patients based on how they describe their symptoms. Such a process would substantially change patient flow through a hospital emergency department. How would one quantify this change as a percentage? For a dentist, the impact of this change might be low; for an emergency room trauma nurse, it might be much higher. Even then, what would low or high mean when scoring an opinion-based impact on an arbitrary above-or-below-50% scale? Does changing or augmenting the duties of a job count as a big impact, or is that reserved for cases where careers are fundamentally changed?

While there is considerable reason to remain skeptical about any such potential application of ChatGPT in the medical field, consider a different domain, where generative AI is proving to be truly disruptive. Few programmers ever considered that AI tools might be able to write code; a widely cited 2018 survey of expert opinion on AI trends does not even mention code-writing as a potential skill.[14] Yet in 2021, just three years later, GitHub Copilot was released, a generative AI platform that now writes up to 40% of the code for the 1.2 million developers using it.[15,16] Is it so far-fetched to imagine that the medical field may face such a change in the future? As noted, ChatGPT has already demonstrated the ability to retain and present medical information on par with that of medical students.[17,18]
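To picture what this disruption looks like at the keyboard, consider the prompt-to-code workflow these tools support. The snippet below is an illustrative mock-up of such a completion (not captured Copilot output): the developer writes only the signature and docstring, and the assistant proposes the body.

```python
# Illustrative mock-up of an AI pair-programming completion; this is
# not actual Copilot output. The developer supplies the signature and
# docstring, and the tool suggests the implementation.
def bmi(weight_kg: float, height_m: float) -> float:
    """Return the body mass index for the given weight and height."""
    # --- the lines below are the kind of body the tool would suggest ---
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / (height_m ** 2)
```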

Unlike programming, healthcare is a highly regulated domain because of the critical nature of medical decisions. A wrong medication, a false diagnosis, or a delayed treatment resulting from an AI-based decision could have catastrophic consequences. Hence, drugs and devices go through many years of testing and validation before clinical trials involving human subjects can even begin. Any clinical trial, whether involving human or animal subjects, must have suitable ethics approval. Furthermore, any medical device with safety implications needs certification by agencies such as the Food and Drug Administration. The use of generative AI in a domain with such serious safety implications therefore needs careful consideration, including the need for suitable regulation.[18]

Of course, at this time, the discussion remains firmly hypothetical: given the propensity of models like ChatGPT to 'hallucinate,'[19] putting them in charge of potentially life-or-death situations seems considerably premature. Still, as we are already seeing substantial demand for the adoption of other kinds of AI throughout the medical setting,[20–24] it seems logical to assume that ChatGPT and its peer models will also soon be in demand, and further research will no doubt explore their applications, especially once such AI can be verified for clinical efficacy.[25]

ChatGPT itself has an “opinion” on this topic. When prompted with, “Given that you are an artificial AI which knows data about the medical field, what do you think the impact of ChatGPT and other generative AI will be if applied to the medical field?” it responded that it has the potential to revolutionize the field in many ways. By leveraging vast amounts of medical and clinical data from a variety of sources, it claims it will be able to recognize patterns and make connections to help identify diseases, improve diagnosis and treatment, and develop new drugs and treatments, personalize medicine, and assist in medical education and research, concluding, “Overall, generative AI has the potential to improve healthcare outcomes, speed up medical research, and reduce healthcare costs. However, it is important to note that AI is not a substitute for human expertise and judgment. AI should be used to augment the work of healthcare professionals, not replace them.”
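For readers who wish to reproduce such an exchange programmatically, the following is a minimal sketch using OpenAI's openai Python package as it existed at the time of writing (the v0.27-era ChatCompletion interface); the model name and sampling temperature are illustrative choices, not prescriptions.

```python
# Minimal sketch: posing the question from the text to ChatGPT via
# OpenAI's API. Assumes the `openai` package (v0.27-era interface) and
# an API key in the OPENAI_API_KEY environment variable.
import os
import openai

openai.api_key = os.environ["OPENAI_API_KEY"]

prompt = (
    "Given that you are an artificial AI which knows data about the "
    "medical field, what do you think the impact of ChatGPT and other "
    "generative AI will be if applied to the medical field?"
)

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",  # the model underlying ChatGPT at release
    messages=[{"role": "user", "content": prompt}],
    temperature=0.7,        # illustrative sampling temperature
)

print(response["choices"][0]["message"]["content"])
```

Note that, as with the conversational interface, repeated calls will produce different responses; nothing in such output should be treated as stable or authoritative.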

Where do we go from here? It is quite clear that the AI "genie" is out of the "bottle." Generative AI models capable of producing vast quantities of human-like prose are proliferating, ChatGPT among them. With the release of its application programming interface,[1] it is clear that companies like OpenAI are making large commercial bets on the successful adoption of these tools across industries. They certainly seem to believe that there will be an impact; it remains to be seen what that impact will be. The leaders of the healthcare community need to come to terms with this technology urgently, so that careful regulation can be introduced to govern how AI may be verified and used for healthcare-related purposes.

REFERENCES

1. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature 2015;521:436–44
2. Ro JW, Roop PS, Malik A, Ranjitkar P. A formal approach for modeling and simulation of human car-following behavior. IEEE Transactions on Intelligent Transportation Systems 2017;19:639–48
3. Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, et al. Attention is all you need. Advances in Neural Information Processing Systems 2017. Available from: https://papers.nips.cc/paper/2017/hash/3f5ee243547dee91fbd053c1c4a845aa-Abstract.html. [Last accessed on 2023 Mar 14]
4. Vig J, Belinkov Y. Analyzing the structure of attention in a transformer language model. Available from: https://arxiv.org/abs/1906.04284. [Last accessed on 2023 Mar 14]
5. OpenAI. Introducing ChatGPT. November 2022. Available from: https://openai.com/blog/chatgpt. [Last accessed on 2023 Mar 14]
6. Sloan K. ChatGPT passes law school exams despite 'mediocre' performance. Reuters, January 2023. Available from: https://www.reuters.com/legal/transactional/chatgpt-passes-law-school-exams-despite-mediocre-performance-2023-01-25/. [Last accessed on 2023 Mar 14]
7. Kung TH, Cheatham M, Medenilla A, Sillos C, De Leon L, Elepaño C, et al. Performance of ChatGPT on USMLE: Potential for AI-assisted medical education using large language models. PLOS Digit Health 2023;2:e0000198
8. Jalil S, Rafi S, LaToza TD, Moran K, Lam W. ChatGPT and software testing education: Promises & perils 2023. Available from: https://arxiv.org/abs/2302.03287. [Last accessed on 2023 Mar 14]
9. Thorp HH. ChatGPT is fun, but not an author. Science 2023;379:313
10. Vemprala S, Bonatti R, Bucker A, Kapoor A. ChatGPT for robotics: Design principles and model abilities. Technical Report MSR-TR-2023-8, Microsoft, February 2023. Available from: https://www.microsoft.com/en-us/research/publication/chatgpt-for-robotics-design-principles-and-model-abilities/. [Last accessed on 2023 Mar 14]
11. Hu K. ChatGPT sets record for fastest-growing user base - analyst note. Reuters 2023. Available from: https://www.reuters.com/technology/chatgpt-sets-record-fastest-growing-user-base-analyst-note-2023-02-01/. [Last accessed on 2023 Mar 14]
12. Parikh MP, Talwar V, Goyal M. ChatGPT: An online cross-sectional descriptive survey comparing perceptions of healthcare workers to those of other professionals. Cancer Res Stat Treat 2023;6:32–6
13. Heckman JJ. Selection bias and self-selection. Microeconometrics. The New Palgrave Economics Collection. London 2010. Available from: https://link.springer.com/chapter/10.1057/9780230280816_29. [Last accessed on 2023 Mar 14]
14. Grace K, Salvatier J, Dafoe A, Zhang B, Evans O. Viewpoint: When will AI exceed human performance? Evidence from AI experts. J Artif Intell Res 2018;62:729–54
15. Rosenbaum E. Microsoft's GitHub Copilot AI is making rapid progress. Here's how its human leader thinks about it. October 2022. Available from: https://www.cnbc.com/2022/10/14/microsoft-ai-leaps-ahead-heres-what-its-human-leader-thinks-about-it.html. [Last accessed on 2023 Mar 14]
16. Dohmke T. GitHub Copilot is generally available to all developers. June. 2022. Available from: https://github.blog/2022-06-21-github-copilot-is-generally-available-to-all-developers/. [Last accessed on 2023 Mar 14]
17. King MR, chatGPT. A conversation on artificial intelligence, chatbots, and plagiarism in higher education. Cell Mol Bioeng 2023;16:1–2
18. Hacker P, Engel A, Mauer M. Regulating ChatGPT and other large generative AI models. Available from: https://arxiv.org/abs/2302.02337. [Last accessed on 2023 Mar 14]
19. Rudolph J, Tan S, Tan S. ChatGPT: Bullshit spewer or the end of traditional assessments in higher education? Journal of Applied Learning and Teaching, January 2023. Available from: https://journals.sfu.ca/jalt/index.php/jalt/article/view/689. [Last accessed on 2023 Mar 14]
20. Park SH, Han K. Methodologic guide for evaluating clinical performance and effect of artificial intelligence technology for medical diagnosis and prediction. Radiology 2018;286:800–9
21. Priya H, Purohit BM, Ravi P. Future scope of virtual reality and augmented reality in tobacco control. Cancer Res Stat Treat 2022;5:173–4
22. Mahajan A, Vaidya T, Gupta A, Rane S, Gupta S. Artificial intelligence in healthcare in developing nations: The beginning of a transformative journey. Cancer Res Stat Treat 2019;2:182–9
23. Bharadwaj KSS, Pawar V, Punia V, Apparao MLV, Mahajan A. Novel artificial intelligence algorithm for automatic detection of COVID-19 abnormalities in computed tomography images. Cancer Res Stat Treat 2021;4:256–61
24. Mahajan A, Pawar V, Punia V, Vaswani A, Gupta P, Bharadwaj KS, et al. Deep learning-based COVID-19 triage tool: An observational study on an X-ray dataset. Cancer Res Stat Treat 2022;5:19–25
25. Seshia SA, Sadigh D, Sastry SS. Toward verified artificial intelligence. Communications of the ACM 2022;65:46–55
Copyright: © 2023 Cancer Research, Statistics, and Treatment