


      Large Language Models in Medical Education: Opportunities, Challenges, and Future Directions


          Abstract

          The integration of large language models (LLMs), such as those in the Generative Pre-trained Transformers (GPT) series, into medical education has the potential to transform learning experiences for students and elevate their knowledge, skills, and competence. Drawing on a wealth of professional and academic experience, we propose that LLMs hold promise for revolutionizing medical curriculum development, teaching methodologies, personalized study plans and learning materials, student assessments, and more. However, we also critically examine the challenges that such integration might pose by addressing issues of algorithmic bias, overreliance, plagiarism, misinformation, inequity, privacy, and copyright concerns in medical education. As we navigate the shift from an information-driven educational paradigm to an artificial intelligence (AI)–driven educational paradigm, we argue that it is paramount to understand both the potential and the pitfalls of LLMs in medical education. This paper thus offers our perspective on the opportunities and challenges of using LLMs in this context. We believe that the insights gleaned from this analysis will serve as a foundation for future recommendations and best practices in the field, fostering the responsible and effective use of AI technologies in medical education.


          Most cited references (65)


          Performance of ChatGPT on USMLE: Potential for AI-assisted medical education using large language models

          We evaluated the performance of a large language model called ChatGPT on the United States Medical Licensing Exam (USMLE), which consists of three exams: Step 1, Step 2CK, and Step 3. ChatGPT performed at or near the passing threshold for all three exams without any specialized training or reinforcement. Additionally, ChatGPT demonstrated a high level of concordance and insight in its explanations. These results suggest that large language models may have the potential to assist with medical education and, potentially, clinical decision-making.

            How Does ChatGPT Perform on the United States Medical Licensing Examination? The Implications of Large Language Models for Medical Education and Knowledge Assessment

            Background: Chat Generative Pre-trained Transformer (ChatGPT) is a 175-billion-parameter natural language processing model that can generate conversation-style responses to user input.

            Objective: This study aimed to evaluate the performance of ChatGPT on questions within the scope of the United States Medical Licensing Examination (USMLE) Step 1 and Step 2 exams, as well as to analyze responses for user interpretability.

            Methods: We used 2 sets of multiple-choice questions to evaluate ChatGPT's performance, each with questions pertaining to Step 1 and Step 2. The first set was derived from AMBOSS, a commonly used question bank for medical students, which also provides statistics on question difficulty and on performance relative to the user base. The second set was the National Board of Medical Examiners (NBME) free 120 questions. ChatGPT's performance was compared to that of 2 other large language models, GPT-3 and InstructGPT. The text output of each ChatGPT response was evaluated across 3 qualitative metrics: logical justification of the answer selected, presence of information internal to the question, and presence of information external to the question.

            Results: Of the 4 data sets, AMBOSS-Step1, AMBOSS-Step2, NBME-Free-Step1, and NBME-Free-Step2, ChatGPT achieved accuracies of 44% (44/100), 42% (42/100), 64.4% (56/87), and 57.8% (59/102), respectively. ChatGPT outperformed InstructGPT by 8.15% on average across all data sets, and GPT-3 performed similarly to random chance. The model demonstrated a significant decrease in performance as question difficulty increased (P=.01) within the AMBOSS-Step1 data set. Logical justification for ChatGPT's answer selection was present in 100% of outputs of the NBME data sets. Information internal to the question was present in 96.8% (183/189) of all questions. The presence of information external to the question was 44.5% and 27% lower for incorrect answers relative to correct answers on the NBME-Free-Step1 (P<.001) and NBME-Free-Step2 (P=.001) data sets, respectively.

            Conclusions: ChatGPT marks a significant improvement in natural language processing models on the task of medical question answering. By performing above a 60% threshold on the NBME-Free-Step1 data set, the model achieves the equivalent of a passing score for a third-year medical student. Additionally, ChatGPT provided logic and informational context for the majority of its answers. Taken together, these findings make a compelling case for the potential applications of ChatGPT as an interactive medical education tool to support learning.

              ChatGPT: the future of discharge summaries?


                Author and article information

                Contributors
                Journal
                JMIR Med Educ
                JME
                JMIR Medical Education
                JMIR Publications (Toronto, Canada)
                2369-3762
                2023
                1 June 2023
                9: e48291
                Affiliations
                [1] AI Center for Precision Health, Weill Cornell Medicine-Qatar, Doha, Qatar
                [2] College of Computing and Information Technology, University of Doha for Science and Technology, Doha, Qatar
                [3] Information Science Department, College of Life Sciences, Kuwait University, Kuwait, Kuwait
                [4] Office of Educational Development, Division of Medical Education, Weill Cornell Medicine-Qatar, Doha, Qatar
                [5] Department of Computer Science and Software Engineering, United Arab Emirates University, Abu Dhabi, United Arab Emirates
                [6] Department of Mechanical & Industrial Engineering, Faculty of Applied Science and Engineering, University of Toronto, Toronto, ON, Canada
                Author notes
                Corresponding Author: Alaa Abd-alrazaq alaa_alzoubi88@yahoo.com
                Author information
                https://orcid.org/0000-0001-7695-4626
                https://orcid.org/0000-0002-3235-0860
                https://orcid.org/0000-0001-5038-3044
                https://orcid.org/0000-0002-4025-5767
                https://orcid.org/0000-0002-5804-0342
                https://orcid.org/0000-0003-0505-609X
                https://orcid.org/0000-0002-0861-9743
                https://orcid.org/0000-0001-6797-0448
                https://orcid.org/0000-0002-8586-7564
                https://orcid.org/0000-0002-5762-4186
                Article
                v9i1e48291
                DOI: 10.2196/48291
                PMCID: PMC10273039
                PMID: 37261894
                ©Alaa Abd-alrazaq, Rawan AlSaad, Dari Alhuwail, Arfan Ahmed, Padraig Mark Healy, Syed Latifi, Sarah Aziz, Rafat Damseh, Sadam Alabed Alrazak, Javaid Sheikh. Originally published in JMIR Medical Education (https://mededu.jmir.org), 01.06.2023.

                This is an open-access article distributed under the terms of the Creative Commons Attribution License ( https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Education, is properly cited. The complete bibliographic information, a link to the original publication on https://mededu.jmir.org/, as well as this copyright and license information must be included.

                History
                19 April 2023
                5 May 2023
                15 May 2023
                17 May 2023
                Categories
                Viewpoint

                Keywords: large language models, artificial intelligence, medical education, ChatGPT, GPT-4, generative AI, students, educators
