The United States Medical Licensing Examination (USMLE) is notoriously difficult, yet U.S. researchers have found that the chatbot ChatGPT can pass or come close to passing it without special training or reinforcement. The result has raised some people's hopes for artificial intelligence in clinical medicine, while prompting others to reflect on the shortcomings of American medical education and its examinations.
The researchers were primarily from AnsibleHealth, an American healthcare start-up. In a paper published on the 9th in the journal PLOS Digital Health (Public Library of Science), they said they took the 376 exam questions released on the official USMLE website in June 2022, screened out the image-based ones, and had ChatGPT answer the remaining 350. The questions came in several types, ranging from open-ended questions that require candidates to diagnose patients from the information given, to multiple-choice questions such as identifying the cause of a disease. Two judges scored the answers.
The results show that across the three exam sections, after ambiguous answers were removed, ChatGPT scored between 52.4% and 75%; a score of around 60% is generally regarded as a pass. Notably, 88.9% of ChatGPT's open-ended responses included “at least one important insight”, meaning an insight that is relatively new, clinically valid, and not obvious to everyone. By contrast, PubMedGPT, a large language model trained specifically on biomedical literature, scored just over 50% on similar tests.
The researchers said that achieving a passing score on this notoriously difficult professional exam, and doing so without any human reinforcement (training), is a major milestone for the application of artificial intelligence in clinical medicine, and shows that “large language models may have the potential to assist medical education and even clinical decision-making”.
In fact, ChatGPT made a “major contribution” to the first draft of the paper itself, working with the researchers much like a colleague. Clinicians at AnsibleHealth have also used ChatGPT to rewrite jargon-heavy reports so that patients can understand them.
Simon McCallum, a senior lecturer in software engineering at Victoria University of Wellington in New Zealand, is equally optimistic about the use of AI in medicine. He told AFP that Google’s artificial intelligence medical assistant, called Med-PaLM, “can provide as good advice to patients as a professional general practitioner”. As technology continues to evolve, “we may soon be getting medical advice from ‘Google Doctor’ or ‘Bing Nurse'”.
However, not everyone agrees. An article titled “ChatGPT passed the American Medical Licensing Examination to draw people’s attention to the shortcomings of medical education” was published in PLOS Digital Health on the same day. Its author argued that ChatGPT's success reflects, on the one hand, that the licensing exam places too much emphasis on rote memorization of disease mechanisms and “can’t fully evaluate the skills required for modern medical practice”. On the other hand, students are led to believe that medical problems are “either right or wrong”, whereas the “correct” choice in clinical practice is far more nuanced: it requires doctors to set aside prejudice, to think creatively and critically, and to weigh many practical factors.
The USMLE is a standardized test with three parts. The first part focuses on basic science and pharmacology; candidates are usually medical students who have completed 300 to 400 hours of dedicated study. The second part is generally taken by fourth-year medical students and focuses on clinical reasoning, medical management, and bioethics. The last part is intended for medical trainees who have completed at least 6 to 12 months of graduate medical education.
ChatGPT stands for “Chat Generative Pre-trained Transformer”. It is a large language model developed by OpenAI, an artificial intelligence research organization in the United States. It was released in November last year and caused a sensation because it can write papers, poems, or program code on request within seconds. EurekAlert!, a global science news service operated by the American Association for the Advancement of Science, noted that unlike most existing chatbots, ChatGPT cannot search the internet; instead, it generates human-like text through its own internal processing.
ChatGPT is entering medicine. Will it replace doctors?
Since the end of 2022, OpenAI's ChatGPT has spread all over the internet. According to a report by UBS, ChatGPT passed 100 million monthly active users at the end of January 2023, only two months after its launch, making it the fastest-growing consumer app ever.
ChatGPT is so popular because it seems able to do almost anything. Many people describe it as a true all-rounder (a “hexagonal warrior”): it can not only chat, search, and translate, but also write poems, papers, and code, and even develop small games or take the US college entrance exam. People keep posting their conversations with ChatGPT on social media, full of admiration.
News about ChatGPT keeps coming. Recently it passed three challenging professional exams in the United States: the United States Medical Licensing Examination (USMLE), the bar exam, and a Wharton MBA exam. Now ChatGPT has begun to make inroads into medicine as well, with surprising results.
The Journal of the American Medical Association (JAMA) published a research brief examining how appropriate the advice on cardiovascular disease (CVD) prevention is when it comes from online conversational AI models such as ChatGPT. It concluded that ChatGPT has the potential to assist clinical work, strengthen patient education, and reduce the barriers and costs of communication between doctors and patients.
For the study, the researchers drew on current guideline recommendations for CVD prevention and on clinicians’ treatment experience to write 25 specific questions, covering disease prevention concepts, risk factor counseling, test results, and medication advice. Each question was put to ChatGPT three times, and every reply was recorded. One reviewer evaluated the three answers to each question and graded the result as reasonable, unreasonable, or unreliable. If even one of the three answers contained an obvious medical error, the question was graded “unreasonable”.
The results show that ChatGPT's answers were graded reasonable for 84% of the questions (21 of 25). Judging from these 25 questions alone, the online conversational AI model performed well on CVD prevention questions, which suggests it could assist clinical work, help strengthen patient education, and reduce the barriers and costs of communication between doctors and patients.
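The grading rule and the 84% figure can be illustrated with a minimal Python sketch. It is not the researchers' code: the function names are hypothetical, the rule that a question without errors but without three reasonable answers counts as “unreliable” is an assumption (the article does not define that grade), and the sample data is invented so that 21 of 25 questions come out reasonable.

```python
def grade_question(ratings):
    """Collapse the three per-answer ratings for one question into one grade."""
    if len(ratings) != 3:
        raise ValueError("each question is asked exactly three times")
    # Rule stated in the article: one answer with an obvious medical error
    # is enough to grade the whole question "unreasonable".
    if "unreasonable" in ratings:
        return "unreasonable"
    if all(r == "reasonable" for r in ratings):
        return "reasonable"
    # Assumption: anything else is treated as "unreliable".
    return "unreliable"


def reasonable_rate(per_question_ratings):
    """Share of questions whose overall grade is "reasonable"."""
    grades = [grade_question(r) for r in per_question_ratings]
    return grades.count("reasonable") / len(grades)


# Hypothetical data: 21 fully reasonable questions and 4 with one bad answer.
sample = [("reasonable",) * 3] * 21 + [("reasonable", "unreasonable", "reasonable")] * 4
print(f"{reasonable_rate(sample):.0%}")  # prints 84%
```

Because a single bad answer fails the whole question, this rule is stricter than averaging over all 75 individual replies.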
Of course, the researchers also noted that many problems remain. On the one hand, although 84% of ChatGPT's answers were reasonable, the rest were not, and in a field as serious as medicine, where lives are at stake, a wrong answer can have bad consequences. For example, when asked “What exercise should be done to maintain health?”, ChatGPT recommended both general cardiovascular activity and weightlifting, which is inaccurate because weightlifting may be harmful for some patients. On the other hand, ChatGPT’s answers are too “academic”, which would limit their practical value if they were used for patient education.
In general, although ChatGPT is not perfect and still has flaws, its disruptive power is undeniable. Trained on huge amounts of data, ChatGPT already has a learning ability that is not inferior to that of humans. In time, it may help doctors with auxiliary clinical work and strengthen patient education. ChatGPT may not replace doctors in diagnosing and treating patients, but the medicine of the future is likely to be a collaboration between humans and machines.