A study on the use of the AI bot for diabetes self-management finds important limitations that the public needs to be educated about.

A patient came to our clinic several months ago experiencing symptoms of hunger and restlessness. He had “consulted” ChatGPT and was told the symptoms were caused by the insulin we had prescribed him for diabetes control. He then stopped all his insulin injections – which led to a significant worsening of his condition.

A doctor would have explained that his symptoms might have been due to low blood sugar levels, which could happen if he had injected more insulin than needed for his meal portions. The appropriate action to take in such a situation is to reduce the insulin dose, instead of stopping it completely.

This case highlights findings from research we undertook at Singapore General Hospital’s Department of Endocrinology into how effective and accurate ChatGPT is for medical advice, using diabetes as a test case.

We used a simulated example of a patient with diabetes and asked ChatGPT a series of unstructured questions on diabetes self-management. Besides giving some factually inaccurate information that appeared very convincing, it also lacked nuance in its approach to explaining or interpreting blood glucose results – as seen in the example of our real-life patient.

This is unsurprising – to medical practitioners at least – considering ChatGPT was trained on a general information dataset, and not one that is medicine-specific.

Our research on the use of ChatGPT for diabetes patient education clearly illustrates its limitations. Patients need to be aware of them.

This is an urgent issue, given that ChatGPT has already started to change how patients access medical information, similar to how patients turned to “Dr Google” when search engines took off. Other colleagues have shared similar stories of how their patients have asked ChatGPT to interpret blood glucose readings, for nutrition or meal plans, or even for emotional support to cope with anxiety related to self-management of their medical conditions.

Since its public release in November 2022, ChatGPT has taken the world by storm, becoming the fastest-growing consumer application in history, with 100 million users in just two months.

HOW IT DIFFERS FROM DR GOOGLE

But what exactly is ChatGPT? It is an artificial intelligence (AI) chatbot that allows users to have human-like conversations, unlike earlier generations of chatbots which could respond only to a narrowly defined set of queries to deliver pre-set structured responses.

The underlying large language model powering ChatGPT has been trained on immense amounts of information from varied sources such as books, webpages and papers. It can also condense complex and technical information into simple parts – an ability no doubt appreciated by our patients who have tried using it for medical advice.

However, ChatGPT is inherently unable to evaluate if the information it gives is right or wrong. It may even present factually incorrect information in a persuasive and linguistically fluent manner, a phenomenon known as “hallucination”. This was the case for our patient.

We recently learnt of another patient who had similarly stopped his regular medication for blood pressure control, as he was under the wrong impression that it had caused his kidney stones. When his doctor asked how he had come by this information, he said he had asked ChatGPT, which even provided a reference that looked legitimate but ultimately turned out to be false.

There is also a chance that the information it gives can randomly vary for the same question asked in different ways, as the AI model uses probability to generate a response.

This presents a challenge wholly different from “Dr Google”. Patients often tell us that they are overwhelmed by the abundance of medical information and advice available readily online. Savvier patients have learnt to filter the “noise” and look up information from reliable sources.

However, this would not work when evaluating information from ChatGPT. It has been known to fabricate sources of information, providing references or authors that appear convincing but actually do not exist.

Importantly, patients need to be aware that the open-access GPT-3.5 model on which ChatGPT is based was trained using information available before 2021.

Medicine is a fast-paced field with new developments and guidelines released daily. So the information provided by ChatGPT may not be up to date. It remains to be seen if other medicine-specific models in development, like Google’s Med-PaLM, will be better in this regard.

ADVANTAGES – IF USED CORRECTLY

Amid these issues, our study showed that ChatGPT was able to offer generally – but not completely – accurate, easy-to-understand responses and give recommendations that were clear and systematic. The AI chatbot also suggested consulting a medical practitioner in almost every instance, which is an important safety measure.

It may one day be possible to find a role for ChatGPT to help patients find information and understand their conditions better, if the underlying model continues to improve and safeguards are developed to ensure accuracy of information. As the knowledge base of ChatGPT is vast, it can potentially be applied to a wide spectrum of diseases and conditions that we see and treat daily.

Patient education is just the tip of the iceberg. There is enormous potential to leverage ChatGPT and other similar AI chatbots to streamline documentation, enhance productivity and improve how we care for and communicate with patients.

In another study our group published more recently, ChatGPT was able to provide generally safe post-operative advice to urology patients and give appropriate reassurance for minor post-operative symptoms. 

It is encouraging to know that while some inaccuracies were still present, it did not suggest options that could cause direct patient harm. Of course, these studies are exploratory. Further work is required to develop safe and effective applications of such tools in clinical practice.

As for the public, they need to be aware of ChatGPT’s limitations, so they can be more discerning when using it. It is also crucial for medical practitioners to be mindful of what it can and cannot do, so that they can better advise patients who use it, just as we did with search engines in past decades.

Dr Gerald Sng is a senior resident from the Department of Endocrinology at Singapore General Hospital. Associate Professor Bee Yong Mong is head and senior consultant at the same department.