AI-generated messages rival clinicians in quality and empathy, study finds

NEW YORK, UNITED STATES — A recent study by NYU Grossman School of Medicine revealed that electronic health record (EHR) messages drafted by generative AI are on par with those written by healthcare professionals.
The study, which drew on queries from NYU Langone Health patients, found that the AI drafts were similar in quality and accuracy to messages composed by human clinicians.
AI outperforms humans in empathy, understandability
The research compared how primary care doctors viewed patient messages written by AI with how they viewed those written by healthcare professionals. Sixteen primary care physicians rated 334 AI-drafted messages and 169 human-written messages without knowing their origin.
It also analyzed whether the quality of the responses varied with who wrote them (doctors or other healthcare workers) and with the type of message (such as lab results or medication requests). Each message's type was determined by the EHR’s proprietary message-classification large language model (LLM) from Epic.
The findings showed that AI-generated drafts were not only comparable in terms of informational content and completeness but also rated higher in empathy, understandability, and tone.
Dr. William Small, the study’s lead author, explained that their results “suggest chatbots could reduce the workload of care providers by enabling efficient and empathetic responses to patients’ concerns.”
Potential to alleviate physician burnout
The study highlights the potential of AI to ease the increasing burden on healthcare providers.
NYU Langone Health has seen a 30% annual increase in patient messages, contributing to physician burnout.
AI tools like the private instance of OpenAI’s GPT-4 used in the study could help manage this influx by drafting responses that physicians can review and send, saving time and reducing stress.
Areas for improvement
Despite the promising results, the study also identified areas for improvement. AI-generated messages were 38% longer and used more complex language: they were written at an eighth-grade reading level, compared with a sixth-grade level for human-written messages.
This complexity could pose challenges for patients with low health or English literacy, potentially exacerbating health inequities. The researchers emphasized the need for further refinement in this area.
Implications for healthcare communication
The study’s findings suggest significant potential for AI in healthcare communication.
“This work demonstrates that the AI tool can build high-quality draft responses to patient requests,” said Dr. Devin Mann, corresponding author and senior director of informatics innovation at NYU Langone.
With ongoing improvements, AI-generated messages could soon match the quality, communication style, and usability of human-generated responses.
As healthcare systems continue to face increasing demands, AI tools offer a promising solution to enhance efficiency and empathy in patient communication. While challenges remain, the integration of AI in drafting patient messages could significantly reduce the workload on healthcare providers and improve the overall patient experience.