OpenAI’s medical transcription tool fabricates patient information

CALIFORNIA, UNITED STATES — Whisper, an AI transcription tool developed by OpenAI, is being criticized for generating inaccurate transcriptions, including fabricated content.
According to an ABC News report, despite claims of “human level robustness and accuracy,” Whisper has been found to produce hallucinations: fabricated text that can include inappropriate or misleading content such as racial commentary, violent rhetoric, and non-existent medical treatments.
These hallucinations pose significant risks, particularly in healthcare settings where Whisper is increasingly used to transcribe patient consultations. Experts warn that such inaccuracies could lead to serious consequences, including misdiagnoses.
Alondra Nelson, former head of the White House Office of Science and Technology Policy, emphasized the need for higher standards in medical applications.
“Nobody wants a misdiagnosis,” said Nelson, who is also a professor at the Institute for Advanced Study in Princeton, New Jersey. “There should be a higher bar.”
Healthcare adoption raises reliability concerns
Whisper’s integration into various industries, including hospitals and medical centers, has raised concerns about its reliability. Researchers have reported frequent hallucinations in Whisper’s transcriptions.
A University of Michigan researcher studying public meetings found hallucinations in 80% of the audio transcriptions examined. Other researchers have similarly reported hallucinations in a substantial share of the transcripts they reviewed.
The tool is also used to create closed captions for the Deaf and hard of hearing, a group particularly vulnerable to errors because they cannot check the captions against the spoken audio.
Christian Vogler, who is deaf and directs Gallaudet University’s Technology Access Program, warned that fabrications “hidden amongst all this other text” can easily go unnoticed within otherwise accurate transcriptions.
Calls for regulation and improved accuracy
The prevalence of Whisper’s hallucinations has led to calls for federal regulation of AI technologies.
Former OpenAI engineer William Saunders suggested that addressing these issues should be a priority for the company.
“This seems solvable if the company is willing to prioritize it,” he stated.
An OpenAI spokesperson acknowledged the issue and said the company is working to reduce hallucinations through ongoing research and model updates.
Despite OpenAI’s warnings against using Whisper in high-risk domains, its adoption continues to grow. Over 30,000 clinicians and 40 health systems use a Whisper-based tool developed by Nabla to transcribe medical visits. However, concerns remain because the original audio is not retained, leaving no recording against which the transcripts can be verified.
Privacy concerns in medical settings
The use of AI transcription tools like Whisper also raises privacy concerns.
California state lawmaker Rebecca Bauer-Kahan said she refused to sign a release allowing audio of her child’s medical consultation to be shared with the technology companies that process such recordings, emphasizing the importance of patient privacy and data protection.
“The release was very specific that for-profit companies would have the right to have this,” said Bauer-Kahan. “I was like, ‘absolutely not.’”
As Whisper is adopted across more sectors, addressing its flaws will be crucial to ensuring accuracy and maintaining trust in AI-driven technologies.