Chatbots prone to ‘hallucination’ or inventing info, study finds
CALIFORNIA, UNITED STATES — A new study by artificial intelligence (AI) startup Vectara suggests that popular chatbots like ChatGPT and Google’s PaLM have a tendency to confidently “make up” information between 3% and 27% of the time.
The 30-person Vectara team, backed by $28.5 million in seed funding, tested leading chatbots from companies like OpenAI, Google, Meta, and Anthropic. Their analysis found concerning rates of “hallucination,” where chatbots generate false information as if it were fact.
OpenAI’s ChatGPT had the lowest measured hallucination rate at 3%, while Google’s PaLM was found to hallucinate up to 27% of the time. Meta’s and Anthropic’s chatbots landed in the middle, with rates of roughly 5% to 8%.
“We gave the system 10 to 20 facts and asked for a summary of those facts,” said Amr Awadallah, the chief executive of Vectara and a former Google executive.
“That the system can still introduce errors is a fundamental problem.”
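The article does not include Vectara’s actual test harness, but the setup Awadallah describes — give a model a fixed set of facts, ask for a summary, and flag summaries that introduce unsupported claims — can be sketched roughly as below. The `summarize` and `is_supported` functions are hypothetical placeholders standing in for a real chatbot API call and a real entailment check, not Vectara’s tooling.

```python
# A minimal sketch, under stated assumptions, of the summarization-based
# hallucination test described above: supply a model with source facts,
# request a summary, and count summaries containing unsupported claims.

def summarize(facts: list[str]) -> str:
    # Placeholder: swap in a real chatbot API call here.
    return " ".join(facts)

def is_supported(claim: str, facts: list[str]) -> bool:
    # Placeholder: real evaluations use an entailment model or human
    # review to judge whether the source facts support the claim.
    return any(claim.strip(". ") in fact or fact in claim for fact in facts)

def hallucination_rate(documents: list[list[str]]) -> float:
    """Fraction of summaries that contain at least one unsupported claim."""
    flagged = 0
    for facts in documents:
        summary = summarize(facts)
        claims = [sentence for sentence in summary.split(". ") if sentence]
        if any(not is_supported(claim, facts) for claim in claims):
            flagged += 1
    return flagged / len(documents)

if __name__ == "__main__":
    docs = [["Vectara was founded by former Google employees.",
             "The study measured hallucination rates of several chatbots."]]
    print(f"Hallucination rate: {hallucination_rate(docs):.0%}")
```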
The issue stems from how chatbots learn language: by ingesting vast amounts of online text, they absorb, and can repeat, the falsehoods and unreliable information found on the internet. Solutions remain unclear, though Vectara and the chatbot makers themselves are actively researching improvements.
Amid global excitement over AI chatbots’ capabilities, their tendency to confidently “invent” information poses risks when they are used for high-stakes legal, medical, or business applications.
Vectara specializes in extracting insights from private business documents. The company aims to raise awareness of potential chatbot inaccuracies among organizations considering relying on the technology for sensitive uses.