DIGITAL LIFE

Researchers have discovered that chatbots can be used to detect “hallucinations” from other generative AI models. In the study recently published in the journal Nature, the team explains that systems such as OpenAI's ChatGPT and Google's Gemini can help find and correct inaccurate responses provided by other chatbots.
- Oxford researchers found that chatbots such as ChatGPT and Gemini can be used to detect "hallucinations" from other generative AI models;
- The team asked a chatbot a series of questions, then used a second model to evaluate the responses for inconsistencies;
- The responses were also reviewed by human evaluators to gauge the reliability of comparisons between chatbots;
- The AI checker agreed with the human judgments in 93% of cases;
- The team notes that hallucinations and inaccurate responses hinder widespread adoption of these systems – especially in areas such as medicine;
- The study was published in the journal Nature.
For the study, Sebastian Farquhar, a computer scientist at the University of Oxford, in the United Kingdom, and his team asked a chatbot a series of common questions – elementary-school math problems, for example. A second model was then used to review the generated responses for inconsistencies and distortions.
The responses were also reviewed by human evaluators. As the researchers explain in the article, the AI agreed with the human judgments in 93% of cases, and the human evaluators agreed with each other 92% of the time. These results helped the team establish how reliably responses from different chatbots can be compared.
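The article does not include an implementation, but the cross-checking idea can be sketched in a simplified form: sample several answers to the same question, let a second model judge which answers mean the same thing, and measure how scattered the answers are. Below is a minimal, illustrative sketch; `same_meaning` stands in for the second chatbot's equivalence judgment (here replaced by a trivial string comparison), and the entropy-over-clusters scoring is one plausible way to quantify inconsistency, not the study's exact procedure.

```python
import math

def semantic_entropy(answers, same_meaning):
    """Group sampled answers into meaning clusters, then compute the
    Shannon entropy of the cluster distribution. Higher entropy means
    the model answered inconsistently - a possible hallucination."""
    clusters = []  # each cluster is a list of equivalent answers
    for a in answers:
        for c in clusters:
            if same_meaning(a, c[0]):
                c.append(a)
                break
        else:
            clusters.append([a])
    n = len(answers)
    return -sum((len(c) / n) * math.log2(len(c) / n) for c in clusters)

# Stand-in for the second model's equivalence check: here just a
# case-insensitive string match, purely for illustration.
equiv = lambda a, b: a.strip().lower() == b.strip().lower()

consistent = ["Paris", "paris", "Paris", "Paris"]
inconsistent = ["Paris", "Lyon", "Marseille", "Paris"]

print(semantic_entropy(consistent, equiv))    # 0.0 - all answers agree
print(semantic_entropy(inconsistent, equiv))  # 1.5 - scattered answers
```

A real checker would replace `equiv` with a call to a second language model asking whether two answers entail each other, and would flag questions whose entropy exceeds some threshold.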
Speaking to The Washington Post, Farquhar explained that it is "very difficult" for the average reader to identify some AI errors, because chatbots "often tell you what you want to hear, inventing things that are not only plausible, but would be useful if they were true, something the researchers labeled 'flattery'."
The team also says that generative AI's hallucinations are a barrier to widespread adoption of these systems – especially in areas such as medicine, where "they could pose a risk to human life."
mundophone