Are bad incentives to blame for AI hallucinations?
How can a chatbot be so wrong — and sound so confident in its wrongness?
To illustrate the point, the researchers say that when they asked “a widely used chatbot” for the title of Adam Tauman Kalai’s Ph.D. dissertation, they got three different answers, all of them wrong. Their diagnosis is that standard evaluations grade models on accuracy alone, so a model that guesses confidently scores better than one that admits it doesn’t know. The researchers compare these evaluations to the kind of multiple-choice tests where random guessing makes sense, because “you might get lucky and be right,” while leaving the answer blank “guarantees a zero.”

The proposed solution, then, is similar to tests (like the SAT) that include “negative [scoring] for wrong answers or partial credit for leaving questions blank to discourage blind guessing.” Similarly, OpenAI says model evaluations need to “penalize confident errors more than you penalize uncertainty, and give partial credit for appropriate expressions of uncertainty.”
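As a rough sketch of that scoring idea (this is not OpenAI’s actual evaluation code; the function name, point values, and abstention handling below are all illustrative), a grader along those lines might look something like this:

```python
# Minimal sketch of an SAT-style scoring rule: confident wrong answers
# cost more than abstaining, and abstaining earns partial credit.
# All names and point values here are illustrative assumptions.

def score_answer(answer: str | None, correct: str,
                 wrong_penalty: float = 1.0,
                 abstain_credit: float = 0.25) -> float:
    """Score one question.

    answer:         the model's answer, or None if it abstained
    correct:        the reference answer
    wrong_penalty:  points deducted for a confident wrong answer
    abstain_credit: partial credit for an honest "I don't know"
    """
    if answer is None:
        return abstain_credit      # uncertainty earns partial credit
    if answer == correct:
        return 1.0                 # correct answer earns full credit
    return -wrong_penalty          # confident error is penalized hardest


print(score_answer("1987", correct="2002"))   # -1.0  (confident error)
print(score_answer(None, correct="2002"))     #  0.25 (abstained)
print(score_answer("2002", correct="2002"))   #  1.0  (correct)
```

The incentive shift is the point: under accuracy-only grading, a guess can never score worse than a blank, so guessing always pays; under a rule like this, a model that is unsure does better by abstaining than by bluffing.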