How outdated information hides in LLM token generation probabilities
In this article, I’m going to briefly cover some of the basics so we can think this through from first principles, and then have a peek at the token generation probabilities themselves, working our way from GPT-2 through to the most recent 4o series of models.

Firstly, I’d ask exactly what is meant by “hallucination”, because the term doesn’t have a precise and universally accepted definition, despite being used freely by big-tech CEOs and the media whenever LLMs make mistakes.

The scenario that worries me, and that is playing out right now, is that LLMs get good enough that we (or our leaders) become overconfident in their abilities and start integrating them into applications they simply aren’t ready for, without a proper understanding of their limitations.
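To make the idea of “peeking at token generation probabilities” concrete, here is a minimal sketch assuming the Hugging Face `transformers` library and PyTorch. The prompt and the top-5 cutoff are illustrative placeholders of my own, not taken from the article’s experiments; the same approach works for any model whose logits you can access locally, starting with GPT-2.

```python
# Minimal sketch (assumes `transformers` and `torch` are installed):
# inspect the probability the model assigns to each candidate next token.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

# Hypothetical prompt where outdated training data could surface.
prompt = "The current president of the United States is"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (batch, seq_len, vocab_size)

# Softmax over the final position gives next-token probabilities.
next_token_probs = torch.softmax(logits[0, -1], dim=-1)
top_probs, top_ids = torch.topk(next_token_probs, k=5)

for prob, token_id in zip(top_probs, top_ids):
    print(f"{tokenizer.decode(token_id.item()):>12s}  {prob.item():.3f}")
```

Looking at the top few candidates and their probabilities, rather than just the single sampled token, is what lets stale or competing answers baked into the training data show through.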