How outdated information hides in LLM token generation probabilities
In this article, I’m going to briefly cover some of the basics so we can think this through from first principles, and then have a peek at the token generation probabilities themselves, working our way from GPT-2 through to the most recent 4o series of models.

Firstly, I’d ask exactly what is meant by “hallucination”, because the term doesn’t have a precise and universally accepted definition, despite being used freely by big-tech CEOs and the media whenever LLMs make mistakes.

The scenario that worries me, and that is playing out right now, is that LLMs get good enough that we (or our leaders) become overconfident in their abilities and start integrating them into applications they simply aren’t ready for, without a proper understanding of their limitations.
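To make the idea of “peeking at token generation probabilities” concrete, here is a minimal sketch assuming the Hugging Face `transformers` library and PyTorch. The prompt and the top-5 cutoff are illustrative placeholders of my own, not taken from the article’s experiments; the same approach works for any model whose logits you can access locally, starting with GPT-2.

```python
# Minimal sketch (assumes `transformers` and `torch` are installed):
# inspect the probability the model assigns to each candidate next token.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

# Hypothetical prompt where outdated training data could surface.
prompt = "The current president of the United States is"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (batch, seq_len, vocab_size)

# Softmax over the final position gives next-token probabilities.
next_token_probs = torch.softmax(logits[0, -1], dim=-1)
top_probs, top_ids = torch.topk(next_token_probs, k=5)

for prob, token_id in zip(top_probs, top_ids):
    print(f"{tokenizer.decode(token_id.item()):>12s}  {prob.item():.3f}")
```

Looking at the top few candidates and their probabilities, rather than just the single sampled token, is what lets stale or competing answers baked into the training data show through.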