Training is not the same as chatting: LLMs don’t remember everything you say

I’m beginning to suspect that one of the most common misconceptions about LLMs such as ChatGPT involves how “training” works. A common complaint I see about these tools is that …

Training an LLM happens in two phases. The first is to pile in several terabytes of text (think all of Wikipedia, a scrape of a large portion of the web, books, newspapers, academic papers and more) and spend months, and potentially millions of dollars in electricity, crunching through that “pre-training” data to identify patterns in how the words relate to each other. The second phase refines the resulting model: it can incorporate instruction tuning or Reinforcement Learning from Human Feedback (RLHF), with the goal of teaching the model to pick the best possible sequences of words for productive conversations.

Chatting with the finished model is neither of these phases. If your mental model is that LLMs remember and train on all input, it’s much easier to assume that developers who claim they’ve disabled that ability may not be telling the truth.
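To make the distinction concrete, here is a deliberately tiny sketch in Python. It is not how a real LLM is built: a bigram word counter stands in for pre-training, and the names `pretrain` and `complete` are hypothetical, invented for this illustration. What it shows is that the patterns are learned once, up front, and that producing a reply only reads them.

```python
from collections import Counter, defaultdict
from types import MappingProxyType


def pretrain(corpus):
    """Toy stand-in for pre-training: count which word follows which."""
    counts = defaultdict(Counter)
    for text in corpus:
        words = text.lower().split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    # Return a read-only view: once training ends, the "weights" are fixed.
    return MappingProxyType({word: dict(c) for word, c in counts.items()})


def complete(model, prompt, max_words=5):
    """Toy stand-in for chatting: extend the prompt using the learned counts.

    This function only reads from `model`. Nothing in the prompt is stored.
    """
    words = prompt.lower().split()
    if not words:
        return ""
    for _ in range(max_words):
        followers = model.get(words[-1])
        if not followers:
            break  # the model never saw this word during training
        # Pick the most frequent follower (first one wins on a tie).
        words.append(max(followers, key=followers.get))
    return " ".join(words)


corpus = ["the cat sat on the mat", "the cat ate the fish"]
model = pretrain(corpus)

# Copy the learned counts so we can compare them after a "conversation".
before = {word: dict(f) for word, f in model.items()}
reply = complete(model, "my secret password is the")
after = {word: dict(f) for word, f in model.items()}

print(reply)
print("model changed by chatting:", before != after)
```

The word "password" appears in the prompt but never enters the model, because only `pretrain` writes counts. In this toy, getting the model to learn from a conversation would require a separate, deliberate step: adding the transcript to a corpus and running `pretrain` again.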