Large Language Models’ Emergent Abilities Are a Mirage
A new study suggests that sudden jumps in LLMs’ abilities are neither surprising nor unpredictable, but are actually the consequence of how we measure ability in AI.
Two years ago, in a project called the Beyond the Imitation Game benchmark, or BIG-bench, 450 researchers compiled a list of 204 tasks designed to test the capabilities of large language models, which power chatbots like ChatGPT. Large language models train by analyzing enormous data sets of text (words from online sources including books, web searches, and Wikipedia) and finding links between words that often appear together.

The Stanford trio behind the new study, who cast emergence as a "mirage," recognize that LLMs become more effective as they scale up; indeed, the added complexity of larger models should enable them to handle more difficult and diverse problems.
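The core of the measurement argument can be illustrated with a toy calculation (the numbers below are made up for illustration, not taken from the study): if a model's per-token accuracy improves smoothly with scale, but the benchmark only gives credit when an entire multi-token answer is exactly right, the scored ability stays near zero for a long time and then appears to leap upward.

# Toy sketch, assuming a hypothetical smooth relationship between scale and
# per-token accuracy, and an all-or-nothing exact-match metric over a
# 10-token answer. Illustrates the measurement point, not the study's data.
import numpy as np

scales = np.logspace(0, 4, 9)             # hypothetical model scales (arbitrary units)
per_token_acc = scales / (scales + 100)   # smooth, gradual per-token improvement

answer_len = 10
exact_match = per_token_acc ** answer_len  # chance the full 10-token answer is exactly right

for s, p, em in zip(scales, per_token_acc, exact_match):
    print(f"scale={s:9.1f}  per-token={p:.3f}  exact-match={em:.4f}")

# The per-token column climbs steadily, while the exact-match column sits
# near zero and then shoots up at large scales: a "jump" produced by the
# choice of metric rather than by a sudden change in the model itself.

Under a smoother metric, such as partial credit per correct token, the same underlying improvement would trace a gradual curve with no apparent discontinuity.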