Do Large Language Models learn world models or just surface statistics? (2023)
A mystery

Large Language Models (LLMs) are on fire, capturing public attention with their ability to provide seemingly impressive completions to user prompts (NYT coverage). They are a delicate combination of a radically simplistic algorithm with massive amounts of data and computing power. They are trained by playing a guess-the-next-word game with their own training data: shown a partial text, the model guesses the next word, and its parameters are adjusted so that it guesses better over time.
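To make the guess-the-next-word game concrete, here is a minimal sketch of next-token-prediction training in PyTorch. Everything in it is a hypothetical stand-in: the tiny model, vocabulary size, context length, and random toy data are chosen only for illustration and do not describe any production LLM.

```python
import torch
import torch.nn as nn

# Toy next-token prediction: the model sees a prefix of token ids and is
# trained to assign high probability to the token that actually comes next.
vocab_size, d_model, context_len = 100, 32, 4   # hypothetical tiny settings
model = nn.Sequential(                           # stand-in for a real transformer LLM
    nn.Embedding(vocab_size, d_model),
    nn.Flatten(),                                # (batch, context_len * d_model)
    nn.Linear(d_model * context_len, vocab_size),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# One "guess": a batch of 4-token contexts and the true next token for each.
contexts = torch.randint(0, vocab_size, (8, context_len))   # random toy data
next_tokens = torch.randint(0, vocab_size, (8,))

optimizer.zero_grad()
logits = model(contexts)             # scores over the whole vocabulary
loss = loss_fn(logits, next_tokens)  # penalty for guessing the next token wrong
loss.backward()                      # learn from the error
optimizer.step()                     # update parameters to guess better next time
```

Repeating this single step over enormous corpora is, at its core, the whole training algorithm.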
That world representations might emerge from such training is not a totally surprising fact, given how many abilities of large language models we have already witnessed, but it is a solid question to ask about the interplay between the mid-stage products of the two processes: the human-understandable world representations and the incomprehensible high-dimensional space inside an LLM. One common way to examine that interplay, training a small probe on hidden activations, is sketched after the citation below.

Citation

@article{li2022emergent,
  author  = {Li, Kenneth and Hopkins, Aspen K and Bau, David and Vi{\'e}gas, Fernanda and Pfister, Hanspeter and Wattenberg, Martin},
  title   = {Emergent world representations: Exploring a sequence model trained on a synthetic task},
  journal = {arXiv preprint arXiv:2210.13382},
  year    = {2022}
}
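The probing sketch referenced above follows. It is only schematic: the activations and the binary label are synthetic stand-ins, not data from the paper. The idea it illustrates is general, namely asking whether a human-understandable property can be read off the high-dimensional hidden vectors with a simple classifier.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Hypothetical setup: suppose we saved the hidden activations an LLM produced
# while reading many inputs, along with a human-understandable label for each
# input (some fact about the "world" the text describes). A probe asks whether
# that label is recoverable from the otherwise incomprehensible activation vector.
rng = np.random.default_rng(0)
n_examples, hidden_dim = 2000, 512

# Stand-in data: synthetic activations with one latent direction that encodes
# the label, mimicking a representation buried inside the activation space.
latent_direction = rng.normal(size=hidden_dim)
activations = rng.normal(size=(n_examples, hidden_dim))
labels = (activations @ latent_direction > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    activations, labels, test_size=0.25, random_state=0
)

probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(f"probe accuracy: {probe.score(X_test, y_test):.2f}")
# High held-out accuracy suggests the hidden space carries a human-understandable
# representation of that property; chance-level accuracy suggests it does not.
```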