How do machines ‘grok’ data?

By apparently overtraining them, researchers have seen neural networks discover novel solutions to problems.

In January 2022, researchers at OpenAI, the company behind ChatGPT, reported that neural networks, when accidentally allowed to munch on data for much longer than usual, developed unique ways of solving problems. The researchers named the phenomenon “grokking,” a term coined by science-fiction author Robert A. Heinlein to mean understanding something “so thoroughly that the observer becomes a part of the process being observed.” The overtrained neural network, designed to perform certain mathematical operations, had learned the general structure of the numbers and internalized the result. The researchers were using a small transformer, a network architecture that’s recently revolutionized large language models, to do different kinds of modular arithmetic, in which you work with a limited set of numbers that loop back on themselves.
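
For readers unfamiliar with modular arithmetic, the short Python sketch below shows the "looping back" idea. It is only an illustration: the function names `add_mod` and `addition_table` and the choice of 12 as the modulus (as on a clock face) are assumptions made here, not details taken from the researchers' work.

```python
# Illustrative sketch of modular addition. The modulus 12 and the helper
# names are assumptions chosen for this example, not the researchers' setup.

def add_mod(a: int, b: int, modulus: int = 12) -> int:
    """Add two numbers, wrapping around once the sum reaches the modulus."""
    return (a + b) % modulus


def addition_table(modulus: int) -> list[list[int]]:
    """Build the full table of a + b (mod modulus) for every pair a, b."""
    return [[add_mod(a, b, modulus) for b in range(modulus)]
            for a in range(modulus)]


if __name__ == "__main__":
    # Five hours after 9 o'clock is 2 o'clock: 14 wraps around to 2.
    print(add_mod(9, 5))  # 2
    # Every entry in the table stays inside the limited set 0..4.
    for row in addition_table(5):
        print(row)
```

However large the inputs, every result stays inside the same limited set of numbers, which is the sense in which they loop back on themselves.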
