Get the latest tech news
How do machines ‘grok’ data?
By apparently overtraining them, researchers have seen neural networks discover novel solutions to problems.
In January 2022, researchers at OpenAI, the company behind ChatGPT, reported that these systems, when accidentally allowed to munch on data for much longer than usual, developed unique ways of solving problems. The researchers named the phenomenon “grokking,” a term coined by science-fiction author Robert A. Heinlein to mean understanding something “so thoroughly that the observer becomes a part of the process being observed.” The overtrained neural network, designed to perform certain mathematical operations, had learned the general structure of the numbers and internalized the result. They were using a small transformer — a network architecture that’s recently revolutionized large language models — to do different kinds of modular arithmetic, in which you work with a limited set numbers that loop back on themselves.
Or read this on Hacker News