Reflections on Neuralese
[Epistemic status: I've looked at the relevant code enough to be moderately sure I understand what's going on. Predictions about the future, including about what facts will turn out to be relevant, are uncertain as always.]

With the recent breakthroughs taking advantage of extensive Chain of Thought (CoT) reasoning in LLMs, there have been many attempts to modify the technique to be even more powerful. One of the natural ideas for improving CoT is to have LLMs perform CoT reasoning in the same latent space that they use for reasoning within a single forward pass, rather than being constrained to the space of possible tokens.
[Figure: Comparison of CoT and COCONUT (aka Neuralese)]

It's unclear how much these results line up with expectations and theoretical limits, since it's hard to tell how lossy the removed computations are and how effective this type of training can be at exploiting the extra efficiency. An intuition pump for why this problem is especially beyond our current interpretability methods: because these Neuralese vectors are never converted into natural language, and are instead used directly as inputs for the next autoregressive step, they are essentially part of one extended forward pass that runs through the model multiple times.
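To make the contrast concrete, here is a minimal toy sketch of the two feedback loops. All names and weights are hypothetical stand-ins (this is not the COCONUT codebase): a standard CoT step squeezes the hidden state through an argmax over the vocabulary before re-embedding, while a Neuralese-style step feeds the continuous hidden state straight back in, so repeated steps behave like one extended forward pass.

```python
import numpy as np

rng = np.random.default_rng(0)
d, vocab = 8, 16                      # toy hidden size and vocab size
W_h = rng.normal(size=(d, d)) * 0.1   # stand-in for the transformer's weights
W_out = rng.normal(size=(d, vocab))   # unembedding (hidden state -> logits)
E = rng.normal(size=(vocab, d))       # token embedding table

def forward(x):
    """One toy 'forward pass': final hidden state for input embedding x."""
    return np.tanh(W_h @ x)

def token_cot_step(x):
    """Standard CoT step: hidden state -> logits -> discrete token -> re-embed.
    All information must pass through a single vocabulary index."""
    h = forward(x)
    tok = int(np.argmax(W_out.T @ h))  # lossy bottleneck: argmax over the vocab
    return E[tok], tok

def neuralese_step(x):
    """Neuralese/COCONUT-style step: the hidden state itself becomes the next
    input embedding, skipping the token bottleneck entirely."""
    return forward(x)

# Four latent steps chain into one extended forward pass through the model.
latent = E[3]
for _ in range(4):
    latent = neuralese_step(latent)
```

The point of the sketch is the bottleneck: `token_cot_step` can only hand `log2(vocab)` bits to the next step, whereas `neuralese_step` passes the full `d`-dimensional vector, which is also why nothing along the latent path is ever rendered as readable text.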