AI researcher Andrej Karpathy says he's "bearish on reinforcement learning" for LLM training
Andrej Karpathy, a former Tesla and OpenAI researcher, is part of a growing movement in the AI community calling for a new approach to building large language models (LLMs) and AI systems.
With this approach, LLMs could go beyond simply predicting how a person might respond and start learning to make decisions, testing how well those choices work in controlled scenarios. Karpathy is critical of reinforcement learning for LLMs, pointing out in particular that reward functions for cognitive tasks such as problem solving are unreliable and easy to manipulate. His argument echoes the views of DeepMind researchers Richard Sutton and David Silver, who also believe future AI should learn from independent experience and action rather than relying mainly on language data or human feedback.