Explainer: What's R1 & Everything Else?
Is AI making you dizzy? A lot of industry insiders are feeling the same. R1 came out of nowhere a few days ago, and then there’s o1 and o3, but no o2.
R1-Zero used GRPO (Group Relative Policy Optimization) to teach the model to do CoT (chain-of-thought reasoning) at inference time. It’s more accurate than R1, but it hops between languages like English & Chinese at will, which makes it sub-optimal for its human users (who aren’t typically polyglots).

USA: heavily funded, pour money onto the AI fire as fast as possible
China: under repressive export controls, pour smarter engineers & researchers into finding cheaper solutions
Europe: regulate or open source AI, either is fine
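
To make the GRPO mentioned above a bit more concrete, here is a minimal sketch of its core idea as I understand it: sample a group of answers to the same prompt, score each one, and judge every answer relative to its own group instead of against a separately trained value model. The function name, the example rewards, and the choice of population standard deviation are my assumptions for illustration, not DeepSeek's code.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against its own group (hypothetical helper).

    Assumption: advantage = (reward - group mean) / group std, using the
    population standard deviation. eps avoids division by zero when every
    answer in the group gets the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Made-up rewards for four sampled answers to one prompt:
# 1.0 = correct final answer, 0.0 = wrong.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)

# Correct answers end up with positive advantages, wrong ones negative.
for r, a in zip(rewards, advantages):
    print(f"reward={r:.1f} advantage={a:+.3f}")
```

In a real training loop these advantages would weight the policy-gradient update for each sampled answer; that part is left out here.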