TinyZero
GitHub repository: Jiayi-Pan/TinyZero
Through RL, the 3B base LM develops self-verification and search abilities all on its own. A single GPU works for models <= 1.5B. We run our experiments based on veRL.