Notes on OpenAI's new o1 chain-of-thought models
OpenAI released two major new preview models today: o1-preview and o1-mini (the mini one is also a preview, despite the name), previously rumored under the codename “strawberry”.
There’s a lot to understand about these models. They’re not as simple as the next step up from GPT-4o; instead they introduce some major trade-offs in cost and performance in exchange for improved “reasoning” capabilities. In OpenAI’s words: “Our large-scale reinforcement learning algorithm teaches the model how to think productively using its chain of thought in a highly data-efficient training process.” Effectively, this means the models can better handle significantly more complicated prompts where a good result requires backtracking and “thinking” beyond just next token prediction.