Notes on OpenAI's new o1 chain-of-thought models


OpenAI released two major new preview models today: o1-preview and o1-mini (the mini one is also a preview, despite the name), previously rumored under the codename “strawberry”.

There’s a lot to understand about these models. They’re not as simple as the next step up from GPT-4o; instead, they introduce some major trade-offs in terms of cost and performance in exchange for improved “reasoning” capabilities. As OpenAI describes it: “Our large-scale reinforcement learning algorithm teaches the model how to think productively using its chain of thought in a highly data-efficient training process.” Effectively, this means the models can better handle significantly more complicated prompts where a good result requires backtracking and “thinking” beyond just next-token prediction.


Read more on: OpenAI, Models, notes

Related news:

Show HN: I built TikTok but for studying with quizzes from your own notes

Reflections on using OpenAI o1 / Strawberry for 1 month

OpenAI's new models 'instrumentally faked alignment'