Notes on OpenAI o3-mini

OpenAI’s o3-mini is out today. As with other o-series models it’s a slightly difficult one to evaluate—we now need to decide if a prompt is best run using GPT-4o, o1, …

Confusing matters further, the benchmarks in the o3-mini system card (PDF) aren’t a universal win for o3-mini across all categories. We expect o3-mini to be a useful and safe model for doing this, especially given its performance on the jailbreak and instruction hierarchy evals detailed in Section 4 below. I released LLM 0.21 with support for the new model, plus its -o reasoning_effort high (or medium or low) option for tweaking the reasoning effort; details are in this issue.
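
As a rough illustration of the reasoning_effort option mentioned above, here is a small Python sketch that assembles an llm command line for each effort level. The helper name build_llm_command and the model ID o3-mini are assumptions made for this example, not something the post spells out; the -m and -o flags are the LLM CLI's usual model and option flags. The script only prints the command rather than running it, since running it would need an API key.

```python
import shlex

# The three effort levels named in the post.
VALID_EFFORTS = ("low", "medium", "high")


def build_llm_command(prompt, effort="medium", model="o3-mini"):
    """Return the argv list for an `llm` call with a reasoning_effort option.

    `model` defaults to "o3-mini", which is an assumed model ID.
    """
    if effort not in VALID_EFFORTS:
        raise ValueError(f"effort must be one of {VALID_EFFORTS}, got {effort!r}")
    return ["llm", "-m", model, "-o", "reasoning_effort", effort, prompt]


if __name__ == "__main__":
    # Print a shell-quoted command for each level so they can be compared by hand.
    for level in VALID_EFFORTS:
        print(shlex.join(build_llm_command("Explain quicksort briefly", level)))
```

Running the same prompt at low and at high is one way to judge whether the extra reasoning is worth the added latency and cost for a particular task.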
