o3 "ARC AGI" Postmortem


Kevin Roose, of Hard Fork and the NYT, was so impressed with OpenAI’s rollout that he joked, “of course they have to announce AGI the day my vacation starts”.

The same was true of the OpenAI graph: the MIT work (halfway between o1 and o3) and many other results weren’t shown, making the breakthrough relative to the field seem far bigger than it really was.

The problem wasn’t the task per se (a fine addition to our benchmark collection), or even how it was administered (legit relative to the test’s rules); it was the impression that OpenAI conveyed, which left many (not all) people believing that more had been shown than actually was.
