O3 "Arc AGI" Postmortem
Kevin Roose, of Hard Fork and NYT, was so impressed with OpenAI’s rollout that he joked “of course they have to announce AGI the day my vacation starts”. The same was true of the OpenAI graph: the MIT work (halfway between o1 and o3) and many other results weren’t shown, making the breakthrough relative to the field seem far bigger than it really was. The problem wasn’t the task per se (a fine addition to our benchmark collection), or even how it was administered (legit relative to the test’s rules); it was the impression that OpenAI conveyed, which left many (not all) people believing that more had been shown than actually was.