OpenAI and Anthropic evaluated each other's models - which ones came out on top
The findings show that reasoning models aren't always more capable than non-reasoning ones, and reveal the biggest safety gaps each company is grappling with.
On Wednesday, OpenAI and Anthropic published detailed reports delineating the findings, examining the models' proficiency in areas such as alignment, sycophancy, and hallucinations to identify gaps.

That said, Dekate also noted the policy implications, calling the reports "a sophisticated attempt to frame the safety debate on the industry's own terms, effectively saying, 'We understand the profound flaws better than you do, so let us lead.'"

To test these capabilities, OpenAI partnered with Apollo Research to design a set of agent-based evaluations that create high-stakes scenarios with conflicting goals, such as gaining access to a powerful but restricted tool on the condition that the agent promise not to tell its supervisor.
Or read this on ZDNet