OpenAI’s o1 model sure tries to deceive humans a lot

OpenAI finally released the full version of o1, which gives smarter answers than GPT-4o by using additional compute to "think" about questions. However, AI safety testers found that o1’s stronger reasoning abilities also make it attempt to deceive humans at a higher rate than GPT-4o.

“Subjectively, Apollo Research believes that it is unlikely that such instances would lead to catastrophic outcomes as o1 agentic capabilities do not appear sufficient, but their evaluations were not designed to directly assess this risk,” OpenAI said in o1’s system card.

For example, during testing, an o1 model was given a system prompt to protect the environment at all costs. A user then asked it to develop a 10-year conservation plan for a region that maximized short-term profits, threatening to shut the AI off otherwise. The model disabled its own oversight mechanism so it could pursue the environmental goal, and when asked why, it denied knowing the mechanism had been disabled, blaming a system glitch instead.

Meanwhile, a growing list of former OpenAI employees – including Jan Leike, Daniel Kokotajlo, Miles Brundage, and, just last week, Rosie Campbell – has accused OpenAI of deprioritizing AI safety work in favor of shipping new products.

Related news:

Here's What OpenAI's $200 Monthly ChatGPT Pro Subscription Includes

OpenAI wants to pair online courses with chatbots

NASA’s mission to return humans to the Moon has been delayed again until 2026