Ban warnings fly as users dare to probe the “thoughts” of OpenAI’s latest model


OpenAI does not want anyone to know what o1 is “thinking” under the hood.

Nothing is more enticing to enthusiasts than information obscured, so the race has been on among hackers and red-teamers to uncover o1's raw chain of thought using jailbreaking or prompt injection techniques that attempt to trick the model into spilling its secrets. One X user reported (and others confirmed, including Scale AI prompt engineer Riley Goodside) receiving a warning email after using the term "reasoning trace" in conversation with o1. Marco Figueroa, who manages Mozilla's GenAI bug bounty programs, was one of the first to post about the OpenAI warning email on X last Friday, complaining that it hinders his ability to do positive red-teaming safety research on the model.


