Ban warnings fly as users dare to probe the “thoughts” of OpenAI’s latest model | OpenAI does not want anyone to know what o1 is “thinking” under the hood
Nothing is more enticing to enthusiasts than information obscured, so the race has been on among hackers and red-teamers to try to uncover o1's raw chain of thought using jailbreaking or prompt injection techniques that attempt to trick the model into spilling its secrets. One X user reported (confirmed by others, including Scale AI prompt engineer Riley Goodside) that they received a warning email if they used the term "reasoning trace" in conversation with o1. Marco Figueroa, who manages Mozilla's GenAI bug bounty programs, was one of the first to post about the OpenAI warning email on X last Friday, complaining that it hinders his ability to do positive red-teaming safety research on the model.