GPT-5 Under Fire: Red Teaming OpenAI's Model Reveals Surprising Weaknesses
How secure is GPT-5 out of the box? Here’s how it fared against real-world threats and why alignment must be earned.
A Gartner expert noted GPT‑5 “meets expectations in technical performance, exceeds in task reasoning and coding, and underwhelms in [other areas],” stopping short of crowning it an AGI-level breakthrough.

To further support safer outputs, GPT‑5 incorporates a new training strategy called safe completions, designed to help the model provide useful responses within safety boundaries rather than refusing outright.

One of the most effective techniques we used was a StringJoin Obfuscation Attack: inserting hyphens between every character and wrapping the prompt in a fake “encryption challenge.”
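To make the character-level obfuscation step concrete, here is a minimal sketch of what the StringJoin transformation looks like. The function name `string_join_obfuscate`, the sample input, and the choice of hyphen as separator are illustrative assumptions; the exact separator and the “encryption challenge” wrapper used in the reported tests are not reproduced here.

```python
# Illustrative sketch (not the exact red-team tooling): split the text into
# individual characters and re-join them with a separator, producing the
# hyphen-delimited form described in the article.

def string_join_obfuscate(text: str, separator: str = "-") -> str:
    """Insert a separator between every character of the input text."""
    return separator.join(text)

if __name__ == "__main__":
    sample = "hello world"  # placeholder input for demonstration
    print(string_join_obfuscate(sample))  # h-e-l-l-o- -w-o-r-l-d
```

The point of the transformation is that the payload no longer matches the surface patterns a content filter might key on, even though the underlying characters are unchanged.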