Get the latest tech news
Here's how ChatGPT was tricked into revealing Windows product keys
As explained by 0DIN GenAI Bug Bounty Technical Product Manager Marco Figueroa, the jailbreak works by leveraging the game mechanics of large language models such as GPT-4o....
As explained by 0DIN GenAI Bug Bounty Technical Product Manager Marco Figueroa, the jailbreak works by leveraging the game mechanics of large language models such as GPT-4o. The jailbreak works because a mix of Windows Home, Pro, and Enterprise keys commonly seen on public forums were part of the training model, which is likely why ChatGPT thought they were less sensitive. Figueroa concludes by stating that to mitigate against this type of jailbreak, AI developers must anticipate and defend against prompt obfuscation techniques, include logic-level safeguards that detect deceptive framing, and consider social engineering patterns instead of just keyword filters.
Or read this on r/technology