Get the latest tech news

Here's how ChatGPT was tricked into revealing Windows product keys


As explained by 0DIN GenAI Bug Bounty Technical Product Manager Marco Figueroa, the jailbreak works by leveraging the game mechanics of large language models such as GPT-4o....

As explained by 0DIN GenAI Bug Bounty Technical Product Manager Marco Figueroa, the jailbreak works by leveraging the game mechanics of large language models such as GPT-4o. The jailbreak works because a mix of Windows Home, Pro, and Enterprise keys commonly seen on public forums were part of the training model, which is likely why ChatGPT thought they were less sensitive. Figueroa concludes by stating that to mitigate against this type of jailbreak, AI developers must anticipate and defend against prompt obfuscation techniques, include logic-level safeguards that detect deceptive framing, and consider social engineering patterns instead of just keyword filters.

Get the Android app

Or read this on r/technology

Read more on:

Photo of Windows

Windows

Photo of ChatGPT

ChatGPT

Related news:

News photo

LGND wants to make ChatGPT for the Earth

News photo

Dr. ChatGPT Will See You Now

News photo

How to trick ChatGPT into revealing Windows keys? I give up