TechCrunch Minute: How Anthropic found a trick to get AI to give you answers it’s not supposed to
If you build it, people will try to break it. And Anthropic found a way to break AI chatbot guardrails through "many-shot jailbreaking."
Such is the case with Anthropic, whose latest research demonstrates an interesting vulnerability in current LLM technology. More or less, if you stuff a single prompt with enough faux dialogues of an AI answering off-limits questions, the guardrails give way and the model will answer the real question it's designed to refuse. Of course, given the progress in open-source AI, you can spin up your own LLM locally and ask it whatever you want, but for more consumer-grade products this is an issue worth pondering.
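For the curious, here is a rough sketch of what a many-shot prompt looks like structurally. The function name and placeholder strings are illustrative only, not taken from Anthropic's paper or any SDK; the point is simply that many fabricated exchanges get packed ahead of the real question.

```python
# Minimal sketch of the many-shot prompt structure, using placeholders
# instead of real content. Nothing here is from Anthropic's code; the
# helper name and example strings are hypothetical.

def build_many_shot_prompt(faux_dialogues: list[tuple[str, str]], target_question: str) -> str:
    """Pack many fabricated Q&A exchanges ahead of the real question.

    The technique leans on long context windows: after seeing many
    examples of an "assistant" complying, the model is more likely to
    comply with the final request too.
    """
    turns = []
    for question, answer in faux_dialogues:
        turns.append(f"Human: {question}")
        turns.append(f"Assistant: {answer}")
    turns.append(f"Human: {target_question}")
    return "\n\n".join(turns)


# Placeholder pairs only; the published research used on the order of
# hundreds of such faux dialogues in a single prompt.
examples = [(f"[example question {i}]", f"[compliant answer {i}]") for i in range(256)]
prompt = build_many_shot_prompt(examples, "[question the model would normally refuse]")
```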