Claude can now stop conversations - for its own protection, not yours

Anthropic has given its AI chatbot the ability to end toxic conversations as part of its broader 'model welfare' initiative.

Claude will only exit chats with users in extreme edge cases, after "multiple attempts at redirection have failed and hope of a productive interaction has been exhausted," Anthropic noted. The decision to let Claude hang up on abusive or dangerous conversations arose in part from Anthropic's assessment of what it calls the chatbot's "behavioral preferences," that is, the patterns in how it responds to user queries. That assessment revealed "a robust and consistent aversion to harm," Anthropic wrote in its blog post: the bot tended to nudge users away from unethical or dangerous requests, and in some cases even showed signs of "distress."

Read the full story on ZDNet

Read more on: protection, conversations, Claude

Related news:

Anthropic's Claude AI now has the ability to end 'distressing' conversations

Anthropic: Claude can now end conversations to prevent harmful uses

Anthropic says some Claude models can now end 'harmful or abusive' conversations