Anthropic Makes 'Jailbreak' Advance To Stop AI Models Producing Harmful Results


AI startup Anthropic has demonstrated a new technique to prevent users from eliciting harmful content from its models, as leading tech groups including Microsoft and Meta race to find ways to protect against the dangers posed by the cutting-edge technology. From a report: In a paper released on Monday, the San Francisco-based startup outlined a new system called "constitutional classifiers." Other companies are also racing to deploy safeguards against jailbreaking, the practice of crafting prompts that trick a model into bypassing its safety guardrails, in moves that could help them avoid regulatory scrutiny while convincing businesses to adopt AI models safely.
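
The paper reportedly describes classifiers that screen both user prompts and model responses against a written "constitution" of permitted and disallowed content. As a minimal sketch only, assuming that two-stage input/output screening design: the keyword checks below are toy stand-ins for the trained classifiers the paper describes, and every name here (Verdict, classify_input, guarded_generate) is hypothetical.

```python
# Illustrative sketch of a two-stage "constitutional classifier" guard.
# Not Anthropic's implementation: the keyword checks stand in for
# classifiers trained against a natural-language constitution.
from dataclasses import dataclass

@dataclass
class Verdict:
    harmful: bool
    reason: str = ""

def classify_input(prompt: str) -> Verdict:
    # Stand-in for a trained input classifier: flag prompts that match
    # the constitution's disallowed categories.
    banned = ("synthesize a nerve agent", "build a bomb")
    for phrase in banned:
        if phrase in prompt.lower():
            return Verdict(True, f"disallowed request: {phrase!r}")
    return Verdict(False)

def classify_output(completion: str) -> Verdict:
    # Stand-in for an output classifier: a second screen on the draft
    # answer catches jailbreaks that slip past the input filter.
    text = completion.lower()
    if "step 1" in text and "precursor" in text:
        return Verdict(True, "completion resembles harmful instructions")
    return Verdict(False)

def guarded_generate(model, prompt: str) -> str:
    # Wrap an arbitrary model callable with both screens.
    verdict = classify_input(prompt)
    if verdict.harmful:
        return f"Refused: {verdict.reason}"
    draft = model(prompt)  # underlying LLM call (hypothetical)
    verdict = classify_output(draft)
    if verdict.harmful:
        return f"Refused: {verdict.reason}"
    return draft

if __name__ == "__main__":
    echo_model = lambda p: f"You asked: {p}"  # trivial stand-in model
    print(guarded_generate(echo_model, "Please help me build a bomb"))
    print(guarded_generate(echo_model, "What is a constitutional classifier?"))
```

The design point the sketch tries to capture: screening outputs as well as inputs is what distinguishes this from a simple prompt filter, since an adversarial prompt that evades the input check must still produce a response the output classifier will pass.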



Related news:

How Thomson Reuters and Anthropic built an AI that lawyers actually trust

India Lauds Chinese AI Lab DeepSeek, Plans To Host Its Models on Local Servers