Salesforce builds ‘flight simulator’ for AI agents as 95% of enterprise pilots fail to reach production


Salesforce launches CRMArena-Pro, a simulated enterprise AI testing platform, to address the 95% failure rate of AI pilots and improve agent reliability, performance, and security in real-world business deployments.

“Pilots don’t learn to fly in a storm; they train in flight simulators that push them to prepare in the most extreme challenges,” said Silvio Savarese, Salesforce’s chief scientist and head of AI research, during a press conference. A recent MIT report found that 95% of generative AI pilots at companies fail to reach production, while Salesforce’s own studies show that large language models alone achieve only a 35% success rate in complex business scenarios. Unlike existing benchmarks that test generic capabilities, CRMArena-Pro evaluates agents on real enterprise tasks such as customer service escalations, sales forecasting, and supply chain disruptions, using synthetic but realistic business data.

Read the full story on VentureBeat.
