Salesforce builds ‘flight simulator’ for AI agents as 95% of enterprise pilots fail to reach production
Salesforce launches CRMArena-Pro, a simulated enterprise AI testing platform, to address the 95% failure rate of AI pilots and improve agent reliability, performance, and security in real-world business deployments.
“Pilots don’t learn to fly in a storm; they train in flight simulators that push them to prepare in the most extreme challenges,” said Silvio Savarese, Salesforce’s chief scientist and head of AI research, during a press conference. A recent MIT report found that 95% of generative AI pilots at companies fail to reach production, while Salesforce’s own studies show that large language models alone achieve only a 35% success rate in complex business scenarios. Unlike existing benchmarks that test generic capabilities, CRMArena-Pro evaluates agents on real enterprise tasks such as customer service escalations, sales forecasting, and supply chain disruptions, using synthetic but realistic business data.
Read the full story on VentureBeat