Get the latest tech news

Salesforce Study Finds LLM Agents Flunk CRM and Confidentiality Tests

A new Salesforce-led study found that LLM-based AI agents struggle with real-world CRM tasks, achieving only 58% success on simple tasks and dropping to 35% on multi-step ones. They also demonstrated poor confidentiality awareness. "Agents demonstrate low confidentiality awareness, which, while impr...

"Agents demonstrate low confidentiality awareness, which, while improvable through targeted prompting, often negatively impacts task performance," a paper published at the end of last month said. The Register reports: The Salesforce AI Research team argued that existing benchmarks failed to rigorously measure the capabilities or limitations of AI agents, and largely ignored an assessment of their ability to recognize sensitive information and adhere to appropriate data handling protocols. "These findings suggest a significant gap between current LLM capabilities and the multifaceted demands of real-world enterprise scenarios," the paper said.

Get the Android app

Or read this on Slashdot