Salesforce reveals digital twin for business ops so your business can test AI agents before deployment

Curated from Latest from TechRadar US in News, opinion. Here’s what matters right now:

Many AI pilots fail in real-world operations, and 95% of GenAI pilots don’t reach production, Salesforce claims.

CRMArena-Pro lets enterprises stress-test their AI agents with digital twins.

Two new benchmarks are used for stress-testing AI agents.

Salesforce says enterprises are struggling with AI pilots that fail in real-world operations, and has launched CRMArena-Pro, a new service that lets businesses create a digital twin of their operations to stress-test AI agents before they are deployed.

The company cited recent MIT research that found 95% of generative AI pilots don’t even reach the production stage.

CRMArena-Pro evaluates AI agents on real tasks, like customer service, sales forecasting and supply chain disruptions, but uses synthetic data that has been validated by experts.

Salesforce lets you stress-test AI agents using digital twins

“CRMArena-Pro creates a rigorous, context-rich simulated enterprise environment framework with synthetic data, where it can safely evaluate API calls to relevant systems, as well as the ability to safeguard PII data,” the company wrote in an announcement.

By adding real-world noise into the test environment, CRMArena-Pro can better evaluate performance, strengthen resilience and bridge the gap between pre- and post-deployment.

“The result is AI agents that are capable, consistent, trustworthy, and agentic enterprise-ready.”

Companies can also see how AI agents handle real-world challenges like messy data, legacy systems and complex workflows.

Salesforce noted that part of the complexity comes from the vast array of models available to choose from today, and that knowing which specific model or combination of models to use isn’t so simple.

To that end, the company has published two new benchmarks to measure agent performance: MCP-Eval, for evaluation through synthetic tasks, and MCP-Universe, which adds real-world tasks and execution-based evaluators to stress-test agents in complex scenarios.
In a previous post, Salesforce noted that CRMArena-Pro “lays the groundwork for the next frontier: Enterprise General Intelligence”, and for now, users can expect “safe, capable and impactful” AI for all organizations.

Original reporting: Latest from TechRadar US in News,opinion
