Salesforce Unveils the World’s First LLM Benchmark for CRM
On June 18, 2024, Salesforce announced the world’s first large language model (LLM) benchmark for CRM. The benchmark is meant to help businesses make informed decisions when evaluating generative AI models for use within their CRM systems.
What Is New in This Benchmark?
The benchmark introduces several first-of-their-kind measures that help businesses make informed decisions and optimize their CRM strategies. Explore the updates:
1. New Evaluation Framework
The new framework evaluates the performance of LLMs on four major criteria:
- Accuracy: This metric measures how correct and useful a model's predictions or recommendations are. The more accurate the output, the more valuable it is to business teams that act on it to improve their customer experience.
How the accuracy result is used depends on the outcome:
• If the model is accurate, the remaining metrics (cost, speed, and trust and safety) are considered next.
• If the model is inaccurate, techniques such as fine-tuning and prompt engineering should be applied to improve it.
Accuracy has four subcategories:
• Factuality
• Completeness
• Conciseness
• Instruction following
- Cost: This metric captures the operational cost of running an LLM, which varies with the CRM use case. It is expressed in percentiles and grouped into high, medium, and low tiers, so customers can judge whether a model fits within their estimated budget.
- Speed: This metric measures an LLM’s processing efficiency and responsiveness. Faster responses minimize wait times and let sales and service teams address inquiries and issues promptly, which improves the overall user experience.
- Trust and Safety: This metric evaluates an LLM’s capability to protect customer data and business information. It also assesses whether the LLM adheres to privacy regulations and avoids bias and toxicity in CRM use cases.
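To make the cost tiers concrete, here is a minimal Python sketch of how a percentile-based grouping into low, medium, and high could work. The announcement does not state the cutoffs Salesforce uses, so the 33 and 66 thresholds, the function names, and the sample costs below are all assumptions for illustration only.

```python
from bisect import bisect_left


def percentile_rank(value, population):
    """Return the percentage of values in population strictly below value."""
    ordered = sorted(population)
    return 100.0 * bisect_left(ordered, value) / len(ordered)


def cost_tier(value, population, low_cutoff=33.0, high_cutoff=66.0):
    """Map a cost to "low", "medium", or "high" by its percentile rank.

    The cutoffs are assumed values, not the ones used by the benchmark.
    """
    rank = percentile_rank(value, population)
    if rank < low_cutoff:
        return "low"
    if rank < high_cutoff:
        return "medium"
    return "high"


# Hypothetical per-request costs (arbitrary units) for six made-up models.
costs = {
    "model_a": 0.2,
    "model_b": 0.5,
    "model_c": 1.0,
    "model_d": 2.0,
    "model_e": 4.0,
    "model_f": 8.0,
}

all_costs = list(costs.values())
tiers = {name: cost_tier(cost, all_costs) for name, cost in costs.items()}

for name, tier in tiers.items():
    print(f"{name}: {tier}")
```

Ranking against the whole population, rather than using fixed price thresholds, keeps the tiers meaningful as new models are added and prices shift.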
2. Evaluation of Common Sales and Service Use Cases
The benchmark is designed specifically to evaluate LLMs on common CRM use cases in the sales and service domains. These use cases include:
- Prospecting
- Lead nurturing
- Sales opportunity
- Service case summaries
3. Public Leaderboard
The benchmark comes with a public leaderboard that helps you identify which LLM best suits your CRM needs and business requirements.
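As a rough illustration of how a team might use leaderboard results, the Python sketch below filters models by cost, speed, and trust constraints and then ranks the remainder by accuracy. The row fields, score scales, model names, and numbers are hypothetical and do not reflect the actual leaderboard's schema or results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LeaderboardRow:
    model: str
    accuracy: float      # higher is better (hypothetical scale)
    cost_tier: str       # "low", "medium", or "high"
    latency_s: float     # seconds per response, lower is better
    trust_safety: float  # higher is better (hypothetical scale)


TIER_ORDER = {"low": 0, "medium": 1, "high": 2}


def shortlist(rows, max_cost_tier, max_latency_s, min_trust_safety):
    """Keep rows that meet every constraint, most accurate first."""
    allowed = [
        row
        for row in rows
        if TIER_ORDER[row.cost_tier] <= TIER_ORDER[max_cost_tier]
        and row.latency_s <= max_latency_s
        and row.trust_safety >= min_trust_safety
    ]
    return sorted(allowed, key=lambda row: row.accuracy, reverse=True)


# Made-up rows for illustration only.
rows = [
    LeaderboardRow("model_a", 3.6, "high", 2.5, 90.0),
    LeaderboardRow("model_b", 3.4, "medium", 1.8, 88.0),
    LeaderboardRow("model_c", 3.1, "low", 0.9, 80.0),
    LeaderboardRow("model_d", 3.5, "medium", 4.0, 92.0),
]

picks = shortlist(
    rows, max_cost_tier="medium", max_latency_s=3.0, min_trust_safety=75.0
)

for row in picks:
    print(row.model, row.accuracy)
```

Treating cost, speed, and trust as hard constraints and accuracy as the ranking key is only one possible policy; a team could equally weight all four criteria into a single score.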
Silvio Savarese, EVP & Chief Scientist at Salesforce AI Research, said, “As AI continues to evolve, enterprise leaders are saying it’s important to find the right mix of performance, accuracy, responsibility, and cost to unlock the full potential of generative AI to drive business growth.”
He further added, “Salesforce’s new LLM benchmark for CRM is a significant step forward in the way businesses assess their AI strategy within the industry. It not only provides clarity on next-generation AI deployment but also can accelerate time to value for CRM-specific use cases. Our commitment is to continuously evolve this benchmark to keep pace with technological advancements, ensuring it remains relevant and valuable.”
Why Is This Benchmark So Important?
Salesforce identified the shortcomings of earlier benchmarks and addressed them in this one, making it a better-suited measure for evaluating LLMs in a business setting.
Issues with existing benchmarks:
- Limited to academic and consumer use cases only
- Minimal business relevance
- Lack of human evaluation capability
- Failure to address accuracy, speed, cost, and trust constraints
The new benchmark addresses these issues. It relies on evaluation by human experts and uses real-world CRM data, which enables businesses to make more strategic decisions about incorporating generative AI into their systems.
Clara Shih, CEO of Salesforce AI, emphasized the role of AI in business: “Business organizations are looking to utilize AI to drive growth, cut costs, and deliver personalized customer experiences, not to plan a kid’s birthday party or summarize Othello.”
She further elaborated, stating, “Our customers have been asking for a purpose-built way to evaluate and select from among the proliferation of new AI models, and we are thrilled to introduce the world’s first LLM benchmark for CRM to help them navigate the complex landscape of models. This benchmark is not just a measure; it’s a comprehensive, dynamically evolving framework that empowers companies to make informed decisions, balancing accuracy, cost, speed, and trust.”
How Does It Benefit Organizations?
The benchmark enables businesses to compare LLMs side by side and identify the model that best fits their needs. With this, they can make informed decisions that contribute to a better customer experience.
The Salesforce Einstein 1 Platform helps customers select the best LLM or design their own models to meet their unique business requirements.
Ultimately, by using this benchmark, businesses can deploy more effective and efficient AI solutions.