
Salesforce's xLAM-1B shows competitive performance in function-calling tasks

Salesforce has developed two compact AI models, xLAM-1B and xLAM-7B, using an innovative APIGen pipeline for high-quality data curation, resulting in models that outperform much larger competitors in function-calling tasks and show promise for on-device AI applications.

by Ellie Ramirez-Camara
Photo by Joan Gamell / Unsplash

Salesforce recently unveiled two tiny models (xLAM-1B and xLAM-7B) specializing in function-calling tasks. According to Salesforce, xLAM-1B packs enough punch to outperform OpenAI and Anthropic models that are several times larger at certain function-calling tasks.
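For readers unfamiliar with the task these models are benchmarked on, "function calling" means the model answers with a structured invocation of a tool rather than free-form text, and the application executes it. The sketch below illustrates the general loop; the tool schema, the JSON call format, and the hard-coded model output are illustrative assumptions, not Salesforce's exact interface.

```python
import json

# 1. The application advertises its available tools as JSON schemas.
#    (Hypothetical tool for illustration.)
tools = [{
    "name": "get_weather",
    "description": "Get the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}]

# 2. Given a user request plus the schemas, a function-calling model
#    emits a structured call. Here a hard-coded string stands in for
#    what a model like xLAM-1B would generate.
model_output = '{"name": "get_weather", "arguments": {"city": "Paris"}}'

# 3. The application parses the call and dispatches it to the
#    matching implementation.
def dispatch(raw: str, registry: dict) -> str:
    call = json.loads(raw)
    return registry[call["name"]](**call["arguments"])

registry = {"get_weather": lambda city: f"Sunny in {city}"}
print(dispatch(model_output, registry))  # → Sunny in Paris
```

Benchmarks such as the Berkeley Function-Calling Leaderboard score how reliably a model produces calls like step 2: correct tool name, correct argument types, and executable output.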

xLAM-1B owes its success to Salesforce AI Research's innovative APIGen pipeline, an automated system that generates high-quality, diverse, and verifiable datasets for training AI models in function-calling applications. To test the utility and effectiveness of the dataset collected with the APIGen pipeline, the research team trained DeepSeek-Coder-1.3B-instruct and DeepSeek-Coder-7B-instruct-v1.5 using the xLAM (large action model) training pipeline to obtain xLAM-1B and xLAM-7B.

The models were tested against state-of-the-art models using the Berkeley Function-Calling Leaderboard benchmark. The selection of tested models includes several GPT, Claude, Mistral, and Gemini models, as well as less common but performant models in function-calling tasks, such as Command-R, DBRX, and Snowflake-instruct.

The benchmark results placed xLAM-7B 6th among 46 tested models, with an overall accuracy of 85.65. Notably, xLAM-7B surpassed some versions of GPT-4o and GPT-4 Turbo, as well as Llama 3 70B, Claude 3 Sonnet, and Claude 3 Opus. This breakthrough has significant implications for on-device AI applications: xLAM-1B's compact size makes it suitable for deployment on smartphones and other devices with limited computing resources, potentially ushering in a new era of powerful, local AI assistants.

Salesforce's achievement fits the recent trend of challenging the notion that bigger models are always better. As we have seen in several other instances, attaining competitive performance in a small package has important implications for enterprises, edge computing, and industry challenges such as resource availability and sustainability. In the case of xLAM-1B and xLAM-7B, it seems that Salesforce's main motivator is the development of on-device AI-powered agents.

This is a reasonable strategy for the company to follow: being able to offer AI solutions to a diverse range of customers, from those with dedicated tech teams and substantial computing resources to those lacking the in-house talent or budget for off-the-shelf foundation models and the compute to run them, can only cement Salesforce's position as an industry leader.
