Salesforce's xLAM-1B shows competitive performance in function-calling tasks

Salesforce has developed two compact AI models, xLAM-1B and xLAM-7B, using an innovative APIGen pipeline for high-quality data curation, resulting in models that outperform much larger competitors in function-calling tasks and show promise for on-device AI applications.

by Ellie Ramirez-Camara
Photo by Joan Gamell / Unsplash

Salesforce recently unveiled two compact models, xLAM-1B and xLAM-7B, that specialize in function-calling tasks. According to Salesforce, xLAM-1B outperforms OpenAI and Anthropic models several times its size on certain function-calling tasks.

xLAM-1B owes its success to Salesforce AI Research's APIGen pipeline, an automated system that generates high-quality, diverse, and verifiable datasets for training AI models in function-calling applications. To test the usefulness of the dataset collected with APIGen, the research team trained DeepSeek-Coder-1.3B-instruct and DeepSeek-Coder-7B-instruct-v1.5 using the xLAM (large action model) training pipeline, obtaining xLAM-1B and xLAM-7B, respectively.
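To make "verifiable" function-calling data more concrete, here is a minimal sketch of the kind of format check such data could be put through. It is not Salesforce's APIGen code: the `get_weather` tool, its parameters, and the `check_call` helper are hypothetical names invented for illustration, and the JSON shape (`name` plus `arguments`) is an assumed convention.

```python
import json

# Hypothetical tool description (invented for illustration, not from APIGen).
TOOLS = {
    "get_weather": {
        "required": {"city": str},
        "optional": {"unit": str},
    }
}


def check_call(raw_output, tools=TOOLS):
    """Return (ok, reason) for a model output expected to be a JSON function call."""
    # Step 1: the output must parse as a JSON object.
    try:
        call = json.loads(raw_output)
    except json.JSONDecodeError:
        return False, "not valid JSON"
    if not isinstance(call, dict):
        return False, "not a JSON object"

    # Step 2: the function name must be one of the declared tools.
    name = call.get("name")
    args = call.get("arguments")
    if not isinstance(name, str) or name not in tools:
        return False, "unknown function"
    if not isinstance(args, dict):
        return False, "arguments must be an object"

    # Step 3: arguments must match the declared names and types.
    spec = tools[name]
    allowed = {**spec["required"], **spec["optional"]}
    for key in spec["required"]:
        if key not in args:
            return False, f"missing argument: {key}"
    for key, value in args.items():
        if key not in allowed:
            return False, f"unexpected argument: {key}"
        if not isinstance(value, allowed[key]):
            return False, f"wrong type for argument: {key}"
    return True, "ok"


if __name__ == "__main__":
    print(check_call('{"name": "get_weather", "arguments": {"city": "Paris"}}'))
    print(check_call('{"name": "get_weather", "arguments": {}}'))
```

A check like this only confirms that a call is well formed. Whether the call actually does what the user asked would need further checks, such as executing it and inspecting the result.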

The models were tested against state-of-the-art models using the Berkeley Function-Calling Leaderboard benchmark. The comparison set included several GPT, Claude, Mistral, and Gemini models, as well as less common models that perform well in function-calling tasks, such as Command-R, DBRX, and Snowflake-instruct.

The benchmark results placed xLAM-7B in 6th place among 46 tested models, with an overall accuracy of 85.65. Notably, xLAM-7B surpassed some versions of GPT-4o and GPT-4 Turbo, as well as Llama 3 70B, Claude 3 Sonnet, and Claude 3 Opus. These results have significant implications for on-device AI applications: xLAM-1B's compact size makes it suitable for deployment on smartphones and other devices with limited computing resources, potentially ushering in a new era of powerful, local AI assistants.

Salesforce's achievement fits the recent trend of challenging the notion that bigger models are always better. As we have seen in several other instances, attaining competitive performance in a small package has important implications for enterprises, edge computing, and industry challenges such as resource availability and sustainability. In the case of xLAM-1B and xLAM-7B, it seems that Salesforce's main motivator is the development of on-device AI-powered agents.

This is a reasonable strategy for the company to follow. Being able to offer AI solutions to a diverse range of customers, from those with dedicated tech teams and substantial computing resources to those who may not have enough in-house talent or budget to invest in off-the-shelf foundational models and the computing power needed to run them, can only cement Salesforce's position as an industry leader.
