Salesforce's xLAM-1B shows competitive performance in function-calling tasks
Salesforce has developed two compact AI models, xLAM-1B and xLAM-7B, using an innovative APIGen pipeline for high-quality data curation, resulting in models that outperform much larger competitors in function-calling tasks and show promise for on-device AI applications.
Salesforce recently unveiled two tiny models, xLAM-1B and xLAM-7B, specializing in function-calling tasks. According to Salesforce, xLAM-1B packs enough punch to outperform OpenAI and Anthropic models that are several times larger at certain function-calling tasks.
The xLAM-1B model owes its success to Salesforce AI Research's innovative APIGen pipeline, an automated system that generates high-quality, diverse, and verifiable datasets for training AI models on function-calling applications. To test the utility and effectiveness of the dataset collected using the APIGen pipeline, the research team trained DeepSeek-Coder-1.3B-instruct and DeepSeek-Coder-7B-instruct-v1.5 using the xLAM (large action model) training pipeline to obtain xLAM-1B and xLAM-7B.
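To make the idea of a "verifiable" function-calling dataset concrete, here is a minimal sketch of what such a training sample and a structural check might look like. The field names, tool definition, and `is_verifiable` helper are illustrative assumptions, not Salesforce's actual APIGen format:

```python
# Hypothetical function-calling training sample: a user query, an
# available tool definition, and the expected call the model should emit.
# (Schema and names are illustrative, not APIGen's real format.)
sample = {
    "query": "What's the weather in Paris tomorrow?",
    "tools": [
        {
            "name": "get_weather",
            "parameters": {
                "location": {"type": "string", "required": True},
                "date": {"type": "string", "required": False},
            },
        }
    ],
    "expected_call": {
        "name": "get_weather",
        "arguments": {"location": "Paris", "date": "tomorrow"},
    },
}


def is_verifiable(record: dict) -> bool:
    """Basic structural check: the expected call must name a listed tool
    and supply all of that tool's required parameters."""
    tools = {t["name"]: t for t in record["tools"]}
    call = record["expected_call"]
    if call["name"] not in tools:
        return False
    params = tools[call["name"]]["parameters"]
    required = {p for p, spec in params.items() if spec.get("required")}
    return required <= set(call["arguments"])


print(is_verifiable(sample))  # True for this sample
```

Checks along these lines (is the call well-formed, does it reference a real tool, are required arguments present) are one way a pipeline can filter generated data down to samples whose correctness can be mechanically confirmed.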
The models were tested against state-of-the-art models using the Berkeley Function-Calling Leaderboard benchmark. The selection of tested models includes several GPT, Claude, Mistral, and Gemini models, as well as less common models that perform well at function-calling tasks, such as Command-R, DBRX, and Snowflake-instruct.
The benchmark results placed xLAM-7B in 6th place among 46 tested models, with an overall accuracy of 85.65%. Notably, xLAM-7B surpassed some versions of GPT-4o and GPT-4 Turbo, as well as Llama 3 70B, Claude 3 Sonnet, and Claude 3 Opus. This breakthrough has significant implications for on-device AI applications: xLAM-1B's compact size makes it suitable for deployment on smartphones and other devices with limited computing resources, potentially ushering in a new era of powerful, local AI assistants.
Salesforce's achievement fits the recent trend of challenging the notion that bigger models are always better. As we have seen in several other instances, attaining competitive performance in a small package has important implications for enterprises, edge computing, and industry challenges such as resource availability and sustainability. In the case of xLAM-1B and xLAM-7B, it seems that Salesforce's main motivator is the development of on-device AI-powered agents.
This is a sensible strategy for the company to follow. Salesforce serves a diverse range of customers, from those with dedicated tech teams and substantial computing resources to those who lack the in-house talent or budget to invest in off-the-shelf foundation models and the computing power needed to run them. Being able to offer AI solutions across that entire range can only strengthen Salesforce's position as an industry leader.