The Allen Institute for AI (Ai2) has released OLMo 3.1, an updated version of its fully open-source language model family with significantly improved reasoning and instruction-following capabilities. Coming less than a month after the November launch of OLMo 3, the release includes two new 32B-parameter checkpoints that are the organization's highest-performing models to date.

OLMo 3.1 Think 32B extends the original reinforcement learning run with 21 additional days of training on 224 GPUs, delivering substantial gains across key benchmarks: more than 5 points on AIME math problems, 4 points on ZebraLogic reasoning tasks, and more than 20 points on IFBench instruction-following evaluations.

Alongside Think, Ai2 released OLMo 3.1 Instruct 32B, a larger-scale instruction-tuned model optimized for chat, tool use, and multi-turn dialogue. According to Ai2's evaluations, it is the strongest fully open instruct model currently available at the 32B scale.

The organization also refreshed its RL-Zero checkpoints, releasing OLMo 3.1 RL Zero 7B Code and Math variants. These models benefit from longer, more stable training runs and provide stronger baselines for reinforcement learning researchers.

What distinguishes OLMo from competitors is its commitment to complete transparency. Unlike models that only release final weights, OLMo 3.1 provides the entire "model flow"—every checkpoint, dataset, and training decision. Researchers can trace model behaviors back to specific training data using the integrated OlmoTrace tool, enabling unprecedented insight into how AI systems develop their capabilities.

All models are available through the Ai2 Playground and Hugging Face, with API access coming soon.
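For readers who want to experiment with the weights directly, the sketch below shows one way to load a checkpoint and run a chat-style prompt with Hugging Face's transformers library. The repository id is an assumption for illustration; check Ai2's Hugging Face organization (huggingface.co/allenai) for the exact model names.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repository id -- verify the actual name on the Hugging Face Hub.
model_id = "allenai/OLMo-3.1-32B-Think"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # 32B weights need multiple GPUs or CPU offloading
    device_map="auto",
)

# Build a chat-style prompt using the tokenizer's built-in chat template.
messages = [{"role": "user", "content": "What is 17 * 24?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```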