The Qwen team recently announced the release of the Qwen3 model family. The new generation introduces hybrid thinking modes and broad multilingual support across model sizes ranging from 0.6B to 235B parameters, aiming to stay flexible across different deployment environments.
According to the announcement, the flagship model, Qwen3-235B-A22B, scores competitively against top models such as DeepSeek-R1, o1, and Gemini-2.5-Pro on coding, math, and general-capability benchmarks. The smaller MoE model, Qwen3-30B-A3B, reportedly outperforms larger models while activating significantly fewer parameters.
Eight models are being released with open weights under an Apache 2.0 license as part of the Qwen3 launch:
- Two Mixture-of-Experts models: Qwen3-235B-A22B and Qwen3-30B-A3B
- Six dense models: Qwen3-32B, Qwen3-14B, Qwen3-8B, Qwen3-4B, Qwen3-1.7B, and Qwen3-0.6B
A standout feature is the hybrid thinking approach, which lets users toggle between a thinking mode, where the model works through complex problems step by step, and a faster non-thinking mode for cases where speed matters more than depth. Also notable are Qwen3's multilingual capabilities: the models support 119 languages and dialects across major language families, making them accessible to users worldwide. According to the Qwen team, this release represents their progress toward more advanced AI capabilities.
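The thinking toggle is exposed through the model's chat template. The sketch below, which assumes the Hugging Face `transformers` library and the smallest `Qwen/Qwen3-0.6B` checkpoint as an illustration, shows how the `enable_thinking` flag of `apply_chat_template` switches between the two modes:

```python
# Sketch: toggling Qwen3's thinking mode via the chat template.
# Assumes the Hugging Face `transformers` library; larger Qwen3
# checkpoints are used the same way.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen3-0.6B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype="auto", device_map="auto"
)

messages = [{"role": "user", "content": "How many r's are in 'strawberry'?"}]

# With enable_thinking=True (the default), the model emits its
# step-by-step reasoning inside <think>...</think> before the final
# answer; enable_thinking=False skips this for a faster direct reply.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,  # set to False for the non-thinking mode
)

inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)

# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(
    outputs[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True
))
```

The announcement also describes soft switches for multi-turn conversations: appending `/think` or `/no_think` to a prompt changes the mode for that turn without restarting the session.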
All models are available on platforms including Hugging Face, ModelScope, and Kaggle, with deployment support through frameworks like SGLang and vLLM. Tools such as Ollama, LM Studio, and llama.cpp are recommended for local usage. The Qwen3 models can also be tried through Qwen Chat Web (chat.qwen.ai) and the mobile app, giving users hands-on access to the new capabilities.
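Both vLLM and SGLang expose an OpenAI-compatible HTTP API when serving a model. As a minimal sketch, assuming a server is already running on `localhost:8000` with a Qwen3 checkpoint loaded (for example, one started with `vllm serve Qwen/Qwen3-8B`; exact launch flags depend on the framework version), it can be queried with the `openai` Python client:

```python
# Sketch: querying a locally served Qwen3 model through the
# OpenAI-compatible API that vLLM and SGLang both expose.
# Assumes the server is already running on localhost:8000.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="Qwen/Qwen3-8B",  # must match the checkpoint the server loaded
    messages=[
        {"role": "user", "content": "Summarize the Qwen3 release in one sentence."}
    ],
    max_tokens=256,
)
print(response.choices[0].message.content)
```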