Qwen with Questions (QwQ) is the latest LLM to leverage the test-time compute technique

Alibaba's Qwen team recently shared a preview of the latest OpenAI o1 rival. QwQ-32B-Preview leverages the test-time compute technique to deliver enhanced math and coding skills. However, the team notes that QwQ struggles with common sense reasoning and nuanced language understanding.

Alibaba's Qwen team recently unveiled QwQ-32B-Preview, an OpenAI o1 competitor released under an Apache 2.0 license and the first "reasoning" model to be available under a permissive license (DeepSeek reportedly plans to release its R1 model under a permissive license soon). According to the Qwen team, QwQ-32B-Preview outperforms OpenAI's o1-preview on the MATH and AIME benchmarks. Both draw their problems from mathematics competitions, and their solutions go beyond high school math, requiring more abstract problem-solving skills and heuristics.

However, like its rivals, QwQ struggles with tasks that require common sense and nuanced language understanding. According to the Qwen team, QwQ also tends to enter circular reasoning patterns that do not lead to a conclusive answer, and it may unexpectedly mix languages or code-switch in its responses. Finally, the team warns that QwQ requires stronger safety measures to perform reliably and securely, and asks users to be cautious when deploying the model.
