Qwen with Questions (QwQ) is the latest LLM to leverage the test-time compute technique
Alibaba's Qwen team recently shared a preview of the latest OpenAI o1 rival. QwQ-32B-Preview leverages the test-time compute technique to deliver enhanced math and coding skills. However, the team notes that QwQ struggles with common sense reasoning and nuanced language understanding.
Alibaba's Qwen team recently unveiled QwQ-32B-Preview, an OpenAI o1 competitor released under an Apache 2.0 license and the first "reasoning" model to be available under a permissive license (DeepSeek reportedly plans to release its R1 model under a permissive license soon). According to the Qwen team, QwQ-32B-Preview outperforms OpenAI's o1-preview on the MATH and AIME benchmarks. Both draw their problems from mathematics competitions, and solving them goes beyond high school math, requiring more abstract problem-solving skills and heuristics.
However, like its rivals, QwQ struggles with tasks that require common sense and nuanced language understanding. According to the Qwen team, QwQ also tends to enter circular reasoning patterns that do not lead to a conclusive answer, and it can unexpectedly mix languages and code-switch in its responses. Finally, the team warns that QwQ requires stronger safety measures to perform reliably and securely, and asks users to be cautious when deploying the model.