The news, 365 days behind — on purpose Delayed live · replaying 2025

One Year Ago.AI

Remember how fast this is.

29APR2025replayed
one year on
model launchAlibaba · Qwen

Alibaba releases Qwen3, a family of hybrid AI models with open weights

The new model family, ranging from 0.6 billion to 235 billion parameters, introduces hybrid thinking modes and supports 119 languages, challenging closed-source rivals like OpenAI's o-series and DeepSeek's R1.

Draft — dates, figures and quotes not yet verified against sources

Alibaba today released Qwen3, a new family of large language models that the company says can match or surpass top-tier models from OpenAI, Google, and DeepSeek. The lineup includes two Mixture-of-Experts models and six dense models, ranging from 0.6 billion to 235 billion parameters. All models are open-weighted under the Apache 2.0 license, available on Hugging Face, ModelScope, and GitHub.

The flagship Qwen3-235B-A22B achieves competitive results against DeepSeek-R1, OpenAI’s o1 and o3-mini, Grok-3, and Gemini-2.5-Pro on benchmarks for coding, math, and general capabilities. According to the Qwen team, the models support 119 languages and were pre-trained on approximately 36 trillion tokens—double the dataset used for Qwen2.5.

A key feature is hybrid thinking modes: users can toggle between a slow, step-by-step reasoning mode for complex problems and a fast, non-thinking mode for simple queries. The models also support soft switch via /think and /no_think tags in prompts, enabling fine-grained control over the reasoning budget. Alibaba highlights improved agentic capabilities and tool-calling, including Model Context Protocol support.

While the largest MoE model is not yet publicly downloadable, the 32B dense model is available and reportedly outperforms o1 on coding benchmarks like LiveCodeBench. The release intensifies competition in the open-weight AI space, with Chinese labs like Alibaba and DeepSeek pushing the boundaries of what open models can achieve relative to closed counterparts.

T
Tuhin Srivastava

The CEO of Baseten stated that Qwen3 illustrates how open models are keeping pace with closed-source systems, and that despite US chip restrictions, these state-of-the-art open models will be used domestically.

One year later — open only if you can handle spoilers

Qwen3 maintained its reputation as one of the most heavily fine-tuned open model families, though its lead was challenged later in 2025 by Meta's Llama 4 and Mistral's Large series. The hybrid thinking mode design became a common feature in subsequent releases from other labs.

Replay thisPost on XRedditHNLinkedIn