The news, 365 days behind, on purpose
Delayed live · replaying 2025

One Year Ago.AI

Remember how fast this is.

12 MAR 2025 · replayed one year on
open weights · Google · Google DeepMind

Google releases Gemma 3, claiming it's the most capable open model for a single GPU

The Gemma 3 family, ranging from 1B to 27B parameters, supports 140+ languages, a 128k-token context window, and multimodal input, positioning it as a direct competitor in the open-weights race.

Draft — dates, figures and quotes not yet verified against sources

Google today announced Gemma 3, a collection of open-weight models ranging from 1B to 27B parameters, built on the same research and technology as Gemini 2.0. The company claims the 27B variant outperforms Llama-3-405B, DeepSeek-V3, and o3-mini in preliminary human preference evaluations on the LMArena leaderboard, all while fitting on a single GPU or TPU. If the claim holds up, it reshapes expectations for what can be done on a single accelerator.

The 4B, 12B and 27B variants handle both text and visual reasoning, and the family offers a 128k-token context window, function calling, and structured output. The models have pretrained support for over 140 languages and come with official quantized versions. Google also released ShieldGemma 2, a 4B image safety classifier fine-tuned from Gemma 3.

Open-source AI community members are poring over the benchmarks and the model weights, now available on Kaggle and Hugging Face. The ‘most capable model you can run on a single GPU’ line is being tested in real time. With over 100 million downloads of the previous Gemma models and more than 60,000 community variants, the Gemmaverse is growing fast — and the open-weights race just got a new frontrunner.

Clement Farabet

VP of Research at Google DeepMind. Co-authored the blog post, which emphasizes Gemma 3's state-of-the-art performance and accessibility.

Tris Warkentin

Director at Google DeepMind. Co-authored the blog post, which highlights the model's multilingual support and safety features.

One year later — open only if you can handle spoilers

Gemma 3 quickly saw wide adoption in edge and mobile applications. A year later, its single-GPU claim remains a reference point, though later open models such as Llama 4 and the Qwen3 series matched or exceeded its performance on consumer hardware.
