Olivier Bachem (@olivierbachem) 's Twitter Profile
Olivier Bachem

@olivierbachem

Director, Research Scientist at @GoogleDeepMind where I lead the research team that post-trains Gemma

ID: 1259188770

Website: http://olivierbachem.ch

Joined: 11-03-2013 11:14:08

546 Tweets

3.3K Followers

329 Following

Sara Hooker (@sarahookr) 's Twitter Profile Photo

Armand Joulin cohere Thanks Armand Joulin -- we will make a correction. We group all private testing by provider. So while the overall number of variants is correct, in this case there are very different testing patterns per model family under a provider. We will clarify that Gemma only had one private test.

Google DeepMind (@googledeepmind) 's Twitter Profile Photo

We’ve developed Gemini Diffusion: our state-of-the-art text diffusion model. Instead of predicting text directly, it learns to generate outputs by refining noise, step-by-step. This helps it excel at coding and math, where it can iterate over solutions quickly. #GoogleIO

Jean Tarbouriech (@jean_tarbou) 's Twitter Profile Photo

1000+ words per second! ⚡ We just unleashed Gemini Diffusion at #GoogleIO! 🚀 Awesome being part of the team that took this from a small research project all the way to I/O Google DeepMind 🪐

Shantanu Thakoor (@shantanuthakoor) 's Twitter Profile Photo

It's been an incredible experience being part of the team that took this from a small research project all the way to I/O 🪐 Super proud of the team! Google DeepMind

Ivana Balazevic (@ibalazevic) 's Twitter Profile Photo

🚀Meet Gemini Diffusion, our first diffusion-based and super fast language model, just announced at Google I/O!🚀 Very excited to be able to share what I've been working on for the past little while with our amazing small team Google DeepMind.

Alexandre Ramé (@ramealexandre) 's Twitter Profile Photo

Releasing Gemma 3n, our new open-weight model processing audio, images and text (with improved multilingual capabilities), optimized for on-device usage with MatFormer architecture (enabling adaptive compute) and reaching 1283 on Chatbot Arena. Read more: developers.googleblog.com/en/introducing….

George Powell (@thegeorgepowell) 's Twitter Profile Photo

Gemini Diffusion has been announced at #GoogleIO! 🚀 Diffusion for text allows for self correction and incredibly fast inference by generating tokens in parallel over a long horizon. Super proud to have played a small part in making this happen over the last two years.

Edouard Leurent (@eleurent) 's Twitter Profile Photo

Excited to share what I've been up to: Gemini Diffusion is FAST! I'm convinced this will revolutionise iterative workflows: refine, get instant feedback, repeat! So proud of what our small team achieved here🪐

Pier Giuseppe Sessa (@piergsessa) 's Twitter Profile Photo

Gemini Diffusion is out! Very excited to have worked on the post-training of such a state-of-the-art text diffusion model. Incredible performance at lightspeed⚡️ Congrats to everyone involved!!

Aditya Kusupati (@adityakusupati) 's Twitter Profile Photo

Pocket powerhouse amidst I/O awesomeness! Gemma 3n E4B & E2B are insane models, optimized for on-device while rivaling frontier models. It's a 🪆Matryoshka Transformer (MatFormer)🪆: Natively elastic b/w 4B & 2B pareto-optimally! ⭐️: free models with ZERO training cost! 🧵👇

Blanca Huergo (@blancahuergo) 's Twitter Profile Photo

Very excited to share what I have been working on. Having been part of the Gemini Diffusion team since day one, it is amazing to see our model demoed at Google I/O :) sign up below to try it out!

Brendan O'Donoghue (@bodonoghue85) 's Twitter Profile Photo

Excited to share what my team has been working on lately - Gemini diffusion! We bring diffusion to language modeling, yielding more power and blazing speeds! 🚀🚀🚀 Gemini diffusion is especially strong at coding. In this example the model generates at 2000 tokens/sec,

Himanshu Sahni (@sahnihim) 's Twitter Profile Photo

Such a privilege to work on Gemini Diffusion with an amazing team! From a small research project to launching at I/O - we've got unstoppable aura 🚀 Welcome to the era of live vibe coding ⚡️

Olivier Bachem (@olivierbachem) 's Twitter Profile Photo

Really proud that two new models have been presented at I/O which we have post-trained:
- Gemini Diffusion: with >1k tokens per second a completely new LLM experience deepmind.google/models/gemini-…
- Gemma 3n: pushing the boundary of what is possible on mobile developers.googleblog.com/en/introducing…

Robert Dadashi (@robdadashi) 's Twitter Profile Photo

Gemma 3n is out! 🚀🚀🚀 The frontier models from a year ago can now run locally on a phone! Lots of innovations (e.g. matformers, mix’n’match, per layer embeddings) to make this model mobile first. And we finally have audio/video as an input for Gemma models! 1/2

Robert Dadashi (@robdadashi) 's Twitter Profile Photo

Gemma 3n E4B has the same number of total parameters (8B) as the original Gemma 7B (8B lol). The progress of Gemma over the past 16 months is insane