Adrià Recasens (@arecasens)'s Twitter Profile
Adrià Recasens

@arecasens

👨‍💻 Research Scientist @DeepMind
👀🔊 Multimodal
🗣️ Views are my own

ID: 7318592

Joined: 07-07-2007 21:26:24

4.4K Tweets

1.1K Followers

1.1K Following

Logan Kilpatrick (@officiallogank)

Gemini 2.0 Flash comes with native audio output, and it’s actually wild 🤯 we are working hard to roll this out quickly to more folks!

Adrià Recasens (@arecasens)

Gemini 2.0 is here! Today we are announcing native audio output (coming soon!). Brilliant team work to make this happen with impressive results -- every single time I watch the video the whispering blows my mind 🤯 we are only getting started 🚀

Alexander Chen (@alexanderchen)

"Say this in a whisper ..." 💬 Native audio output was definitely my favorite Gemini 2.0 demo to make. Being able to steer the voice so expressively with just prompts felt totally new. 🙂

Joost van Amersfoort (@joost_v_amersf)

A very interesting opportunity to work at the intersection of data and scaling. Paul's insights have been crucial to the success of Gemini 2.0 flash (and 1.5 and 1.0 and... 👀). He makes for an excellent mentor/manager! Come help us push the frontier further 🦾.

Google DeepMind (@googledeepmind)

Gemini 2.0 Flash Experimental has the ability to produce native audio in a variety of styles and languages - all from scratch. 🗣️ Here’s how this is different to traditional text-to-speech systems ↓ aistudio.google.com/live

Antoine Yang (@antoineyang2)

Gemini 2.0 Flash's video understanding is here 🚀 Think: search in videos via timecodes, extract text from moving camera footage, analyze screen recordings in real-time interactions with native audio out 🔊 Come and try it aistudio.google.com 😀 youtu.be/Mot-JEU26GQ?si…

Adrià Recasens (@arecasens)

Congrats to the #Veo2 team, brilliant work! This is so far my favorite example, combining generation and reasoning 🤯🤯🤯

Alexander Chen (@alexanderchen)

New Gemini 2.0 modalities will enable entirely new interfaces! ✨ that's why I love this early experimentation my teammate Trudy Painter is doing with native audio output in her VoiceCursor prototype. I've been playing with this UI and it really feels like a magical piece of

Oriol Vinyals (@oriolvinyalsml)

Introducing Gemini 2.5 Pro Experimental! 🎉 Our newest Gemini model has stellar performance across math and science benchmarks. It’s an incredible model for coding and complex reasoning, and it’s #1 on the lmarena.ai leaderboard by a drastic 40 ELO margin. Only a handful of

Ankur Bapna (@ankurbpn)

Happy to see the first feature powered by Gemini native audio outputs ship out to public - especially since it's MASSIVELY multilingual. Lots more coming soon 😉

Adrià Recasens (@arecasens)

We are shipping! 🚢🚢🚢 Native audio output is now available in the Live API. Very natural interaction with the option of using Google Search or thinking for more refined answers. Try it out in aistudio.google.com/live and let us know what you think!

Adrià Recasens (@arecasens)

Very nice demo of many of the capabilities available with native audio out, try it yourself in aistudio.google.com/apps/bundled/m…

Google AI Developers (@googleaidevs)

🔊Native audio outputs in Gemini 2.5 give developers new ways to build richer applications with conversation and speech. ↓ blog.google/technology/goo…