Tara Sainath (@tnsainath)'s Twitter Profile
Tara Sainath

@tnsainath

ID: 216361110

Joined: 16-11-2010 14:12:33

5 Tweets

56 Followers

12 Following

Jeff Dean (@jeffdean)

I’m very excited to share our work on Gemini today! Gemini is a family of multimodal models that demonstrate really strong capabilities across the image, audio, video, and text domains. Our most-capable model, Gemini Ultra, advances the state of the art in 30 of 32 benchmarks,

Google DeepMind (@googledeepmind)

Along with text, images, video and code, Gemini is able to process raw audio signal end-to-end. 🔊 It can listen to and understand speech, making it not only useful for transcription but a model that has a much more nuanced perception of its environment. ↓

Oriol Vinyals (@oriolvinyalsml)

Gemini 1.5 has arrived. Pro 1.5 with 1M tokens available as an experimental feature via AI Studio and Vertex AI in private preview. Then there’s this: In our research, we tested Gemini 1.5 on up to 2M tokens for audio, 2.8M tokens for video, and 🤯10M 🤯 tokens for text. From

Jeff Dean (@jeffdean)

This video is a glimpse of Project Astra's utility in going about daily life. Remember this door code. What do these funny laundry icons mean on my clothes tag? Will this bus take me where I want to go? Which book will my friend enjoy the most? What can you tell me about this

Logan Kilpatrick (@officiallogank)

Gemini 2.0 Flash comes with native audio output, and it’s actually wild 🤯 we are working hard to roll this out quickly to more folks!

Google DeepMind (@googledeepmind)

💬 Smarter dialogue: Gemini-powered native audio means Project Astra has better context and customizable accents. 🕹️ Takes action: Computer control lets it open and engage with apps at your direction. 🤝 Personalized help: Integrates with your @Gmail, @GoogleCalendar and more

Tara Sainath (@tnsainath)

check out the new live audio-to-audio dialog model. Native audio with proactivity, affective dialog, tool calling and more.

Sad Albert (@mars53208096)

Do NOT SLEEP on Gemini 2.5's multimodal audio! It is 100 times better than GPT 4o, 50 times less censored and 1000 times better than Grok🐸. Check these examples out of Gemini 2.5's emotional speech capabilities. It does Not have voice cracks and a lot capable and clearer than I

Google AI Developers (@googleaidevs)

See Native Audio in action 🤠🦊 Our "Mumble Jumble" demo in Google AI Studio showcases the Live API's advanced voice capabilities: natural flow, distinct tone, emotion, and multilingual support.

Google DeepMind (@googledeepmind)

Our native audio capabilities are making AI conversations more natural – from understanding tone to generating expressive speech. ✍️🗣️ This could open up new possibilities for how we interact with AI. Developers, try it through Google AI Studio. Learn more. ↓