I’m very excited to share our work on Gemini today! Gemini is a family of multimodal models that demonstrate strong capabilities across the image, audio, video, and text domains. Our most capable model, Gemini Ultra, advances the state of the art in 30 of 32 benchmarks,
Along with text, images, video and code, Gemini is able to process raw audio signals end-to-end. 🔊
It can listen to and understand speech, making it useful not only for transcription but also as a model with a much more nuanced perception of its environment.
Gemini 1.5 has arrived. Pro 1.5 with 1M tokens is available as an experimental feature via AI Studio and Vertex AI in private preview.
Then there’s this: In our research, we tested Gemini 1.5 on up to 2M tokens for audio, 2.8M tokens for video, and 🤯10M 🤯 tokens for text. From
This video is a glimpse of Project Astra's utility in going about daily life.
Remember this door code. What do these funny laundry icons mean on my clothes tag? Will this bus take me where I want to go? Which book will my friend enjoy the most? What can you tell me about this
💬 Smarter dialogue: Gemini-powered native audio means Project Astra has better context and customizable accents.
🕹️ Takes action: Computer control lets it open and engage with apps at your direction.
🤝 Personalized help: Integrates with your @Gmail, @GoogleCalendar and more
Do NOT SLEEP on Gemini 2.5's multimodal audio! It is 100 times better than GPT-4o, 50 times less censored and 1000 times better than Grok🐸. Check out these examples of Gemini 2.5's emotional speech capabilities. It does not have voice cracks and is a lot more capable and clearer than I
See Native Audio in action 🤠🦊 Our "Mumble Jumble" demo in Google AI Studio showcases the Live API's advanced voice capabilities: natural flow, distinct tone, emotion, and multilingual support.
Our native audio capabilities are making AI conversations more natural – from understanding tone to generating expressive speech. ✍️🗣️
This could open up new possibilities for how we interact with AI. Developers, try it through Google AI Studio.