Ankur Bapna (@ankurbpn) Twitter Tweets • TwiCopy

Sai Nemani

7 months ago

Gemini Native Audio is INSANE! It literally made this video. The editing is mine though :) (Also the thumbnail(s) are AI generated) youtu.be/qsabdvDsVXM

Likes: 11 · Replies: 1 · Reposts: 2

Gemini 2.5 Flash Preview now supports native audio output via the Live API for seamless, natural spoken interactions and greater voice control. A new experimental thinking version of this audio model supports reasoning capabilities for more complex tasks. ai.google.dev/gemini-api/doc…

Likes: 777 · Replies: 14 · Reposts: 139

Ankur Bapna

@ankurbpn

7 months ago

Try it out for more natural, conversational and steerable audio responses...

Likes: 0 · Replies: 0 · Reposts: 0

Zain

@m96949zm

7 months ago

Why is no one talking about how good this Gemini is at TTS? This kinda sounds like the intro to One Piece

Likes: 3 · Replies: 0 · Reposts: 2

Ankur Bapna

@ankurbpn

7 months ago

Try the native audio dialog with thinking 👌

Likes: 4 · Replies: 0 · Reposts: 1

ホーダチ-Hodatsu | LLM Researcher × AI Engineer

@hokazuya

7 months ago

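Gemini 2.5 Flash Preview Native Audio Dialog is so amazing that I burst out laughing, lol. I think it deserves to be talked about more. (Maybe I just haven't been seeing the discussion, but to me this is far more impressive than Veo and the like.)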

Likes: 366 · Replies: 2 · Reposts: 54

Philipp Schmid

@_philschmid

7 months ago

Building text-to-speech agents and apps with Google DeepMind Gemini 2.5 is super easy! A single API request generates 5-10 minutes of audio in one of 30 voices and 24 languages, or with multiple speakers!

Likes: 147 · Replies: 2 · Reposts: 22

👩‍💻 Paige Bailey

@dynamicwebpaige

7 months ago

💬 Did you know that the Gemini APIs in @GoogleAIStudio support text-to-speech (TTS)? Even better: it's supported in multiple languages, accents, and tones (including whisper, angry, sad, excited, and more). We even support multiple speakers! 👇Learn more in the docs below:

Likes: 63 · Replies: 5 · Reposts: 8

Nathan Benaich

@nathanbenaich

7 months ago

frontier ai today

Likes: 2.2K · Replies: 40 · Reposts: 205

👩‍💻 Paige Bailey

@dynamicwebpaige

7 months ago

europeans will say "let's get something quick for a snack" and proceed to randomly select one of four bakeries within line of sight that all sell the best sandwich you've ever eaten, for $4 someone please disrupt the lie that is the american sandwich market

Likes: 142 · Replies: 7 · Reposts: 6

👩‍💻 Paige Bailey

@dynamicwebpaige

7 months ago

🐤 This commercial is using Veo 3 to generate the visuals, Gemini text-to-speech for the voiceover, and MusicFX for the audio. I actually like it better than the original (and it only took 8 minutes to create, not counting the video generation time!):

Likes: 29 · Replies: 6 · Reposts: 2

Google AI Developers

@googleaidevs

7 months ago

🔊Native audio outputs in Gemini 2.5 give developers new ways to build richer applications with conversation and speech. ↓ blog.google/technology/goo…

Likes: 890 · Replies: 21 · Reposts: 116

Google DeepMind

@googledeepmind

7 months ago

Our native audio capabilities are making AI conversations more natural – from understanding tone to generating expressive speech. ✍️🗣️ This could open up new possibilities for how we interact with AI. Developers, try it through Google AI Studio. Learn more. ↓

Likes: 901 · Replies: 47 · Reposts: 164

AshutoshShrivastava

@ai_for_success

7 months ago

Native audio in Google AI Studio is really underrated 🔥🔥

Likes: 300 · Replies: 14 · Reposts: 38

Google

@google

7 months ago

New native audio capabilities in Gemini 2.5 enable text-to-speech in over 24 languages. 🔊Voices are more natural and expressive, and you can seamlessly switch between languages.

Likes: 1.1K · Replies: 85 · Reposts: 204

Google

@google

7 months ago

Here’s a closer look at what developers can do with Gemini 2.5 native audio capabilities. goo.gle/3Hqj6xG

Likes: 169 · Replies: 16 · Reposts: 28

Sundar Pichai

@sundarpichai

7 months ago

Our latest Gemini 2.5 Pro update is now in preview. It’s better at coding, reasoning, science + math, shows improved performance across key benchmarks (AIDER Polyglot, GPQA, HLE to name a few), and leads lmarena.ai with a 24pt Elo score jump since the previous version. We also

Likes: 4.4K · Replies: 214 · Reposts: 468