
Ankur Bapna
@ankurbpn
Native Audio in Gemini @GoogleDeepmind
ID: 2330682822
06-02-2014 18:33:56
433 Tweets
855 Followers
634 Following


Gemini 2.5 Flash Preview now supports native audio output via the Live API for seamless, natural spoken interactions and greater voice control. A new experimental thinking version of this audio model supports reasoning capabilities for more complex tasks. ai.google.dev/gemini-api/doc…

Building text-to-speech agents and apps with Google DeepMind Gemini 2.5 is super easy! A single API request generates 5-10 minutes of audio in one of 30 voices and 24 languages, or with multiple speakers!

Our latest Gemini 2.5 Pro update is now in preview. It’s better at coding, reasoning, science + math, shows improved performance across key benchmarks (AIDER Polyglot, GPQA, HLE to name a few), and leads lmarena.ai with a 24pt Elo score jump since the previous version. We also
