Vlad Feinberg (@feinbergvlad) Twitter Tweets • TwiCopy

Raveesh Bhalla

10 months ago

@deedydas @_sholtodouglas As @swyx’s Pareto frontier graph shows, Gemini is arguably the real story of the past few months

Dan Mac

@daniel_mac8

10 months ago

everyone comparing deepseek-r1 to o1 and forgetting about Gemini 2 Flash Thinking which is better than r1 on every cost and performance metric

Logan Kilpatrick

The progress with our Gemini reasoning models is actually wild, we are in the GPT-2 era of scaling reasoning! The main delta is that the models are actually super useful in their current form and getting better week over week. The future is exciting...

Jacob Austin

@jacobaustin132

10 months ago

Making LLMs run efficiently can feel scary, but scaling isn’t magic, it’s math! We wanted to demystify the “systems view” of LLMs and wrote a little textbook called “How To Scale Your Model” which we’re releasing today. 1/n

Advait Bopardikar

@advaitonline

10 months ago

It's been a week and Gemini 2.0 Flash has already overtaken one of the Sonnet endpoints on OpenRouter for daily usage. A lot of it is coming from coding use. Can't beat the price:performance ratio. ♊⚡📈

Elad Hazan

@hazanprinceton

9 months ago

Our team at GDM Princeton is hiring! If you want to work on theoretically founded next-gen architectures for LLMs, please apply here: sites.google.com/view/gbrainpri…

Lisan al Gaib

@scaling01

9 months ago

checkmate

Shawn

@shawnryan96

9 months ago

Gemini flash 2.0 experimental is the first model I feel that really generalizes over different modalities. It also feels like real reasoning even when it gets it wrong. It seems to think outside the box in some cases.

lmarena.ai (formerly lmsys.org)

@lmarena_ai

8 months ago

BREAKING: Gemini 2.5 Pro is now #1 on the Arena leaderboard - the largest score jump ever (+40 pts vs Grok-3/GPT-4.5)! 🏆 Tested under codename "nebula"🌌, Gemini 2.5 Pro ranked #1🥇 across ALL categories and UNIQUELY #1 in Math, Creative Writing, Instruction Following, Longer

koray kavukcuoglu

@koraykv

8 months ago

1/ Today we are releasing Gemini 2.5 Pro Experimental, our newest Gemini model with integrated “thinking” and significant performance gains. Very proud of the whole team! 🧵

Oriol Vinyals

@oriolvinyalsml

8 months ago

Introducing Gemini 2.5 Pro Experimental! 🎉 Our newest Gemini model has stellar performance across math and science benchmarks. It’s an incredible model for coding and complex reasoning, and it’s #1 on the lmarena.ai leaderboard by a drastic 40 ELO margin. Only a handful of

Vlad Feinberg

@feinbergvlad

8 months ago

#2 only to 2.5 Pro :) Another amazing collab across the board! A special thank you to my awesome team, Arnaud Autef, Arun Ahuja, and Geng Yan, who were instrumental in getting this pretrained! So many more people I need to list here who helped across the stack---too many to tweet!

Dillon Uzar

@dillonuzar

8 months ago

Another update - Ran Gemini 2.5 Flash (Auto Thinking and Non-Thinking). See the comparison below to other thinking models. Interesting curve for Gemini 2.5 Flash Non-Thinking! Meanwhile Gemini 2.5 Flash Thinking (Auto) matches Gemini 2.5 Pro! I'm still working on o3 access and

lmarena.ai (formerly lmsys.org)

@lmarena_ai

7 months ago

🚨Breaking: Google DeepMind’s latest Gemini-2.5-Pro is now ranked #1 across all LMArena leaderboards 🏆

Highlights:
- #1 in all text arenas (Coding, Style Control, Creative Writing, etc.)
- #1 on the Vision leaderboard with a ~70 pts lead!
- #1 on WebDev Arena, surpassing Claude

lmarena.ai (formerly lmsys.org)

@lmarena_ai

7 months ago

📢We’re excited to share that we’ve raised $100M in seed funding to support LMArena and continue our research on reliable AI. Led by a16z and UC Investments (University of California), we're proud to have the support of those that believe in both the science and the mission. We’re

Jack Rae

@jack_w_rae

7 months ago

There were a lot of announcements at I/O, so it's easy to overlook the new 2.5 Flash. It's pushing new boundaries in capability vs speed!

Melvin Johnson

@melvinjohnsonp

7 months ago

Great to see 2.5 Flash improve in its utility for both the reasoning and non-reasoning slices. It's an incredible model for most use cases. We're excited to see what you all build with it.
