Adam Sadovsky (@asadovsky) Twitter Tweets • TwiCopy

Adam Sadovsky

@asadovsky

a year ago

🚀🚀🚀


Adam Sadovsky

@asadovsky

a year ago

the bitter lesson's sibling


Demis Hassabis

Our latest update to our Gemini 2.0 Flash Thinking model (available here: goo.gle/4jsCqZC) scores 73.3% on AIME (math) & 74.2% on GPQA Diamond (science) benchmarks. Thanks for all your feedback, this represents super fast progress from our first release just this past


Raveesh Bhalla

@raveeshbhalla

10 months ago

@deedydas @_sholtodouglas As @swyx’s Pareto frontier graph shows, Gemini is arguably the real story of the past few months


Andrej Karpathy

@karpathy

10 months ago

We have to take the LLMs to school. When you open any textbook, you'll see three major types of information: 1. Background information / exposition. The meat of the textbook that explains concepts. As you attend over it, your brain is training on that data. This is equivalent


Farzad Mostashari

@farzad_md

10 months ago

1/ After residency at Mass General Hospital, I reported to Atlanta to meet my fellow CDC Epidemic Intelligence Service Officers. I have never felt so intimidated by my peers. The best and the brightest, they were star clinicians, had served in disaster zones; MD/PhDs and MSF.


Subhash Choudhary

@subhashchy

10 months ago

We replaced GPT-4o with Gemini-2.0 Flash for Bot9, reducing our costs by about 20× with no visible loss in accuracy. This change was implemented on a highly complex support agent that makes 32 tool calls. I was seriously not expecting this. At the application layer, it also


Adam Sadovsky

@asadovsky

9 months ago

Wow, quite impressive for a 27B model!


Kyle Corbitt

@corbtt

9 months ago

If you're fine-tuning LLMs, Gemma 3 is the new 👑 and it's not close. Gemma 3 trounces Qwen/Llama models at every size!
- Gemma 3 4B beats 7B/8B competition
- Gemma 3 27B matches 70B competition
Vision benchmarks coming soon!


Adam Sadovsky

@asadovsky

9 months ago

cook or die


lmarena.ai (formerly lmsys.org)

@lmarena_ai

9 months ago

BREAKING: Gemini 2.5 Pro is now #1 on the Arena leaderboard - the largest score jump ever (+40 pts vs Grok-3/GPT-4.5)! 🏆 Tested under codename "nebula"🌌, Gemini 2.5 Pro ranked #1🥇 across ALL categories and UNIQUELY #1 in Math, Creative Writing, Instruction Following, Longer


Bindu Reddy

@bindureddy

8 months ago

WE HAVE A NEW BEST MODEL IN THE WORLD! GEMINI 2.5 IS #1 ON LIVEBENCH


Adam Sadovsky

@asadovsky

8 months ago

Gemini 2.5 Pro is SOTA on pretty much everything


Martin Baeuml

@mbaeuml

8 months ago

Just shipped a few updates 1. Gemini 2.5 Pro to try for free on gemini.google.com in the model drop down. Advanced has higher limits. 2. Canvas with 2.5 Pro in Advanced. Our best coding model yet. We had so much fun building demos internally, can't wait to see what y'all


Adam Sadovsky

@asadovsky

8 months ago

Quite interesting to see how some models generalize dramatically better!


Adam Sadovsky

@asadovsky

8 months ago

SOTA just got way cheaper


Adam Sadovsky

@asadovsky

8 months ago

Interesting


Adam Sadovsky

@asadovsky

3 months ago

hello, world!


Nando de Freitas

@nandodf

3 months ago

This was an amazing week at Microsoft AI!! We released MAI 1 preview and a taste of MAI Voice. I’m super happy with this team - only about 100 people and already shipping in Arena in less than a year. Strong support. More soon. Thanks for feedback!


Mustafa Suleyman

@mustafasuleyman

2 months ago

Meet our third Microsoft AI model: MAI-Image-1
#9 on LMArena, striking an impressive balance of generation speed and quality
Excited to keep refining + climbing the leaderboard from here!
We're just getting started.
microsoft.ai/news/introduci…

