Andreas Kirsch 🇺🇦 (@blackhc)'s Twitter Profile
Andreas Kirsch 🇺🇦

@blackhc

My opinions only here.
👨‍🔬 RS @DeepMind, @Midjourney 1y 🧑‍🎓 DPhil @AIMS_oxford @UniofOxford 4.5y 🧙‍♂️ RE DeepMind 1y 📺 SWE @Google 3y 🎓 TUM
👤 @nwspk

ID: 65385274

http://blackhc.net · Joined 13-08-2009 15:14:48

11.1K Tweets

11.1K Followers

5.5K Following

Jake P. Taylor-King (@wildtypehuman)

"To be clear, no one has proposed using ROCK inhibitors to treat dry AMD in the literature before, as far as we can find..." It's so unprecedented, there's even a review article... sciencedirect.com/science/articl…

Andreas Kirsch 🇺🇦 (@blackhc)

I'm pretty amazed by everything people at GDM have been building ngl 🤯 (I should keep up more internally, but I/O was a great way to see what's new)

Javi Lopez ⛩️ (@javilopen)

Spaghetti is so cooked. But flamenco is so back! "A dog dressed as a female flamenco dancer dancing flamenco on a tablao in a bar in Seville." 😅😂

Anthropic (@anthropicai)

Introducing the next generation: Claude Opus 4 and Claude Sonnet 4.

Claude Opus 4 is our most powerful model yet, and the world’s best coding model.

Claude Sonnet 4 is a significant upgrade from its predecessor, delivering superior coding and reasoning.

Katie Everett (@_katieeverett)

1. We often observe power laws between loss and compute: loss = a * flops^b + c.
2. Models are rapidly becoming more efficient, i.e. they use less compute to reach the same loss.

But: which innovations actually change the exponent in the power law (b) vs. change only the constant (a)?
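To make the split concrete, here is a minimal sketch (my illustration, not from the thread) that fits loss = a * flops^b + c to synthetic measurements with scipy.optimize.curve_fit; the ground-truth values and noise level are made up:

```python
import numpy as np
from scipy.optimize import curve_fit

def power_law(flops, a, b, c):
    # loss = a * flops^b + c, with b < 0 so loss falls as compute grows
    return a * flops**b + c

# Synthetic "measurements" drawn from a known law, plus 1% noise.
rng = np.random.default_rng(0)
flops = np.logspace(15, 21, 20)
true_a, true_b, true_c = 2.0e3, -0.25, 1.5
loss = power_law(flops, true_a, true_b, true_c) * (1 + 0.01 * rng.standard_normal(20))

(a, b, c), _ = curve_fit(power_law, flops, loss, p0=(1e3, -0.2, 1.0))
print(f"fitted a={a:.3g}, b={b:.3g}, c={c:.3g}")

# Halving `a` shifts the curve down by a constant factor everywhere;
# making `b` more negative steepens it, so the win grows with compute.
for f in (1e18, 1e21):
    print(f"flops={f:.0e}: "
          f"base={power_law(f, a, b, c):.3f}, "
          f"half-a={power_law(f, a / 2, b, c):.3f}, "
          f"steeper-b={power_law(f, a, 1.1 * b, c):.3f}")
```

This is why the exponent-vs-constant question matters: an improvement to a is the same multiplicative win at every scale, while an improvement to b compounds as compute grows.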

You Jiacheng (@youjiacheng)

arxiv.org/abs/2505.16932
Everyone told me even SVD won't improve loss, so I only spent a bit of effort on improving the iters.
After reading leloy! (@leloykun)'s post last year, I had the intuition that greedy "contraction" would be a good solution, but I didn't know it was optimal.
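For context on SVD-free "iters": below is a minimal sketch of the classic cubic Newton-Schulz iteration, which approximates the polar factor that an exact SVD would give. This is my illustration of the textbook baseline only, not the scheme from the linked paper:

```python
import numpy as np

def orthogonalize(G: np.ndarray, steps: int = 15) -> np.ndarray:
    # Cubic Newton-Schulz: X <- 1.5 X - 0.5 X X^T X pushes every singular
    # value toward 1, converging to the polar factor U V^T of G = U S V^T.
    X = G / (np.linalg.norm(G) + 1e-7)  # Frobenius norm bounds the spectral norm
    for _ in range(steps):
        X = 1.5 * X - 0.5 * X @ X.T @ X
    return X

rng = np.random.default_rng(0)
G = rng.standard_normal((64, 32))
X = orthogonalize(G)

# Compare against the exact SVD-based answer.
U, _, Vt = np.linalg.svd(G, full_matrices=False)
print(np.max(np.abs(X - U @ Vt)))  # approaches 0 as steps grow
```

Each iteration applies f(s) = 1.5s - 0.5s^3 to the singular values, which has a stable fixed point at s = 1; fancier schemes choose different polynomial coefficients per step to converge faster.
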

Alex Nichol (@unixpickle)

I read through these slides and felt like I was transported back to 2018. Having been in this spot years ago, thinking about what John & team are thinking about, I can't help but feel like they will learn the same lesson I did the hard way.

Maxime Labonne (@maximelabonne)

The French Ministry of Culture released 175k high-quality arena-style preferences

It's exactly the type of data LMSYS stopped releasing.

They created their own chatbot arena with 55 models and open-sourced everything. Incredible work!

🤗 Dataset: huggingface.co/datasets/minis…
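As a sketch of what arena-style preference data enables (my illustration; the battle records and model names below are made up, not from the dataset):

```python
from collections import defaultdict

# Each arena record says which of two models a user preferred.
battles = [
    ("model_a", "model_b", "model_a"),  # (left, right, winner)
    ("model_a", "model_c", "model_c"),
    ("model_b", "model_c", "model_b"),
]

# Online Elo updates, as in chatbot-arena-style leaderboards.
ratings = defaultdict(lambda: 1000.0)
K = 32  # update step size

for left, right, winner in battles:
    # Expected score of `left` under the Elo / Bradley-Terry model.
    expected_left = 1.0 / (1.0 + 10 ** ((ratings[right] - ratings[left]) / 400))
    actual_left = 1.0 if winner == left else 0.0
    delta = K * (actual_left - expected_left)
    ratings[left] += delta
    ratings[right] -= delta

print(dict(ratings))
```

With 175k real preference pairs, the same update (or a full Bradley-Terry fit) yields a leaderboard over the 55 models.
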

Sasha Rush (@srush_nlp)

Strong recommend for this book and the JAX/TPU docs, even if you are using Torch / GPUs. Clean notation and mental model for some challenging ideas. 

github.com/jax-ml/scaling…
github.com/jax-ml/scaling…
docs.jax.dev/en/latest/note…

Andreas Kirsch 🇺🇦 (@blackhc)

I hear a lot of supposedly smart people complaining about Anthropic averaging pass@1 over 10 trials and "giving themselves an advantage", but has no one heard of regression to the mean?
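A quick simulation of the point (my sketch; the problem count and difficulty distribution are made up): averaging pass@1 over 10 runs has the same expected value as a single run, it only shrinks the variance that a single lucky or unlucky run would show:

```python
import numpy as np

rng = np.random.default_rng(0)
n_problems = 200
p_solve = rng.uniform(0.1, 0.9, n_problems)  # hypothetical per-problem difficulty

def pass_at_1(trials: int) -> float:
    # Benchmark score: fraction of problems solved, averaged over `trials` runs.
    runs = rng.random((trials, n_problems)) < p_solve
    return runs.mean()

single = [pass_at_1(1) for _ in range(1000)]
averaged = [pass_at_1(10) for _ in range(1000)]
print(f"single run:  mean={np.mean(single):.4f}, std={np.std(single):.4f}")
print(f"10-run avg:  mean={np.mean(averaged):.4f}, std={np.std(averaged):.4f}")
# Same mean, roughly sqrt(10)x smaller std: averaging removes run-to-run
# luck (regression to the mean); it does not shift the score upward.
```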

Yann LeCun (@ylecun)

Steven Pinker injects some facts and much-needed sanity in the debates around Harvard and American academia. nytimes.com/2025/05/23/opi…