Yaroslav Bulatov (@yaroslavvb) 's Twitter Profile
Yaroslav Bulatov

@yaroslavvb

Together.AI (ex-Google Brain, OpenAI, Meta)
New Blog: medium.com/@yaroslavvb
Old Blog: yaroslavvb.blogspot.com

ID: 258031029

Link: http://medium.com/@yaroslavvb
Joined: 26-02-2011 20:22:57

1.1K Tweets

7.7K Followers

873 Following

Yaroslav Bulatov (@yaroslavvb) 's Twitter Profile Photo

I did a spot check of the most recent math questions I asked on math forums against LLMs:
Gemini Flash Thinking: 2/2
DeepSeek: 2/2
Claude: 1/2
ChatGPT o1: 0/2
Gemini was the fastest, followed by DeepSeek, followed by human. docs.google.com/document/d/16S…

Yaroslav Bulatov (@yaroslavvb) 's Twitter Profile Photo

Sitting in airport and realizing I don't get how singular values work. Does anyone understand why the two graphs match? mathoverflow.net/questions/4848…

Yaroslav Bulatov (@yaroslavvb) 's Twitter Profile Photo

Fun visualization: take a 10x10 random matrix and visualize the eigenvalue trajectories you get by rotating this matrix by theta in 0..2 Pi in some fixed direction. Now vary this direction smoothly.
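
A minimal sketch of one way to read this recipe. The tweet does not say what "rotating in a direction" means, so this assumes it is left-multiplication by a rotation in a fixed 2D plane spanned by orthonormal vectors `u` and `v`. The helper names (`plane_rotation`, `random_plane`, `eigenvalue_trajectories`) and the Gaussian scaling of the matrix are illustrative choices, not the author's code.

```python
import numpy as np

def plane_rotation(u, v, theta):
    """Rotation by angle theta in the plane spanned by orthonormal u, v.

    Acts as the identity on the orthogonal complement of span(u, v).
    """
    P = np.outer(u, u) + np.outer(v, v)   # projector onto the plane
    J = np.outer(v, u) - np.outer(u, v)   # generator: u -> v, v -> -u
    return np.eye(len(u)) + (np.cos(theta) - 1.0) * P + np.sin(theta) * J

def random_plane(n, rng):
    """Orthonormal pair (u, v) from the QR factorization of a Gaussian n x 2 matrix."""
    q, _ = np.linalg.qr(rng.standard_normal((n, 2)))
    return q[:, 0], q[:, 1]

def eigenvalue_trajectories(A, u, v, num_steps=200):
    """Eigenvalues of R(theta) @ A for theta sampled on [0, 2*pi]."""
    thetas = np.linspace(0.0, 2.0 * np.pi, num_steps)
    eigs = np.array(
        [np.linalg.eigvals(plane_rotation(u, v, t) @ A) for t in thetas]
    )
    return thetas, eigs

rng = np.random.default_rng(0)
n = 10
A = rng.standard_normal((n, n)) / np.sqrt(n)  # assumed: i.i.d. Gaussian entries
u, v = random_plane(n, rng)                   # assumed: the "fixed direction"
thetas, eigs = eigenvalue_trajectories(A, u, v)
```

Scattering `eigs.real` against `eigs.imag` gives the trajectories for one plane. To "vary this direction smoothly" under the same assumption, one could move `(u, v)` along a continuous path of orthonormal pairs and recompute the trajectories at each step.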

Max Ryabinin (@m_ryabinin) 's Twitter Profile Photo

I'm giving a talk at the MCDC🤝 workshop (#ICLR2025) tomorrow!

Planning to cover:
* An overview of decentralized DL & its links to other fields
* Lessons learned from research on Learning@home, DeDLOC, SWARM, Petals
* Sneak peek on some of our upcoming work!

See you at 14:30!
Yaroslav Bulatov (@yaroslavvb) 's Twitter Profile Photo

Unexpected RMT observation: squared singular values of a product of random projections are essentially distributed as exponentiated chi-squared. Can anyone see a direct explanation of this? math.stackexchange.com/questions/5060…

Yaroslav Bulatov (@yaroslavvb) 's Twitter Profile Photo

Once everyone online is indistinguishable from an AI agent, it would make it cool again to hang out in person. Until the robot impersonators.

Yaroslav Bulatov (@yaroslavvb) 's Twitter Profile Photo

Enjoyed Jeremy Bernstein's thought-provoking talk on optimizers at ML Collective today. Are theories that motivate optimizers very useful? Adversarial for AdaGrad, natural gradient for KFAC. Non-linear solvers in scientific computing seem to advance without spending a lot of effort thinking

Yaroslav Bulatov (@yaroslavvb) 's Twitter Profile Photo

This is a nice explanation of why reasoning emerges as an unexpected side effect of training for text compression (but not video compression).