Sasha Rush (@srush_nlp)'s Twitter Profile
Sasha Rush

@srush_nlp

Programmer, professor, currently in the bay area
youtube.com/@srush_nlp

ID: 4558314927

Link: http://rush-nlp.com
Joined: 21-12-2015 15:46:59

7.7K Tweets

66.66K Followers

473 Following

Christina Baek (@_christinabaek)

Are current reasoning models optimal for test-time scaling? 🌠 No! Models make the same incorrect guess over and over again. We show that you can fix this problem w/o any crazy tricks 💫 – just do weight ensembling (WiSE-FT) for big gains on math! 1/N

Assaf Ben Kish (@abk_tau)

New work! 🚨 Recurrent LLMs like Mamba and RWKV can efficiently process millions of tokens, yet still underperform on real-world long-context tasks. What's holding them back? 🤔 And how can a lightweight fix boost their performance by 35% on LongBench? 👇🏼🧵 Github:

Wenting Zhao (@wzhao_nlp)

Some personal news: I'll join UMass Amherst CS as an assistant professor in fall 2026. Until then, I'll postdoc at Meta NYC. Reasoning will continue to be my main interest, with a focus on data-centric approaches 🤩 If you're also interested, apply to me (PhDs & a postdoc)!

Maithra Raghu (@maithra_raghu)

🚀 Thrilled to share that @SamayaAI has raised $43.5M in funding led by NEA to build Expert AI Agents for financial services and transform knowledge work at scale. We started Samaya in 2022 — before ChatGPT — with a belief: 💡 AI could revolutionize sophisticated financial

Percy Liang (@percyliang)

What would truly open-source AI look like? Not just open weights, open code/data, but *open development*, where the entire research and development process is public *and* anyone can contribute. We built Marin, an open lab, to fulfill this vision:

Google AI Developers (@googleaidevs)

2️⃣ SignGemma is a sign language understanding model that’s coming later this year 🤟🏼 It’s a massively multilingual model that’s best at translating ASL into English text, enabling further development of tech access for Deaf and Hard of Hearing users. 🧏 Share your feedback and

Ferenc Huszár (@fhuszar)

A new post with intuitions behind continuous-time Markov chains, a building block of diffusion language models, like Inception Labs' Mercury and Gemini Diffusion. Touches on different perspectives on Markov chains, connections to point processes + more. inference.vc/discrete-diffu…

Aman Sanger (@amanrsanger)

Claude Sonnet 4 is much better at codebase understanding. Paired with recent improvements in Cursor, it's SOTA on large codebases

Dan Biderman (@dan_biderman)

Part of my speech at the Columbia Medical Center PhD hooding ceremony: “The world we’re graduating into now has machines that review literature, write code, analyze data, and even draft manuscripts that pass peer review. Our scientific profession is shifting beneath our feet.

Patrick Kidger (@patrickkidger)

Sasha Rush "A Computer Scientist's Guide to Cell Biology", by Cohen & Cohen, if you want something adjacent to CS but still awesome science. It's a wonderfully readable 100 pages: link.springer.com/book/10.1007/9…

Sasha Rush (@srush_nlp)

Strong recommend for this book and the JAX/TPU docs, even if you are using Torch / GPUs. Clean notation and mental model for some challenging ideas. github.com/jax-ml/scaling… github.com/jax-ml/scaling… docs.jax.dev/en/latest/note…

Sasha Rush (@srush_nlp)

Been reflecting a bit on the Harvard news. This paper from 2017 was one of my favorite projects, and has been a joy to see all the things the authors have gone on to do. Didn't realize at the time how lucky for us Americans to work with incredible people from around the world.
