Sasha Rush (@srush_nlp)'s Twitter Profile
Sasha Rush

@srush_nlp

Professor, Programmer in NYC.
Cornell Tech, Hugging Face 🤗
youtube.com/@srush_nlp

ID: 4558314927

http://rush-nlp.com
21-12-2015 15:46:59

6.6K Tweets

56.56K Followers

474 Following

Sasha Rush (@srush_nlp)

This new ZML library has an interesting take on first-class tensor names:
const q = q_.withTags(.{ .b, .h, .q, .hd });
const k = k_.withTags(.{ .b, .h, .k, .hd });
const v = v_.withTags(.{ .b, .h, .k, .hd });
I don't know enough about Zig, but neat people are trying new things.

Horace He (@chhillee)

I'll be at Triton conference Tuesday, PyTorch conference Wednesday and Thursday, and CUDA Mode IRL on Saturday! DM me if you want to chat. Happy to talk about anything related to PyTorch or MLsys in general. I've quite enjoyed chatting to new people at past conferences :)

Anjalie Field (@anjalie_f)

As application season rolls around again, here's your reminder that materials from my successful applications are available on my website (NSF-GRFP, Google PhD Fellowship, Stanford data science postdoc, and CS faculty job search): anjalief.github.io/statements.html

Rupesh Srivastava (@rupspace)

Interested in Discrete Diffusion? I've just released a GitHub repo where you can learn about and play with discrete diffusion algorithms with simple and performant "nano-style" implementations. (link below) I've started with the Absorbing D3PM from Jacob Austin and

Miles Brundage (@miles_brundage)

OpenAI: o1 uses a long chain of thought. Here are several examples
People who wrote papers on different things: I think they’re using my thing

Pranav Rajpurkar (@pranavrajpurkar)

Thanks! Foresight of Percy Liang to bet on the opportunity and the timeline over which the benchmark could be tackled. We first discussed the creation of the benchmark at NeurIPS 2015, and started building right after.

Sasha Rush (@srush_nlp)

I would pay so much for a Cursor-like interface that could see my Gmail. We have the technology to help me sort by anxiety-level, and we're wasting it on coding.

Nathan Lambert (@natolambert)

Things of note (not that much) in this longer o1 video:
1. “Model with RL is better at finding new CoT steps than humans”
2. “Emergence of self critique was a powerful moment”
3. Mentioned a literal timeout for the model, and the model was like “aha I got it” but maybe a