Benjamin F Spector (@bfspector)'s Twitter Profile
Benjamin F Spector

@bfspector

stanford cs phd student. i make ml go brr.

ID: 1313966496549351425

Link: http://benjaminfspector.com

Joined: 07-10-2020 22:16:48

87 Tweets

2.2K Followers

130 Following

Dan Fu (@realdanfu)

A little pre-GTC present for everyone... new Blackwell kernels, all written in ThunderKittens! ⚡️🐱 BF16 & FP8 GEMMs, attention forwards & backwards - fast (competitive with cuDNN and cuBLAS) and open-source! w/ Benjamin F Spector, Aaryan Singhal, hazyresearch, and Together AI 1/

Tanishq Kumar (@tanishqkumar07)

trained a nanoGPT? feeling behind before o4-mini? 🚨🚨i'm open-sourcing beyond-nanoGPT, an internal codebase to help people go from LLM basics to research-level understanding. 🚨🚨 it contains thousands of lines of from-scratch, annotated pytorch implementing advanced

Jonathan Jacobi (@j0nathanj)

Introducing Multiverse: the first AI-generated multiplayer game. Multiplayer was the missing piece in AI-generated worlds — now it’s here. Players can interact and shape a shared AI-simulated world, in real-time. Training and research cost < $1.5K. Run it on your own PC. We

Jordan Juravsky (@jordanjuravsky)

We wrote a megakernel! Excited to share how we fused Llama-1B into a single kernel to reach SOTA latency. Check out our blog post and code below!

ollama (@ollama)

3 months ago, Stanford's Hazy Research lab introduced Minions, a project that connects Ollama to frontier cloud models to reduce cloud costs by 5-30x while achieving 98% of frontier model accuracy. Secure Minion turns an H100 into a secure enclave, where all memory and

Jordan Juravsky (@jordanjuravsky)

Happy Throughput Thursday! We’re excited to release Tokasaurus: an LLM inference engine designed from the ground up for high-throughput workloads with large and small models. (Joint work with Ayush Chakravarthy, Ryan Ehrlich, Sabri Eyuboglu, Bradley Brown, Joseph Shetaye,

Jerry Liu (@jerrywliu)

1/10 ML can solve PDEs – but precision🔬is still a challenge. Towards high-precision methods for scientific problems, we introduce BWLer 🎳, a new architecture for physics-informed learning achieving (near-)machine-precision (up to 10⁻¹² RMSE) on benchmark PDEs. 🧵How it works:

Decart (@decartai)

Introducing MirageLSD: The First Live-Stream Diffusion (LSD) AI Model. Input any video stream, from a camera or video chat to a computer screen or game, and transform it into any world you desire, in real-time (<40ms latency). Here’s how it works (w/ demo you can use!):

Anna Monaco (@annarmonaco)

Paradigm is the AI-native spreadsheet to eliminate menial work. Thousands of users have saved 10,000+ hours with Paradigm, and you can be next. Get your first month free today, then plans start at just $20/month.

Stuart Sul (@stuart_sul)

MoE layers can be really slow. When training our coding models at Cursor, they ate up 27–53% of training time. So we completely rebuilt it at the kernel level and transitioned to MXFP8. The result: 3.5x faster MoE layer and 1.5x end-to-end training speedup. We believe our
