Alex Zhang (@a1zhang)'s Twitter Profile
Alex Zhang

@a1zhang

incoming phd student @MIT_CSAIL, @vant_ai, @princeton ‘24 | 🫵🏻 go participate in the @GPU_MODE kernel competition!!!

ID: 4593727300

Link: http://alexzhang13.github.io/blog | Joined: 24-12-2015 22:30:58

168 Tweets

11.11K Followers

415 Following

hardmaru (@hardmaru)

Inference-Time Scaling and Collective Intelligence for Frontier AI

sakana.ai/ab-mcts/

We developed AB-MCTS, a new inference-time scaling algorithm that enables multiple frontier AI models to cooperate, achieving promising initial results on the ARC-AGI-2 benchmark.

Ori Press (@ori_press)

Do language models have algorithmic creativity? To find out, we built AlgoTune, a benchmark challenging agents to optimize 100+ algorithms like gzip compression, AES encryption and PCA. Frontier models struggle, finding only surface-level wins. Lots of headroom here!🧵⬇️

Alex Zhang (@a1zhang)

small life update before the PhD: bittersweet moment, but I recently left the awesome folks at VantAI & will put my bioml interests on hold for a bit. in other news, I’ve joined Sakana AI for the summer! happy to chat abt either :p

Matej Sirovatka (@m_sirovatka)

The biggest dataset of human-written GPU code, all open-source? 👀 YES please! We at GPU MODE have released around 40k 🚀 human-written code samples spanning Triton, HIP and PyTorch, and it's all open on the Hugging Face Hub. Train the new GPT to make GPTs faster ⚡️ Link below ⬇️

Alex Zhang (@a1zhang)

Does anyone know the differences between nvbench, Triton’s do_bench, and DeepSeek DeepGEMM’s bench_kineto (calls the PyTorch profiler with an L2 cache flush)? Just looking to accurately benchmark kernels over a fixed set of shapes (input distribution can vary), also flushing cache.

Alex Zhang (@a1zhang)

ATP when I read that a model scored X% overall speedup on a benchmark, my brain doesn’t know how to react. “AI to optimize X” benchmarks shouldn’t be reported as average improvement over a fixed baseline; it’s super inflated and confusing. Are there better alternatives?
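
One way to read the concern in the post above: an arithmetic mean of per-task speedups can be dominated by a single outlier task. The sketch below uses made-up speedup values (not results from any real benchmark) and compares the arithmetic mean with the geometric mean and the median, two commonly used summaries. It is an illustration under those assumptions, not a metric proposed by the author.

```python
import statistics

# Hypothetical per-task speedups, defined as baseline_time / optimized_time.
# These numbers are invented for illustration only.
speedups = [1.0, 1.0, 1.0, 0.5, 16.0]

# Arithmetic mean: the single 16x task pulls the average far above
# what most tasks actually achieved.
arith = statistics.mean(speedups)

# Geometric mean: averages ratios multiplicatively, so a 2x slowdown
# and a 2x speedup cancel out.
geo = statistics.geometric_mean(speedups)

# Median: the speedup of the "typical" task.
med = statistics.median(speedups)

print(f"arithmetic mean: {arith:.2f}x")
print(f"geometric mean:  {geo:.2f}x")
print(f"median:          {med:.2f}x")
```

With these invented numbers the arithmetic mean is 3.9x even though three of five tasks did not improve and one got slower; the geometric mean is about 1.52x and the median is 1.0x.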

Alex Zhang (@a1zhang)

Very much a noob question, but for benchmarking CUDA code speed we generally have to clear caches so multiple repeated runs are fair. If I were to benchmark CPU code speed (e.g. on AlgoTune), does a similar principle apply? And how easy is it to do this in say Python?
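
As a rough sketch of what the question above could look like in Python: there is no portable, user-level way to flush CPU caches from the standard library, so a common best-effort substitute is to read a buffer larger than the last-level cache between runs. The helper names (`evict_caches`, `bench`) and the 64 MiB buffer size are assumptions for illustration, and eviction is not guaranteed on any given machine.

```python
import statistics
import time

# Assumption: 64 MiB is larger than the last-level cache; adjust per machine.
EVICT_BYTES = 64 * 1024 * 1024
_evict_buf = bytearray(EVICT_BYTES)


def evict_caches():
    """Best effort: scan a large buffer so previously cached data is
    likely (not guaranteed) to be pushed out of the CPU caches."""
    # bytearray.count scans the whole buffer in C, touching every byte.
    return _evict_buf.count(1)


def bench(fn, *args, repeats=5, flush=True):
    """Time fn(*args) `repeats` times, optionally evicting caches first."""
    times = []
    for _ in range(repeats):
        if flush:
            evict_caches()
        start = time.perf_counter()
        fn(*args)
        times.append(time.perf_counter() - start)
    return times


# Example workload: sort a reversed list of integers.
data = list(range(100_000, 0, -1))
times = bench(sorted, data)
print(f"min: {min(times):.6f}s  median: {statistics.median(times):.6f}s")
```

For pure-Python workloads, interpreter overhead and allocator behaviour often matter more than cache state, so comparing runs with `flush=True` and `flush=False` is a reasonable way to check whether the eviction step changes anything.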

Alex Zhang (@a1zhang)

sadly won’t be at ICML but have 2 papers that you should check out! KernelBench which Simon Guo 🦝 will be presenting at the main conference ^_^ + the GPU MODE leaderboard’s OSS infra at the CODEML workshop (7/19) that Matej Sirovatka will be giving an oral for! Lots of 🍿!!

Alex Zhang (@a1zhang)

New GPU MODE x Jane Street 1-day GPU programming hackathon in-person in NYC! Talks by the wonderful Tri Dao, Soumith Chintala, and other PyTorch folks! If you're at #ICML25 check out more information at the Jane Street booth! Register by Aug 17: bit.ly/3TS0d9I?r=qr

Alex Zhang (@a1zhang)

If you’re staying for the #ICML2025 workshops, you should definitely go to Matej Sirovatka’s talk today on the infra and design of GPU MODE’s OSS GPU leaderboard. He has a lot of interesting stuff to share :D
