Alex Zhang (@a1zhang) Twitter Tweets • TwiCopy

Alex Zhang

@a1zhang

incoming phd student @MIT_CSAIL, @vant_ai, @princeton ‘24 | 🫵🏻 go participate in the @GPU_MODE kernel competition!!!

ID: 4593727300

Link: http://alexzhang13.github.io/blog | Joined: 24-12-2015 22:30:58

168 Tweets

11.11K Followers

415 Following

hardmaru

@hardmaru

5 months ago

Inference-Time Scaling and Collective Intelligence for Frontier AI sakana.ai/ab-mcts/ We developed AB-MCTS, a new inference-time scaling algorithm that enables multiple frontier AI models to cooperate, achieving promising initial results on the ARC-AGI-2 benchmark.

535 likes · 17 replies · 93 reposts

Ori Press

@ori_press

5 months ago

Do language models have algorithmic creativity? To find out, we built AlgoTune, a benchmark challenging agents to optimize 100+ algorithms like gzip compression, AES encryption and PCA. Frontier models struggle, finding only surface-level wins. Lots of headroom here!🧵⬇️

141 likes · 6 replies · 54 reposts

Alex Zhang

@a1zhang

5 months ago

small life update before the PhD: bittersweet moment but I recently left the awesome folks at VantAI & will put my bioml interests on hold for a bit. in other news, I’ve joined Sakana AI for the summer! happy to chat abt either :p

176 likes · 16 replies · 3 reposts

Matej Sirovatka

@m_sirovatka

5 months ago

The biggest dataset of human written GPU Code all open-source? 👀 YES Please! We at GPU MODE have released around 40k 🚀 human written code samples spanning Triton, Hip and PyTorch and it's all open on the Hugging Face Hub. Train the new GPT to make GPTs faster ⚡️ Link below ⬇️

318 likes · 3 replies · 52 reposts

Alex Zhang

@a1zhang

5 months ago

BTW this number is only a tiny fraction of what we have planned :p

36 likes · 1 reply · 1 repost

Alex Zhang

@a1zhang

5 months ago

Does anyone know the differences between nvbench, Triton’s do_bench, and the DeepSeek DeepGEMM’s bench_kineto (calls PyTorch profiler with l2 cache flush)? Just looking to accurately benchmark kernels over a fixed set of shapes (input distribution can vary), also flushing cache.

23 likes · 2 replies · 2 reposts

Alex Zhang

@a1zhang

5 months ago

ATP when I read that a model scored X% overall speedup on a benchmark my brain doesn’t know how to react. “AI to optimize X” benchmarks shouldn’t be reported as average improvement over a fixed baseline; it’s super inflated and confusing. Are there better alternatives?

11 likes · 2 replies · 0 reposts
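
Editor's note: the post above asks for alternatives and does not name one. As a rough illustration of why an arithmetic mean of per-task speedups can look inflated, the sketch below compares it with two commonly used summaries, the geometric mean and the ratio of total runtimes. The timings are hypothetical and chosen only to make the effect visible; nothing here comes from a real benchmark.

```python
import math

# Hypothetical per-task runtimes in seconds (illustrative only).
baseline_times = [1.0, 1.0, 1.0, 1.0]
candidate_times = [0.1, 1.0, 1.0, 10.0]  # one 10x win, one 10x regression

# Per-task speedup = baseline time / candidate time.
speedups = [b / c for b, c in zip(baseline_times, candidate_times)]

def arithmetic_mean(xs):
    return sum(xs) / len(xs)

def geometric_mean(xs):
    # exp of the mean log; treats a 10x win and a 10x loss symmetrically.
    return math.exp(sum(math.log(x) for x in xs) / len(xs))

def total_time_ratio(base, cand):
    # Speedup of the whole workload if every task runs exactly once.
    return sum(base) / sum(cand)

arith = arithmetic_mean(speedups)                          # about 3.0
geo = geometric_mean(speedups)                             # about 1.0
total = total_time_ratio(baseline_times, candidate_times)  # about 0.33

print(f"arithmetic mean speedup: {arith:.3f}")
print(f"geometric mean speedup:  {geo:.3f}")
print(f"total-time ratio:        {total:.3f}")
```

In this made-up case the arithmetic mean suggests a roughly 3x improvement, while the geometric mean says the wins and losses cancel and the total-time ratio says the workload as a whole got slower. Which summary is appropriate depends on what the benchmark is meant to measure.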

Alex Zhang

@a1zhang

5 months ago

Very much a noob question, but for benchmarking CUDA code speed we generally have to clear caches so multiple repeated runs are fair. If I were to benchmark CPU code speed (e.g. on AlgoTune), does a similar principle apply? And how easy is it to do this in say Python?

25 likes · 1 reply · 0 reposts
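
Editor's note: the question above is left open in the feed. One rough way to approximate "cold cache" CPU timing from pure Python is to walk a buffer larger than the last-level cache before each timed run. The sketch below assumes a 64 MiB buffer and 64-byte cache lines, both of which are guesses about the machine; `make_evictor` and `bench_cold` are hypothetical helper names, and the approach is best-effort, since Python gives no direct control over CPU caches.

```python
import statistics
import time

def make_evictor(n_bytes=64 * 1024 * 1024, line=64):
    """Return a function that touches one byte per assumed cache line.

    Assumption: n_bytes exceeds the last-level cache size, so walking
    the buffer pushes most previously cached data out.
    """
    buf = bytearray(n_bytes)

    def evict():
        for i in range(0, n_bytes, line):
            buf[i] = (buf[i] + 1) % 256  # write so the line is really touched

    return evict

def bench_cold(fn, repeats=5, evict=None):
    """Median wall-clock time of fn(), evicting caches before each run."""
    evict = evict or make_evictor()
    samples = []
    for _ in range(repeats):
        evict()                      # not included in the timed region
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)

if __name__ == "__main__":
    data = list(range(1_000_000))
    print(f"median cold-ish time: {bench_cold(lambda: sum(data)):.6f} s")
```

Interpreter overhead, the allocator, and OS scheduling all add noise on top of cache effects, so it is worth reporting the median over several runs and, where it matters, comparing against a warm-cache measurement.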

SWE-bench

@swebench

5 months ago

SWE-agent is now Multimodal! 😎 We're releasing SWE-agent Multimodal, with image-viewing abilities and a full web browser for debugging front-ends. Evaluate your LMs on SWE-bench Multimodal or use it yourself for front-end dev. 🔗➡️

14 likes · 1 reply · 6 reposts

Alex Zhang

@a1zhang

5 months ago

sadly won’t be at ICML but have 2 papers that you should check out! KernelBench which Simon Guo 🦝 will be presenting at the main conference ^_^ + the GPU MODE leaderboard’s OSS infra at the CODEML workshop (7/19) that Matej Sirovatka will be giving an oral for! Lots of 🍿!!

50 likes · 2 replies · 10 reposts

Alex Zhang

@a1zhang

5 months ago

New GPU MODE x Jane Street 1-day GPU programming hackathon in-person in NYC! Talks by the wonderful Tri Dao, Soumith Chintala, and other PyTorch folks! If you're at #ICML25 check out more information at the Jane Street booth! Register by Aug 17: bit.ly/3TS0d9I?r=qr

71 likes · 0 replies · 6 reposts

Alex Zhang

@a1zhang

5 months ago

Bro actually denied OpenAI an AlphaGo moment LOL Psyho is him. Huge congrats👏👏

424 likes · 8 replies · 14 reposts

Alex Zhang

@a1zhang

5 months ago

If you’re staying for the #ICML2025 workshops, you should definitely go to Matej Sirovatka’s talk today on the infra and design of GPU MODE’s OSS GPU leaderboard. He has a lot of interesting stuff to share :D

64 likes · 0 replies · 7 reposts