Alek Dimitriev (@tensor_rotator) 's Twitter Profile
Alek Dimitriev

@tensor_rotator

Inference @Anthropic, prev Gemini @Google, prev prev PhD @UTAustin

ID: 727974502404083712

Link: http://alekdimi.github.io · Joined: 04-05-2016 21:33:41

377 Tweets

309 Followers

1.1K Following

Jiacheng Liu (@liujc1998) 's Twitter Profile Photo

Ever wondered what CAN'T be transformed by Transformers? 🪨

I wrote a fun blog post on finding "fixed points" of your LLMs. If you prompt it with a fixed point token, the LLM is gonna decode it repeatedly forever, guaranteed.

There's some connection with LLMs' repetition issue.
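A minimal way to see what a fixed-point token means in practice: greedy-decode starting from a single token and check whether the model keeps emitting that same token forever. The sketch below is an illustrative assumption, not the blog post's code; the model name, scan range, and step count are placeholders.

```python
# Sketch (assumption): check whether a token is a greedy-decoding fixed point,
# i.e. prompting with it makes the model emit the same token again and again.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder model, not the one from the blog post
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

def is_fixed_point(token_id: int, steps: int = 5) -> bool:
    """Greedy-decode from a single token and check it keeps repeating itself."""
    ids = torch.tensor([[token_id]])
    with torch.no_grad():
        for _ in range(steps):
            logits = model(ids).logits[0, -1]       # next-token logits
            nxt = int(torch.argmax(logits))          # greedy choice
            if nxt != token_id:
                return False
            ids = torch.cat([ids, torch.tensor([[nxt]])], dim=1)
    return True

# Scan a small slice of the vocabulary for fixed points (illustrative only).
candidates = [i for i in range(200) if is_fixed_point(i)]
print([(i, tok.decode([i])) for i in candidates[:10]])
```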
Stripe Press (@stripepress) 's Twitter Profile Photo

What is intelligence? What will it take to create AGI? What happens once we succeed? The Scaling Era: An Oral History of AI, 2019–2025 by Dwarkesh Patel and Gavin Leech explores the questions animating those at the frontier of AI research. It’s out today: press.stripe.com/scaling

Dwarkesh Patel (@dwarkesh_sp) 's Twitter Profile Photo

The Scaling Era is out today. I'm actually surprised by how well this format works, even better than my expectations. It's so interesting to read side-by-side how hyperscaler CEOs, AI researchers, and economists answer the same question. Thank you to the Stripe Press

Dylan Patel ✈️ ICLR (@dylan522p) 's Twitter Profile Photo

Today we are launching InferenceMAX! We have support from Nvidia, AMD, OpenAI, Microsoft, PyTorch, SGLang, vLLM, Oracle, CoreWeave, TogetherAI, Nebius, Crusoe, HPE, SuperMicro, and Dell. It runs every day on the latest software (vLLM, SGLang, etc.) across hundreds of GPUs, $10Ms of

Sham Kakade (@shamkakade6) 's Twitter Profile Photo

1/8 Second Order Optimizers like SOAP and Muon have shown impressive performance on LLM optimization. But are we fully utilizing the potential of second order information? New work: we show that a full second order optimizer is much better than existing optimizers in terms of

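For intuition on what full second-order information buys, here is a toy comparison on a quadratic: one Newton step using the exact Hessian lands on the minimizer, while plain gradient descent converges at a rate tied to the Hessian's conditioning. This is a hedged illustration of the general idea only, not the optimizer from the thread; SOAP and Muon use their own approximate preconditioning schemes.

```python
# Toy sketch (assumption): full second-order (Newton) step vs. gradient descent
# on f(x) = 0.5 x^T H x - b^T x, whose exact minimizer is x* = H^{-1} b.
import torch

torch.manual_seed(0)
A = torch.randn(5, 5)
H = A @ A.T + 5 * torch.eye(5)   # symmetric positive definite Hessian
b = torch.randn(5)

def grad(x):
    return H @ x - b

x_star = torch.linalg.solve(H, b)

# One Newton step from the origin solves the quadratic exactly.
x_newton = torch.zeros(5)
x_newton = x_newton - torch.linalg.solve(H, grad(x_newton))

# Gradient descent needs many steps, with a step size set by the largest eigenvalue.
x_gd = torch.zeros(5)
lr = 1.0 / torch.linalg.eigvalsh(H).max()
for _ in range(100):
    x_gd = x_gd - lr * grad(x_gd)

print("Newton error after 1 step:", torch.norm(x_newton - x_star).item())
print("GD error after 100 steps:  ", torch.norm(x_gd - x_star).item())
```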