mike64_t (@mike64_t) Twitter Tweets • TwiCopy

tenderizzation

@tenderizzation

5 months ago

NCCL sending the loss value from the last pipeline parallel stage back to rank 0 so the user can print it

Likes: 321 · Replies: 6 · Reposts: 20

mike64_t

@mike64_t

5 months ago

wellp and there goes the austria doesn’t have school shootings streak. lots of things we can learn from the US, but this ain’t one of them

Likes: 5 · Replies: 1 · Reposts: 0

MFA Austria

@mfa_austria

5 months ago

Statement by Foreign Minister Meinl-Reisinger (@BMeinl) on the school shooting in #Graz

Likes: 226 · Replies: 44 · Reposts: 65

It's easy to blame graphics as it's visible, but that's not the reason for modern software slowness. The reason is CPU cache misses, mutex waits and stupid slow code everywhere that nobody knows exists, because they never run a debugger or profiler.

Likes: 293 · Replies: 10 · Reposts: 18

Matías N. Goldberg

@matiasgoldberg

5 months ago

@panoskarabelas1 Thank you for the inspiration

Likes: 25 · Replies: 0 · Reposts: 4

tenderizzation

@tenderizzation

5 months ago

primeintellect sending a tensor through PCI-E to host memory, through ethernet/TCP/IP across a continent to another node's host memory, and back down through PCI-E again

Likes: 781 · Replies: 22 · Reposts: 29

mike64_t

@mike64_t

5 months ago

can't even trust battle tested string diffing libraries to not break smh

Likes: 2 · Replies: 0 · Reposts: 0

mike64_t

@mike64_t

5 months ago

python3 beep_on_crash.py --pid 1362
gn everyone

Likes: 27 · Replies: 5 · Reposts: 0

mike64_t

@mike64_t

5 months ago

wow, no crash :)

Likes: 5 · Replies: 0 · Reposts: 0

TBPN

@tbpn

5 months ago

"You can break AI down into 5 tiers." - George Hotz 🌑 "Data centers - tier 1, fabs - tier 2, Nvidia/AMD - tier 3, OpenAI/Anthropic - tier 4, and completely worthless things like Cursor and Windsurf, which are tier 5." "OpenAI and Anthropic will eat all the value from the

Likes: 2.2K · Replies: 124 · Reposts: 155

Yann LeCun

@ylecun

5 months ago

It is intuitively obvious that reasoning in continuous embedding space is dramatically more powerful than reasoning in discrete token space. This paper from Yuandong Tian and team shows that it is the case theoretically.

Likes: 1.1K · Replies: 83 · Reposts: 156