mconcat (@monoidconcat) Twitter Tweets • TwiCopy

maharshi

a month ago

update: wrote a triton kernel for this
- has correct tiled layout for scale factors
- uses inline ptx for conversion
- 4x faster than torch compiled version

triton is absolutely amazing for writing memory bound kernels tbh

415 likes · 17 replies · 19 reposts

kalomaze

@kalomaze

a month ago

RL LEARNING WITH LORA: A DIVERSE DEEP DIVE

1.1K likes · 21 replies · 91 reposts

Neel Nanda

@neelnanda5

a month ago

New video: What happened with sparse autoencoders? SAEs were a big craze in mech interp, then suddenly weren't. In this talk, I give the story of SAEs as I experienced it, reflect on mistakes I made, how I think about them now and ways they're over AND under hyped and next steps

406 likes · 7 replies · 22 reposts

Neel Nanda

@neelnanda5

a month ago

Video: youtu.be/Tgq7E4YcPKQ

81 likes · 0 replies · 3 reposts

Nathan Lambert

@natolambert

a month ago

We present Olmo 3, our next family of fully open, leading language models. This family of 7B and 32B models represents:
1. The best 32B base model.
2. The best 7B Western thinking & instruct models.
3. The first 32B (or larger) fully open reasoning model.
This is a big

1.1K likes · 79 replies · 310 reposts

Aryaman Arora

@aryaman2020

25 days ago

🫡 new paper

neurons can be a sparse and interpretable basis for circuit tracing, once you make the right decisions about which neurons and how you circuit trace!

i'm excited for how this affects future progress on circuits + automating interp

191 likes · 5 replies · 14 reposts

mconcat

@monoidconcat

13 days ago

Okay finally assembled my AI workstation tower, with one RTX Pro 6000 and two RTX 5090s. Did some test run, satisfied so far.

5 likes · 1 reply · 0 reposts

Kat ⊷ the Poet Engineer

@poetengineer__

13 days ago

playing with fluid particle simulation 🌊🖐️

6.6K likes · 66 replies · 488 reposts

Kat ⊷ the Poet Engineer

@poetengineer__

8 days ago

a magnetic field simulation between the fingers 🧲🤏

2.2K likes · 37 replies · 201 reposts

Nathan Odle

@mov_axbx

3 days ago

Getting close to powering up 4x RTX Pro 6000 WS. Using a board with MCIO vs PCIe risers made this build so, so much cleaner than 7x4090. Fourth PSU is for a couple more GPUs.