mconcat (@monoidconcat)'s Twitter Profile
mconcat

@monoidconcat

Modernist

ID: 1275335760451796992

23-06-2020 07:52:40

3.3K Tweets

274 Followers

212 Following

maharshi (@mrsiipa)

update: wrote a triton kernel for this
- has correct tiled layout for scale factors
- uses inline ptx for conversion
- 4x faster than torch compiled version

triton is absolutely amazing for writing memory bound kernels tbh

Neel Nanda (@neelnanda5)

New video: What happened with sparse autoencoders? SAEs were a big craze in mech interp, then suddenly weren't. In this talk, I give the story of SAEs as I experienced it, reflect on mistakes I made, how I think about them now and ways they're over AND under hyped and next steps

Nathan Lambert (@natolambert)

We present Olmo 3, our next family of fully open, leading language models. 
This family of 7B and 32B models represents:

1. The best 32B base model.
2. The best 7B Western thinking & instruct models.
3. The first 32B (or larger) fully open reasoning model.

This is a big

Aryaman Arora (@aryaman2020)

🫡 new paper

neurons can be a sparse and interpretable basis for circuit tracing, once you make the right decisions about which neurons and how you circuit trace!

i'm excited for how this affects future progress on circuits + automating interp

mconcat (@monoidconcat)

Okay finally assembled my AI workstation tower, with one RTX Pro 6000 and two RTX 5090s.

Did some test run, satisfied so far.

Nathan Odle (@mov_axbx)

Getting close to powering up 4x RTX Pro 6000 WS.

Using a board with MCIO vs PCIe risers made this build so, so much cleaner than 7x4090.

Fourth PSU is for a couple more GPUs.