@sean_pixel : you can now improve RL models WITHOUT ANY TRAINING. Inspired by mechanistic interpretability for LLMs (cred: @jacobdunefsky @mlpowered @NeelNanda5), I applied sparse-transcoder methods to a CartPole policy and saw a +24% performance increase with zero additional training. (1/9) • TwiCopy

seanpixel 🫧

@sean_pixel

jack of some trades

ID: 1108856460304216064

http://seanpixel.com · 21-03-2019 22:22:36

1.1K Tweets

933 Followers

606 Following

seanpixel 🫧

@sean_pixel

2 months ago

you can now improve RL models WITHOUT ANY TRAINING. Inspired by mechanistic interpretability for LLMs (cred: Jacob Dunefsky, Emmanuel Ameisen, Neel Nanda), I applied sparse-transcoder methods to a CartPole policy and saw a +24% performance increase with zero additional training. (1/9)
