@dakingrai: [1/6] Mechanistic Interpretability (MI) is an emerging sub-field of interpretability that aims to understand LMs by reverse-engineering their underlying computation. Here we present a comprehensive survey curated specifically as a 𝐠𝐮𝐢𝐝𝐞 𝐟𝐨𝐫 𝐧𝐞𝐰𝐜𝐨𝐦𝐞𝐫𝐬 𝐭𝐨 𝐭𝐡𝐢𝐬

Daking Rai

@dakingrai

CS PhD Student @GeorgeMasonU

ID: 2828986548

Link: https://dakingrai.github.io/
Joined: 24-09-2014 00:53:57

29 Tweets

179 Followers

305 Following

a year ago

[1/6] Mechanistic Interpretability (MI) is an emerging sub-field of interpretability that aims to understand LMs by reverse-engineering their underlying computation. Here we present a comprehensive survey curated specifically as a 𝐠𝐮𝐢𝐝𝐞 𝐟𝐨𝐫 𝐧𝐞𝐰𝐜𝐨𝐦𝐞𝐫𝐬 𝐭𝐨 𝐭𝐡𝐢𝐬

308 likes, 4 replies, 58 reposts
