Sumedh Hindupur (@sumedh_hrs)'s Twitter Profile
Sumedh Hindupur

@sumedh_hrs

Ph.D. student at Harvard

ID: 1738250330805276672

22-12-2023 17:29:26

9 Tweets

36 Followers

58 Following

Ekdeep Singh Lubana (@ekdeepl)

🚨 New paper alert! Linear representation hypothesis (LRH) argues concepts are encoded as **sparse sum of orthogonal directions**, motivating interpretability tools like SAEs. But what if some concepts don’t fit that mold? Would SAEs capture them? 🤔 1/11