Ekdeep Singh Lubana (@ekdeepl)'s Twitter Profile
Ekdeep Singh Lubana

@ekdeepl

Postdoc at CBS-NTT Program on Physics of Intelligence, Harvard University.

ID: 944451685711273984

Website: http://ekdeepslubana.github.io

Joined: 23-12-2017 06:16:43

445 Tweets

1.1K Followers

1.1K Following

🚨 New paper alert! Linear representation hypothesis (LRH) argues concepts are encoded as **sparse sum of orthogonal directions**, motivating interpretability tools like SAEs. But what if some concepts don’t fit that mold? Would SAEs capture them? 🤔 1/11