nostalgebraist
@nostalgebraist
ID: 446638118
https://nostalgebraist.tumblr.com 26-12-2011 00:11:56
752 Tweet
1,1K Takipçi
405 Takip Edilen
Can LMs learn to faithfully describe their internal features and mechanisms? In our new paper led by Research Fellow Belinda Li, we find that they can—and that models explain themselves better than other models do.