Christopher Potts
@chrisgpotts
Stanford Professor of Linguistics and, by courtesy, of Computer Science, and member of @stanfordnlp and @StanfordAILab. He/Him/His.
ID: 408714449
http://web.stanford.edu/~cgpotts/ 09-11-2011 19:59:28
2.2K Tweets
11.11K Followers
633 Following
The Linear Representation Hypothesis is now widely adopted despite its highly restrictive nature. Here, Csordás Róbert, Atticus Geiger, Christopher Manning & I present a counterexample to the LRH and argue for more expressive theories of interpretability: arxiv.org/abs/2408.10920