Sotiris Anagnostidis
@sanagnostidis
PhD at ETH Zürich. MLP-pilled 💊. Previously @Meta GenAI, @GoogleDeepMind, @Huawei, @ntua
ID: 1471221876693377038
http://sanagnos.pages.dev/ 15-12-2021 20:53:39
51 Tweets
165 Followers
455 Following
Outlier Features (OFs) aka “neurons with big features” emerge in standard transformer training & prevent benefits of quantisation 🥲 but why do OFs appear & which design choices minimise them? Our new work (+ Lorenzo Noci, Daniele Paliotta, Imanol Schlag, T. Hofmann) takes a look 👀🧵
Are LLMs easily influenced? Interesting work from Sotiris Anagnostidis and Jannis Bulian. TLDR: Having an LLM advocate for a question answer in the prompt significantly influences predictions
Ever wondered how the loss landscape of Transformers differs from that of other architectures? Or which Transformer components make its loss landscape unique? With Sidak Pal Singh & Felix Dangel, we explore this via the Hessian in our #ICLR2025 spotlight paper! Key insights👇 1/8