TimDarcet (@timdarcet)'s Twitter Profile
TimDarcet

@timdarcet

PhD student, building big vision models @ INRIA & FAIR (Meta)

ID: 1371396662925606913

Joined: 15-03-2021 09:44:31

982 Tweets

3.3K Followers

728 Following

Summary of "Massive activations in LLMs":
- "artifact" tokens are in all transformers, ViTs and LLMs
- their weirdness is ~only on 1 channel
- they are the same as the quantization outliers
- their purpose is *not* global information
- there's a fix simpler than registers