
Amil Dravid
@_amildravid
PhD @Berkeley_AI
ID: 1639674617378795520
http://avdravid.github.io
25-03-2023 17:04:54
170 Tweets
601 Followers
459 Following



How do LMs track what humans believe? In our new work, we show they use a pointer-like mechanism we call lookback. Super proud of this work by Nikhil Prakash and team! This is the most intricate piece of LM reverse engineering I’ve seen!
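
As a loose illustration of what a “pointer-like” mechanism can mean mechanically (a generic toy sketch, not the actual circuit from the paper; every name and size below is made up): a source position stores an address vector, a later token carries the same address as its attention query, and scaled dot-product attention dereferences the pointer to retrieve the payload.

```python
import numpy as np

# Toy "pointer dereference" with attention (illustrative only).
rng = np.random.default_rng(0)
d, seq_len = 64, 5                      # arbitrary toy sizes

keys = rng.normal(size=(seq_len, d))
values = rng.normal(size=(seq_len, d))  # payloads at each position

address = keys[1]                       # pretend position 1 holds the belief info
query = address                         # a later token "copies" the address

# Scaled dot-product attention: the self-matching key dominates.
scores = keys @ query / np.sqrt(d)
attn = np.exp(scores - scores.max())
attn /= attn.sum()
retrieved = attn @ values               # ~ values[1]: pointer dereferenced

print(int(np.argmax(attn)))             # -> 1
```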

“How will my model behave if I change the training data?” Recent(-ish) work w/ Logan Engstrom: we nearly *perfectly* predict ML model behavior as a function of training data, saturating benchmarks for this problem (called “data attribution”).
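
For readers unfamiliar with the setup: the standard linear-datamodel formulation of data attribution approximates a model’s output on a fixed test example as a linear function of which training points were included. The sketch below is a hypothetical stand-in, with train_and_eval faking the expensive train-on-subset step via a hidden linear ground truth; it is not the benchmark-saturating method from the thread.

```python
import numpy as np

rng = np.random.default_rng(0)
n_train, n_subsets = 100, 500       # toy scale

# Stub for "train a model on this subset, evaluate on one test example".
# Real data attribution replaces this with actual (re)training or an
# efficient approximation; here a hidden linear ground truth fakes it.
true_scores = rng.normal(size=n_train)
def train_and_eval(mask):
    return mask @ true_scores + rng.normal(scale=0.1)

# Sample random training subsets (inclusion masks) and record outputs.
masks = rng.random((n_subsets, n_train)) < 0.5
outputs = np.array([train_and_eval(m) for m in masks])

# Fit the linear datamodel: one weight per training example.
weights, *_ = np.linalg.lstsq(masks.astype(float), outputs, rcond=None)

# weights[i] estimates how much including example i moves the prediction,
# which lets you forecast behavior under counterfactual training sets.
```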



We’re proud to announce three new tenure-track assistant professors joining TTIC in Fall 2026: Yossi Gandelsman, Will Merrill, and Nick Tomlin. Meet them here: buff.ly/JH1DFtT


The Mechanistic Interpretability for Vision workshop @ CVPR2025 earlier this month was very informative and fun! Looking forward to seeing this community grow. Thank you to the speakers and organizers: trevordarrell, David Bau, Tamar Rott Shaham, Yossi Gandelsman, Joanna


Thank you very much to our wonderful speakers and attendees of Mechanistic Interpretability for Vision @ CVPR2025, who made the workshop a huge success. We hope to see you again next year! The workshop recording can be accessed at: youtu.be/LTh86RMAWsI?si…

In a recent paper, physicists used two predictable factors to reproduce the “creativity” seen in image-generating AI. Webb Wright reports: quantamagazine.org/researchers-un…
