Tamar Rott Shaham (@tamarrottshaham) 's Twitter Profile
Tamar Rott Shaham

@tamarrottshaham

Postdoctoral fellow at @MIT_csail

ID: 1028916177584762880

Link: https://tamarott.github.io/
Joined: 13-08-2018 08:08:27

151 Tweets

500 Followers

279 Following

Mechanistic Interpretability for Vision @ CVPR2025 (@miv_cvpr2025) 's Twitter Profile Photo

Mechanistic Interpretability for Vision Workshop has officially begun #CVPR2025! 🚀

Join us at Grand C1 Hall for insightful perspectives on the state of interpretability in vision models by Tamar Rott Shaham.
Mechanistic Interpretability for Vision @ CVPR2025 (@miv_cvpr2025) 's Twitter Profile Photo

Don't miss out on Sonia's perspective (and live coding demo) about Prisma: an amazing open-source toolkit for vision and video interpretability.

Happening right now: Grand C1 Hall (on level 4) #CVPR2025
Leshem Choshen C U @ ICLR 🤖🤗 (@lchoshen) 's Twitter Profile Photo

🚀 Technical practitioners & grads — join to build an LLM evaluation hub!

Infra Goals:
🔧 Share evaluation outputs & params
📊 Query results across experiments

Perfect for 🧰 hands-on folks ready to build tools the whole community can use

Join the EvalEval Coalition here 👇
David Bau (@davidbau) 's Twitter Profile Photo

How do you discover the ethical values of an AI when those values are revealed by what the AI *refuses* to say? In his preprint, Can Rager develops a procedure for crawling refusals. It reveals huge differences between models from different countries! We should all audit our AI systems.

Ekin Akyürek (@akyurekekin) 's Twitter Profile Photo

There are three types of storage: activations (in-context), external memory, and model weights. If models are going to spend days on a task, they should be really good at compiling their in-context work into an external memory or into their weights! Here we try to learn weights

Jacob Andreas (@jacobandreas) 's Twitter Profile Photo

👉 New preprint on a new family of Transformer-type models whose depth scales logarithmically with sequence length. Enables:
- fast training
- fast decoding
- large memory capacity in associative recall
- strong length generalization on state tracking

Tamar Rott Shaham (@tamarrottshaham) 's Twitter Profile Photo

How do LMs track what humans believe? In our new work, we show they use a pointer-like mechanism we call lookback. Super proud of this work by Nikhil Prakash and team! This is the most intricate piece of LM reverse engineering I’ve seen!

David Bau (@davidbau) 's Twitter Profile Photo

The new "Lookback" paper from Nikhil Prakash contains a surprising insight... 70b/405b LLMs use double pointers! Akin to C programmers' double (**) pointers. They show up when the LLM is "knowing what Sally knows Ann knows", i.e., Theory of Mind. x.com/nikhil07prakas…
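The C analogy can be made concrete with a toy snippet. Everything here (the belief variables, the helper function) is purely illustrative and not from the paper:

```c
#include <assert.h>
#include <string.h>

/* Sally's first-order belief about where the marble is. */
static const char *sally_thinks = "basket";

/* Ann's model of Sally's belief: a pointer to Sally's pointer (**),
 * mirroring the nested "knowing what Sally knows" structure. */
static const char *const *ann_thinks_sally_thinks = &sally_thinks;

/* Peeling off one level of indirection answers "what does Ann think
 * Sally thinks?" -- each dereference resolves one layer of belief. */
static const char *what_ann_thinks_sally_thinks(void) {
    return *ann_thinks_sally_thinks;
}
```

The point of the analogy: just as a `**` pointer stores the address of another pointer rather than the data itself, the model represents a reference to another agent's reference.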

Koyena Pal (@kpal_koyena) 's Twitter Profile Photo

🚨 Registration is live! 🚨

The New England Mechanistic Interpretability (NEMI) Workshop is happening August 22nd 2025 at Northeastern University!

A chance for the mech interp community to nerd out on how models really work 🧠🤖

🌐 Info: nemiconf.github.io/summer25/
📝 Register:
Fazl Barez (@fazlbarez) 's Twitter Profile Photo

Excited to share our paper: "Chain-of-Thought Is Not Explainability"!

We unpack a critical misconception in AI: models explaining their Chain-of-Thought (CoT) steps aren't necessarily revealing their true reasoning. Spoiler: transparency of CoT can be an illusion. (1/9) 🧵
Shivam Duggal (@shivamduggal4) 's Twitter Profile Photo

Compression is the heart of intelligence.
From Occam to Kolmogorov — shorter programs = smarter representations.

Meet KARL: Kolmogorov-Approximating Representation Learning.

Given an image, a token budget T & a target quality 𝜖, KARL finds the smallest t ≤ T to reconstruct it within 𝜖 🧵
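The stated objective (smallest t ≤ T whose reconstruction stays within 𝜖) can be sketched as a simple search. Everything below, including the toy error model, is an illustrative assumption rather than KARL's actual procedure:

```c
#include <assert.h>

/* Toy stand-in for reconstruction quality: error shrinks as the token
 * budget t grows. A real system would decode the image from t tokens
 * and measure distortion against the original. */
static double recon_error(int t) {
    return 1.0 / (double)t;
}

/* Return the smallest token count t <= T whose reconstruction error is
 * within eps, or -1 if even the full budget T is not enough. */
static int smallest_budget(int T, double eps) {
    for (int t = 1; t <= T; t++) {
        if (recon_error(t) <= eps) {
            return t;
        }
    }
    return -1;
}
```

Because the toy error is monotone in t, a binary search over [1, T] would find the same budget in O(log T) quality checks.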
Zhijing Jin✈️ ICLR Singapore (@zhijingjin) 's Twitter Profile Photo

[Implicit Personalization of #LLMs] How do we answer the question "What colo(u)r is a football?" Answer 1: "Brown🏈". Answer 2: "Black and white⚽". We propose a #Causal framework to test whether LLMs adjust their answers depending on the cultural background inferred from the question.

Inbar Huberman-Spiegelglas (@inbarhub) 's Twitter Profile Photo

📷 FlowEdit has been accepted to #ICCV2025!

Edit real images with text-to-image flow models!

Check out:
code: github.com/fallenshock/Fl…
webpage: matankleiner.github.io/flowedit/
space to edit your images: huggingface.co/spaces/fallens…
great ComfyUI plugins (logtd): matankleiner.github.io/flowedit/#comfy
Owain Evans (@owainevans_uk) 's Twitter Profile Photo

New paper & surprising result.
LLMs transmit traits to other models via hidden signals in data.
Datasets consisting only of 3-digit numbers can transmit a love for owls, or evil tendencies. 🧵
Mehul Damani @ ICLR (@mehuldamani2) 's Twitter Profile Photo

🚨 New Paper! 🚨
We trained reasoning LLMs to reason about what they don't know.

o1-style reasoning training improves accuracy but produces overconfident models that hallucinate more.

Meet RLCR: a simple RL method that trains LLMs to reason and reflect on their uncertainty --
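One simple way to score both correctness and stated confidence is a Brier-style reward: correctness minus the squared gap between the model's confidence and the truth. This formula is an illustrative assumption, not necessarily the exact RLCR objective:

```c
#include <assert.h>

/* Calibration-aware reward sketch: a correct answer earns up to 1.0,
 * and any mismatch between stated confidence (in [0, 1]) and actual
 * correctness is penalized quadratically. Overconfident wrong answers
 * score worst, which discourages confident hallucination. */
static double calibration_reward(int correct, double confidence) {
    double c = correct ? 1.0 : 0.0;
    double gap = confidence - c;
    return c - gap * gap;
}
```

Under this scoring, a model maximizes expected reward by reporting its true probability of being correct, which is what makes the penalty a calibration incentive rather than just an accuracy bonus.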
Bruno Mlodozeniec (@kayembruno) 's Twitter Profile Photo

NeurIPS Conference, why take away authors' option to provide figures in rebuttals during the rebuttal period? Grounding the discussion in hard evidence (like plots) makes resolving disagreements much easier for both authors and reviewers.

Left: NeurIPS