Itamar Zimerman (@itamarzimerman)'s Twitter Profile
Itamar Zimerman

@itamarzimerman

PhD candidate @ Tel Aviv University.
AI Research scientist @ IBM Research.
Interested in deep learning and algorithms.

ID: 825803747649544195

Link: https://itamarzimm.github.io/ · Joined: 29-01-2017 20:32:11

107 Tweets

406 Followers

436 Following

Itamar Zimerman (@itamarzimerman):

Assaf's analysis of recurrent LLMs such as Mamba and RWKV is important. While these models are designed to be efficient for long-context tasks, their effectiveness remains limited due to memory overflows, even at large scale. More details about memory overflows in the paper 📜🧵

Yoni Slutzky (@yonislutzky):

How do information flow patterns in Mamba compare to those in Transformers? 🚨 Our new #ACL2025 paper pits Mamba-1 and Mamba-2 against other Transformer-based models and uncovers both universal and architecture-specific information flow patterns. arxiv.org/abs/2505.24244 🧵
๐š๐”ช๐Ÿพ๐šก๐šก๐Ÿพ (@gm8xx8) 's Twitter Profile Photo

Revisiting LRP: Positional Attribution as the Missing Ingredient for Transformer Explainability. Simple fixes: Problem — all LRP-based XAI tools ignore PE, so relevance gets lost. Fix — model each input as a (token, position) pair and add PE-aware LRP rules.
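The "Fix" described above treats each input as a (token, position) pair so positional encodings get an explicit relevance share. As a toy illustration only (not the paper's actual PE-aware rules), one could split the relevance of an embedding x = tok_emb + pos_emb between its two addends with LRP's epsilon-stabilized proportional rule for sums:

```python
import numpy as np

def split_relevance(R, tok_emb, pos_emb, eps=1e-6):
    """Distribute relevance R over the addends of x = tok_emb + pos_emb,
    proportionally to each addend's contribution (epsilon-stabilized).
    Toy sketch; the paper's actual PE-aware LRP rules may differ."""
    x = tok_emb + pos_emb
    denom = x + eps * np.where(x >= 0, 1.0, -1.0)  # avoid division by zero
    R_tok = R * tok_emb / denom
    R_pos = R * pos_emb / denom
    return R_tok, R_pos

# Relevance is conserved (R_tok + R_pos ≈ R), and the positional part now
# receives an explicit share instead of being silently dropped.
tok = np.array([1.0, 2.0, -0.5])
pos = np.array([0.5, -1.0, 0.25])
R = np.array([3.0, 0.5, 1.0])
R_tok, R_pos = split_relevance(R, tok, pos)
assert np.allclose(R_tok + R_pos, R, atol=1e-4)
```

The proportional-split choice follows the standard LRP treatment of sum nodes; conservation of total relevance is the property that plain token-only attribution loses when PE is ignored.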
Rohan Paul (@rohanpaul_ai):

Reasoning depth of LLMs can now match task size, not a fixed budget. Reasoning models waste compute and often slip because their hidden chain of thought keeps running after the answer is clear. The authors learn an internal progress meter and nudge it so the model stops as
Boaz Lavon (@boazlavon):

LLMs Don't Think Like Developers - Until Now. Together with Shahar Katz and Lior Wolf, we made LLMs execute their code while generating it, just like a human developer. Meet EG-CFG: a new inference-time method that injects real-time execution feedback into the generation loop.
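The core primitive behind "execution feedback in the generation loop" can be sketched generically: run the candidate code, capture its output or traceback, and feed that text back into the model's context. This is a minimal illustration of the idea, not EG-CFG's actual guidance mechanism:

```python
import subprocess
import sys
import tempfile

def execution_feedback(code: str, timeout: float = 5.0) -> str:
    """Run a candidate Python snippet in a subprocess and return its stdout
    on success or its stderr on failure. A generation loop could append this
    string to the prompt before the next decoding step.
    (Generic sketch; not EG-CFG's actual algorithm.)"""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    try:
        proc = subprocess.run(
            [sys.executable, path],
            capture_output=True, text=True, timeout=timeout,
        )
        return proc.stdout if proc.returncode == 0 else proc.stderr
    except subprocess.TimeoutExpired:
        return "TimeoutExpired"

ok = execution_feedback("print(2 + 2)")
err = execution_feedback("1 / 0")
```

Running untrusted model-generated code this way should of course be sandboxed in practice; the subprocess-plus-timeout pattern here only bounds runtime, not side effects.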
Hila Chefer (@hila_chefer):

Exciting news from #ICML2025 & #ICCV2025 🥳
- 🥇 VideoJAM accepted as *oral* at #ICML2025 (top 1%)
- Two talks at #ICCV2025: ☝️ interpretability in the generative era ✌️ video customization
- Organizing two #ICCV2025 workshops: ☝️ structural priors for vision ✌️ long video gen
🧵👇