Mor Geva (@megamor2) 's Twitter Profile
Mor Geva

@megamor2

ID: 850356925535531009

Link: https://mega002.github.io/ | Joined: 07-04-2017 14:37:44

450 Tweets

1.1K Followers

509 Following

Sohee Yang (@soheeyang_) 's Twitter Profile Photo

Excited to share that the code and datasets for our papers on latent multi-hop reasoning are finally available on GitHub: github.com/google-deepmin… We hope these resources support further research in this area. Thanks for your patience as we worked through the release process!

Mor Geva (@megamor2) 's Twitter Profile Photo

At two years old, she learned to say "siren" long before she learned to say nicer words like "pashtida" (casserole) or "omelette". Now she has also started to suffer from over-generalization and calls the sounds of an ambulance or of children shouting "siren".

neuronpedia (@neuronpedia) 's Twitter Profile Photo

Announcement: we're open sourcing Neuronpedia! 🚀 This includes all our mech interp tools: the interpretability API, steering, UI, inference, autointerp, search, plus 4 TB of data - cited by 35+ research papers and used by 50+ write-ups. What you can do with OSS Neuronpedia: 🧵

Tal Haklay (@tal_haklay) 's Twitter Profile Photo

🚨 Call for Papers is Out! The First Workshop on Actionable Interpretability will be held at ICML 2025 in Vancouver! 📅 Submission Deadline: May 9. Follow us >> Actionable Interpretability Workshop ICML2025 🧠 Topics of interest include: 👇

Actionable Interpretability Workshop ICML2025 (@actinterp) 's Twitter Profile Photo

🚨 We're looking for reviewers for the workshop! If you're passionate about making interpretability useful and want to help shape the conversation, we'd love your input. Sign up to review >> 💡🔍

Hadas Orgad (@orgadhadas) 's Twitter Profile Photo

Position papers wanted! For the First Workshop on Actionable Interpretability, we're looking for diverse perspectives on the state of the field. Should certain areas of interpretability research be developed further? Are there key metrics we should prioritize? Or do you have >>

Jiuding Sun (@jiudingsun) 's Twitter Profile Photo

💨 A new architecture for automating mechanistic interpretability with causal interchange interventions! #ICLR2025 🔬 Neural networks are particularly good at discovering patterns in high-dimensional data, so we trained them to ... interpret themselves! 🧑‍🔬 1/4

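The "causal interchange intervention" mentioned above is, at its core, an activation-patching operation: run the model on a source input, cache a hidden state, and splice it into a run on a base input to see how the prediction changes. The sketch below shows only that basic operation, not the paper's trained-interpreter architecture; the model (gpt2), layer index, and prompts are illustrative assumptions.

```python
# Minimal sketch of an interchange intervention (activation patching), assuming
# GPT-2 from Hugging Face transformers; the layer index and prompts are made up
# for illustration and this is NOT the paper's trained-interpreter setup.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

LAYER = 6  # arbitrary block to intervene on

def hidden_after_layer(prompt: str) -> torch.Tensor:
    """Return the hidden states right after transformer block LAYER."""
    ids = tok(prompt, return_tensors="pt").input_ids
    with torch.no_grad():
        out = model(ids, output_hidden_states=True)
    # hidden_states[0] is the embedding output, so block LAYER's output is index LAYER + 1
    return out.hidden_states[LAYER + 1]

def patched_next_token(base_prompt: str, source_hidden: torch.Tensor) -> str:
    """Re-run base_prompt with block LAYER's output at the last position swapped in."""
    ids = tok(base_prompt, return_tensors="pt").input_ids

    def hook(module, inputs, output):
        hidden = output[0].clone()
        hidden[:, -1, :] = source_hidden[:, -1, :]  # the interchange itself
        return (hidden,) + output[1:]

    handle = model.transformer.h[LAYER].register_forward_hook(hook)
    try:
        with torch.no_grad():
            logits = model(ids).logits[:, -1, :]
    finally:
        handle.remove()
    return tok.decode(logits.argmax(-1))

source = hidden_after_layer("The Eiffel Tower is located in the city of")
print(patched_next_token("The Colosseum is located in the city of", source))
```

If the patched run's prediction flips toward the source prompt's answer, the intervened layer and position carry the relevant information; automating that search is what the announced architecture targets.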
Aran Komatsuzaki (@arankomatsuzaki) 's Twitter Profile Photo

The Leaderboard Illusion - Identifies systematic issues that have resulted in a distorted playing field of Chatbot Arena - Identifies 27 private LLM variants tested by Meta in the lead-up to the Llama-4 release

Shiqi Chen (@shiqi_chen17) 's Twitter Profile Photo

🚀🔥 Thrilled to announce our ICML25 paper: "Why Is Spatial Reasoning Hard for VLMs? An Attention Mechanism Perspective on Focus Areas"! We dive into the core reasons behind spatial reasoning difficulties for Vision-Language Models from an attention mechanism view. Paper:

Hadas Orgad (@orgadhadas) 's Twitter Profile Photo

Deadline extended! ⏳ The Actionable Interpretability Workshop at #ICML2025 has moved its submission deadline to May 19th. More time to submit your work 🔍🧠✨ Don't miss out!

clem ๐Ÿค— (@clementdelangue) 's Twitter Profile Photo

This is the coolest dataset I've seen on Hugging Face today: action labels of 1v1 races in Gran Turismo 4 to train a multiplayer world model! Great stuff by Enigma!

Percy Liang (@percyliang) 's Twitter Profile Photo

What would truly open-source AI look like? Not just open weights, open code/data, but *open development*, where the entire research and development process is public *and* anyone can contribute. We built Marin, an open lab, to fulfill this vision:

Mor Geva (@megamor2) 's Twitter Profile Photo

Help!! We got way more submissions than expected, and are now looking for reviewers! Please sign up in this form if you can do 2-3 reviews in the next few weeks 👇 docs.google.com/forms/d/e/1FAI…

Yoav Gur Arieh (@guryoav) 's Twitter Profile Photo

Can we precisely erase conceptual knowledge from LLM parameters? Most methods are shallow, coarse, or overreach, adversely affecting related or general knowledge. We introduce PISCES, a general framework for Precise In-parameter Concept EraSure. 🧵 1/

Mor Geva (@megamor2) 's Twitter Profile Photo

Removing knowledge from LLMs is HARD. Yoav Gur Arieh proposes a powerful approach that disentangles the MLP parameters to edit them in high resolution and remove target concepts from the model. Check it out!
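As context for the in-parameter editing idea described above, here is a hedged sketch of one generic way to remove a concept direction from an MLP's down-projection weights via a rank-one projection. This is not the PISCES method itself; the shapes, the weight name W_down, and the random "concept" direction are illustrative assumptions.

```python
# Hedged sketch (NOT the PISCES method): remove a concept direction v from an
# MLP down-projection matrix with a rank-one projection, so the layer can no
# longer write that direction into the residual stream. All names, shapes, and
# the random "concept" direction are illustrative assumptions.
import torch

def project_out(W_down: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
    """Zero the component along v in every column of W_down.

    W_down: (d_model, d_mlp) weights that write MLP outputs to the residual stream.
    v:      (d_model,) concept direction (normalized inside).
    """
    v = v / v.norm()
    # (I - v v^T) W_down removes the v-component from each output column
    return W_down - torch.outer(v, v @ W_down)

# Toy usage with GPT-2-like shapes
d_model, d_mlp = 768, 3072
W = torch.randn(d_model, d_mlp)
concept = torch.randn(d_model)
W_edited = project_out(W, concept)

# After the edit, the weights have no component along the concept direction
residual = (concept / concept.norm()) @ W_edited
print(residual.abs().max())  # ~0 up to floating-point error
```

Editing weights rather than activations means the concept cannot be written to the residual stream on any input; per the tweet above, PISCES goes further by first disentangling the MLP parameters so the edit can be made in high resolution rather than as a single crude projection.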