Roy Schwartz (@royschwartznlp)'s Twitter Profile
Roy Schwartz

@royschwartznlp

Senior Lecturer at @CseHuji. #NLPROC

ID: 4883662141

Link: https://schwartz-lab-huji.github.io/
Joined: 09-02-2016 15:30:51

246 Tweets

2.2K Followers

379 Following

Aran Komatsuzaki (@arankomatsuzaki)'s Twitter Profile Photo

Transformers are Multi-State RNNs

Shows that decoder-only transformers can be conceptualized as infinite multi-state RNNs—an RNN variant with unlimited hidden state size

arxiv.org/abs/2401.06104
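
The claim is concrete enough to sketch. Below is a minimal illustration (mine, not the paper's code; all names are made up) of one attention head viewed as an RNN cell whose "multi-state" is the list of past key/value pairs: the state is appended to at every step and never truncated, hence "infinite".

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

class MultiStateRNNCell:
    """One attention head viewed as an RNN cell with an unbounded multi-state."""
    def __init__(self, d, seed=0):
        rng = np.random.default_rng(seed)
        self.Wq = rng.normal(size=(d, d))
        self.Wk = rng.normal(size=(d, d))
        self.Wv = rng.normal(size=(d, d))
        self.keys, self.values = [], []   # the "multi-state": one entry per token

    def step(self, x):
        q = self.Wq @ x
        self.keys.append(self.Wk @ x)     # state update: append, never forget
        self.values.append(self.Wv @ x)
        K, V = np.stack(self.keys), np.stack(self.values)
        attn = softmax(K @ q / np.sqrt(x.shape[0]))
        return attn @ V                   # output attends over the whole state

rng = np.random.default_rng(1)
cell = MultiStateRNNCell(d=16)
for _ in range(5):
    out = cell.step(rng.normal(size=16))
print(len(cell.keys))  # 5: the state grows with every token processed
```

Capping the number of stored key/value pairs turns this into a finite multi-state RNN, which is the view under which the paper analyzes KV-cache compression.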
AK (@_akhaliq)'s Twitter Profile Photo

Transformers are Multi-State RNNs

paper page: huggingface.co/papers/2401.06…

Transformers are considered conceptually different compared to the previous generation of state-of-the-art NLP models - recurrent neural networks (RNNs). In this work, we demonstrate that decoder-only
Michael Hassid (@michaelhassid)'s Twitter Profile Photo

Transformers outperform RNNs as they operate differently.
Do they?

Excited to share our new paper: “Transformers are Multi-State RNNs”

Paper: arxiv.org/abs/2401.06104
Code: github.com/schwartz-lab-N…

1/n
UKP Lab (@ukplab)'s Twitter Profile Photo

Stop complaining about the bad review quality. Join forces and start research on #NLProc for #PeerReview!

🚨 A new white paper by over 20 top AI and NLP researchers provides a thorough discussion of AI assistance for scientific quality control. (1/🧵)

📑 arxiv.org/abs/2405.06563
Michael Hassid (@michaelhassid)'s Twitter Profile Photo

A new version of “Transformers are Multi-State RNNs” is now on arxiv: arxiv.org/abs/2401.06104

What’s new?
Efficiency analysis of TOVA (our KV compression policy)
Extrapolation with TOVA

Details below >>

1/3
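
As a rough illustration of the policy (my sketch of the idea as described, not the released code): TOVA caps the multi-state at a fixed size and, on overflow, evicts the entry that received the lowest attention weight at the current decoding step.

```python
import numpy as np

def tova_step(keys, values, attn_weights, max_states):
    """Cap the multi-state at `max_states` entries by dropping the
    least-attended token; all other entries are kept untouched."""
    if len(keys) <= max_states:
        return keys, values
    drop = int(np.argmin(attn_weights))               # least-attended entry
    keep = [i for i in range(len(keys)) if i != drop]
    return [keys[i] for i in keep], [values[i] for i in keep]
```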

Michael Hassid (@michaelhassid)'s Twitter Profile Photo

Which is better, running a 70B model once, or a 7B model 10 times? The answer might be surprising!

Presenting our new Conference on Language Modeling paper: "The Larger the Better? Improved LLM Code-Generation via Budget Reallocation"

arxiv.org/abs/2404.00725

1/n
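
The question is a fixed-budget one: if generation cost scales roughly with parameter count, ten samples from a 7B model cost about as much as one from a 70B model, and code generation lets you rank candidates with unit tests. A hedged sketch of that comparison (`generate` and `passes_tests` are hypothetical stand-ins, not the paper's code):

```python
def best_of_n(generate, passes_tests, prompt, n):
    """Sample n candidate programs; return the first one that passes the tests."""
    for _ in range(n):
        candidate = generate(prompt)
        if passes_tests(candidate):
            return candidate
    return None  # all n attempts failed

# Rough accounting (assumption: generation cost scales with parameter count):
# ten 7B generations cost about one 70B generation, so
#   best_of_n(generate_7b, passes_tests, prompt, n=10)
# spends roughly the same budget as a single 70B sample.
```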
Michael Hassid (@michaelhassid)'s Twitter Profile Photo

"Transformers are Multi-State RNNs", and our KV compression policy "TOVA", got accepted to #EMNLP2024! 🎉

See you in Miami! :)

Paper: arxiv.org/abs/2401.06104

Guy Kaplan ✈️🇸🇬 ICLR2025 (@gkaplan38844)'s Twitter Profile Photo

📢Paper release📢 :

🔍 Ever wondered how LLMs understand words when all they see are tokens? 🧠

Our latest study uncovers how LLMs reconstruct full words from sub-word tokens, even when misspelled or previously unseen.

arxiv.org/pdf/2410.05864 (preprint)
👀 👇

[1/7]
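
For a sense of why this is nontrivial, a small illustration, assuming the Hugging Face transformers package is available (the exact splits below are indicative, not guaranteed):

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
print(tok.tokenize("unbelievable"))    # a few sub-word pieces
print(tok.tokenize("unbelieveable"))   # one typo -> a different split entirely
```

The model never sees the word as a unit, yet its internal representations still recover it; the paper studies where and how that reconstruction happens.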
ACL 2025 (@aclmeeting)'s Twitter Profile Photo

What should the ACL peer review process be like in the future? Please cast your views in this survey: aclweb.org/portal/content… by 4th Nov 2024 #NLProc ACLRollingReview

Anna Rogers (@annargrs)'s Twitter Profile Photo

📢📢 Dear #NLProc people with strong opinions on peer review & ARR in particular: this is the ACL survey you've been waiting for.

It covers core design of ARR, incl. the decoupling of acceptance reviews & decisions and length of review cycles. Don't say you were not asked! /1

Tamar Kolodny (@tamarkolodny)'s Twitter Profile Photo

It's been difficult to share good news from this part of the world. But it's long overdue - I am excited to share that I joined the Psychology Dept at Ben-Gurion University & the Azrieli National Centre for Autism and Neurodev.! Hooray for new endeavors, and in hopes of better times.

Amit Ben-Artzy (@amit_benartzy)'s Twitter Profile Photo

In which layers does information flow from previous tokens to the current token?

Presenting our new BlackboxNLP paper: “Attend First, Consolidate Later: On the Importance of Attention in Different LLM Layers”

arxiv.org/abs/2409.03621

1/n
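
A hedged sketch of this style of ablation; `model.layers` and the `skip_attention` flag are hypothetical, not the paper's actual API:

```python
def ablate_attention(model, layer_ids):
    """Disable attention to previous tokens in the given layers,
    leaving feed-forward computation intact (hypothetical interface)."""
    for i, layer in enumerate(model.layers):
        layer.skip_attention = i in layer_ids
    return model

# If tokens exchange information early ("attend first") and mostly refine it
# later ("consolidate later"), skipping attention in the top layers should
# hurt far less than skipping it in the bottom layers.
```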
Roy Schwartz (@royschwartznlp)'s Twitter Profile Photo

Looking for emergency reviewers for October ARR. If someone can complete a review *today* (Sunday, Nov. 24), please DM me 🙏 I have papers on efficiency, interpretability and speech.

Tamer (@tamerghattas911)'s Twitter Profile Photo

🚀 New Paper Drop! 🚀

“On Pruning SSM LLMs” – we study the prunability of Mamba🐍-based LLMs.

We also release Smol2-Mamba-1.9B, a Mamba-based LLM distilled from Smol2-1.7B, on 🤗: [huggingface.co/schwartz-lab/S…]

📖 Read more: [arxiv.org/abs/2502.18886]

<a href="/royschwartzNLP/">Roy Schwartz</a> <a href="/MichaelHassid/">Michael Hassid</a>
Guy Kaplan ✈️🇸🇬 ICLR2025 (@gkaplan38844)'s Twitter Profile Photo

✨ Ever tried generating an image from a prompt but ended up with unexpected outputs?

Check out our new paper #FollowTheFlow - tackling T2I issues like bias, failed binding, and leakage from the textual encoding side! 💼🔍

arxiv.org/pdf/2504.01137
guykap12.github.io/guykap12.githu…

🧵[1/7]

Guy Kaplan ✈️🇸🇬 ICLR2025 (@gkaplan38844)'s Twitter Profile Photo

Heading to ICLR 2025 ✈️🧩 ‘Tokens→Words’ shows how LLMs build full‑word representations from sub‑word tokens and offers a tool for vocab expansion. 🚀

See our #ICLR2025 poster ‑ 26.4, 15:00‑17:30.

📄 arxiv.org/abs/2410.05864
🔗 guykap12.github.io/FromTokens2Wor…

👇
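
The "tool for vocab expansion" suggests an interface like the sketch below. It uses the common mean-of-sub-token-embeddings initialization as a stand-in; the paper derives word representations from hidden states instead, so treat this purely as an illustration of the interface (PyTorch and transformers assumed):

```python
import torch

def add_word(tokenizer, embedding: torch.nn.Embedding, word: str):
    """Append `word` as a single new token, initialized from its sub-tokens."""
    sub_ids = tokenizer.encode(word, add_special_tokens=False)
    new_vec = embedding.weight.data[sub_ids].mean(dim=0, keepdim=True)
    embedding.weight = torch.nn.Parameter(
        torch.cat([embedding.weight.data, new_vec], dim=0))
    tokenizer.add_tokens([word])          # the new token gets the last row's id
    return embedding
```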
Michael Hassid (@michaelhassid)'s Twitter Profile Photo

The longer a reasoning LLM thinks, the more likely it is to be correct, right?

Apparently not.

Presenting our paper: “Don’t Overthink it. Preferring Shorter Thinking Chains for Improved LLM Reasoning”.

Link: arxiv.org/abs/2505.17813

1/n
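
A hedged sketch of the selection rule the title suggests (my reading; `generate_with_reasoning` is a hypothetical stand-in that returns a (thinking_text, final_answer) pair):

```python
def shortest_chain_answer(generate_with_reasoning, prompt, n_samples=5):
    """Sample several reasoning chains; answer with the shortest chain's output."""
    chains = [generate_with_reasoning(prompt) for _ in range(n_samples)]
    thinking, answer = min(chains, key=lambda c: len(c[0]))  # shortest thinking
    return answer
```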
Yair Brill (@yairbrill)'s Twitter Profile Photo

A month ago, Ari Rapoport, my mother's cousin, sent me a surprising email. "I don't know whether you are aware of my illness and of my scientific achievements," he opened. "I was diagnosed with small-cell lung cancer, one of the deadliest there is. I have a few months left... I am writing to you to ask for a science article in the Haaretz magazine - one that would no doubt interest many people."