Benjamin Minixhofer (@bminixhofer)'s Twitter Profile
Benjamin Minixhofer

@bminixhofer

PhD Student @CambridgeLTL

ID: 3492830254

Link: http://bmin.ai
Joined: 30-08-2015 16:29:31

414 Tweets

1.1K Followers

364 Following

Benjamin Minixhofer (@bminixhofer)

We created Approximate Likelihood Matching, a principled (and very effective) method for *cross-tokenizer distillation*!

With ALM, you can create ensembles of models from different families, convert existing subword-level models to byte-level, and a bunch more🧵
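The core obstacle in cross-tokenizer distillation is that two tokenizers segment the same text differently, so per-token likelihoods don't line up position by position. A toy sketch of the mismatch (greedy longest-match tokenizers with made-up vocabularies; illustrative only, not the ALM objective itself) — the chunks between boundaries that both tokenizers share are where likelihoods can be compared:

```python
def tokenize(text, vocab):
    """Greedy longest-match tokenization with a toy vocabulary."""
    tokens, i = [], 0
    while i < len(text):
        for j in range(len(text), i, -1):  # try longest match first
            if text[i:j] in vocab or j == i + 1:  # fall back to single char
                tokens.append(text[i:j])
                i = j
                break
    return tokens

def boundaries(tokens):
    """Character offsets where token boundaries fall."""
    out, pos = set(), 0
    for t in tokens:
        pos += len(t)
        out.add(pos)
    return out

# Two made-up vocabularies that segment the same word differently.
vocab_a = {"un", "believ", "able"}
vocab_b = {"unbe", "liev", "ab", "le"}

text = "unbelievable"
toks_a = tokenize(text, vocab_a)  # ['un', 'believ', 'able']
toks_b = tokenize(text, vocab_b)  # ['unbe', 'liev', 'ab', 'le']

# Boundaries shared by both tokenizers: the spans between these
# offsets are aligned chunks over which likelihoods can be matched.
shared = sorted(boundaries(toks_a) & boundaries(toks_b))  # [8, 12]
```

Matching likelihoods over such aligned chunks, rather than over incompatible token positions, is the kind of alignment problem cross-tokenizer distillation has to solve.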
Benjamin Minixhofer (@bminixhofer)

A couple of days ago, Markus Frohmann released a new version of wtpsplit which speeds up sentence segmentation by 4x-20x, thanks to a one-line change.

The beauty/horrors of Python.

github.com/segment-any-te…
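The linked commit isn't shown here, but as a generic illustration of how a one-line change can buy this kind of speedup in Python: replacing a Python-level loop with a single vectorized NumPy call (a hypothetical example, not wtpsplit's actual change):

```python
import numpy as np

rng = np.random.default_rng(0)
probs = rng.random((100_000, 2))  # e.g. per-position boundary probabilities

# Before: one interpreter-level argmax call per row.
labels_slow = [int(row.argmax()) for row in probs]

# After: the "one-line change" — a single vectorized call over the array.
labels_fast = probs.argmax(axis=1).tolist()

assert labels_slow == labels_fast  # same results, far fewer interpreter steps
```

The results are identical; the difference is purely where the loop runs (C vs. the Python interpreter), which is exactly the beauty/horror the tweet alludes to.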
Tokenization Workshop (TokShop) @ICML2025 (@tokshop2025)

🚨 NEW WORKSHOP ALERT 🚨

We're thrilled to announce the first-ever Tokenization Workshop (TokShop) at #ICML2025! 🎉

Submissions are open for work on tokenization across all areas of machine learning.

📅 Submission deadline: May 30, 2025
🔗 tokenization-workshop.github.io

Valentin Hofmann (@vjhofmann)

Delighted there will finally be a workshop devoted to tokenization - a critical topic for LLMs and beyond! 🎉 Join us for the inaugural edition of TokShop at #ICML2025 in Vancouver this summer! 🤗

Arduin Findeis @ ICLR2025 (@arduinfindeis)

How exactly was the initial Chatbot Arena version of Llama 4 Maverick different from the public HuggingFace version?🕵️

I used our Feedback Forensics app to quantitatively analyse how exactly these two models differ. An overview…👇🧵
Benjamin Minixhofer (@bminixhofer)

While you're waiting for BLT to be supported by HuggingFace, why not try our Llama and Gemma models transferred to byte-level tokenization🔢

They perform quite well even though they are trained on just 1.2B bytes (only ~330M subword tokens), and have HF transformers support now!
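The 1.2B-bytes vs. ~330M-subword-tokens figures imply roughly 3.6 bytes per subword token, typical for English-heavy tokenizers. A quick sketch of what byte-level tokenization means in practice — each UTF-8 byte becomes one token ID in a fixed 256-entry vocabulary (illustrative; not the exact ID scheme of the released models):

```python
text = "Tokenizers leak 🧵"

# Byte-level: token IDs are just the UTF-8 bytes (0-255), no learned vocab.
byte_ids = list(text.encode("utf-8"))

# ASCII characters cost 1 byte each; the emoji costs 4,
# so 17 characters become 20 byte tokens.
print(len(text), "characters ->", len(byte_ids), "byte tokens")

# Round-trip is lossless and vocabulary-free.
assert bytes(byte_ids).decode("utf-8") == text

# The tweet's numbers: 1.2e9 bytes / 330e6 subword tokens ≈ 3.6 bytes/token.
ratio = 1.2e9 / 330e6
```

The trade-off is sequence length: the same text needs ~3.6x more byte tokens than subword tokens, which is why byte-level models lean on longer contexts or architectural tricks.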
Edoardo Ponti (@pontiedoardo)

To appear at #NAACL2025 (2 orals, 1 poster)!

Coleman Haley: which classes of words are most grounded in (perceptual proxies of) meaning?
Uri Berger: how do image descriptions vary across languages and cultures?
Hanxu Hu: can LLMs follow sequential instructions?

🧵below

Piotr Nawrot (@p_nawrot)

Sparse attention is one of the most promising strategies to unlock long-context processing and long generation reasoning in LLMs.

We performed the most comprehensive study on training-free sparse attention to date.

Here is what we found:
Markus Frohmann (@frohmannm)

Wtpsplit, our text segmentation tool, just reached ⭐️1000 stars⭐️ on GitHub! Excited to see it is proving useful!

Check it out here: github.com/segment-any-te… 🎉
Piotr Nawrot (@p_nawrot)

We built sparse-frontier — a clean abstraction that lets you focus on your custom sparse attention implementation while automatically inheriting vLLM’s optimizations and model support.

As a PhD student, I've learned that sometimes the bottleneck in research isn't ideas — it's