Benjamin Minixhofer (@bminixhofer)'s Twitter Profile
Benjamin Minixhofer

@bminixhofer

PhD Student @CambridgeLTL

ID: 3492830254

Link: http://bmin.ai
Joined: 30-08-2015 16:29:31

414 Tweets

1.1K Followers

364 Following

Benjamin Minixhofer (@bminixhofer)

We created Approximate Likelihood Matching, a principled (and very effective) method for *cross-tokenizer distillation*!

With ALM, you can create ensembles of models from different families, convert existing subword-level models to byte-level, and a bunch more🧵
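The core obstacle in cross-tokenizer distillation is that two tokenizers segment the same text differently, so per-token likelihoods don't line up position by position. A toy sketch of the mismatch (greedy longest-match tokenizers with made-up vocabularies; illustrative only, not the ALM objective itself) — the chunks between boundaries that both tokenizers share are where likelihoods can be compared:

```python
def tokenize(text, vocab):
    """Greedy longest-match tokenization with a toy vocabulary."""
    tokens, i = [], 0
    while i < len(text):
        for j in range(len(text), i, -1):  # try longest match first
            if text[i:j] in vocab or j == i + 1:  # fall back to single char
                tokens.append(text[i:j])
                i = j
                break
    return tokens

def boundaries(tokens):
    """Character offsets where token boundaries fall."""
    out, pos = set(), 0
    for t in tokens:
        pos += len(t)
        out.add(pos)
    return out

# Two made-up vocabularies that segment the same word differently.
vocab_a = {"un", "believ", "able"}
vocab_b = {"unbe", "liev", "ab", "le"}

text = "unbelievable"
toks_a = tokenize(text, vocab_a)  # ['un', 'believ', 'able']
toks_b = tokenize(text, vocab_b)  # ['unbe', 'liev', 'ab', 'le']

# Boundaries shared by both tokenizers: the spans between these
# offsets are aligned chunks over which likelihoods can be matched.
shared = sorted(boundaries(toks_a) & boundaries(toks_b))  # [8, 12]
```

Matching likelihoods over such aligned chunks, rather than over incompatible token positions, is the kind of alignment problem cross-tokenizer distillation has to solve.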
Benjamin Minixhofer (@bminixhofer)

A couple of days ago, Markus Frohmann released a new version of wtpsplit which speeds up sentence segmentation by 4x-20x, thanks to a one-line change.

The beauty/horrors of Python.

github.com/segment-any-te…
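The linked commit isn't shown here, but as a generic illustration of how a one-line change can buy this kind of speedup in Python: replacing a Python-level loop with a single vectorized NumPy call (a hypothetical example, not wtpsplit's actual change):

```python
import numpy as np

rng = np.random.default_rng(0)
probs = rng.random((100_000, 2))  # e.g. per-position boundary probabilities

# Before: one interpreter-level argmax call per row.
labels_slow = [int(row.argmax()) for row in probs]

# After: the "one-line change" — a single vectorized call over the array.
labels_fast = probs.argmax(axis=1).tolist()

assert labels_slow == labels_fast  # same results, far fewer interpreter steps
```

The results are identical; the difference is purely where the loop runs (C vs. the Python interpreter), which is exactly the beauty/horror the tweet alludes to.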
Tokenization Workshop (TokShop) @ICML2025 (@tokshop2025)

🚨 NEW WORKSHOP ALERT 🚨

We're thrilled to announce the first-ever Tokenization Workshop (TokShop) at #ICML2025! 🎉

Submissions are open for work on tokenization across all areas of machine learning.

📅 Submission deadline: May 30, 2025
🔗 tokenization-workshop.github.io

Valentin Hofmann (@vjhofmann)

Delighted there will finally be a workshop devoted to tokenization - a critical topic for LLMs and beyond! 🎉 Join us for the inaugural edition of TokShop at #ICML2025 in Vancouver this summer! 🤗

Arduin Findeis @ ICLR2025 (@arduinfindeis)

How exactly was the initial Chatbot Arena version of Llama 4 Maverick different from the public HuggingFace version?🕵️

I used our Feedback Forensics app to quantitatively analyse how exactly these two models differ. An overview…👇🧵
Benjamin Minixhofer (@bminixhofer)

While you're waiting for BLT to be supported by HuggingFace, why not try our Llama and Gemma models transferred to byte-level tokenization🔢

They perform quite well even though they are trained on just 1.2B bytes (only ~330M subword tokens), and have HF transformers support now!
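The 1.2B-bytes vs. ~330M-subword-tokens figures imply roughly 3.6 bytes per subword token, typical for English-heavy tokenizers. A quick sketch of what byte-level tokenization means in practice — each UTF-8 byte becomes one token ID in a fixed 256-entry vocabulary (illustrative; not the exact ID scheme of the released models):

```python
text = "Tokenizers leak 🧵"

# Byte-level: token IDs are just the UTF-8 bytes (0-255), no learned vocab.
byte_ids = list(text.encode("utf-8"))

# ASCII characters cost 1 byte each; the emoji costs 4,
# so 17 characters become 20 byte tokens.
print(len(text), "characters ->", len(byte_ids), "byte tokens")

# Round-trip is lossless and vocabulary-free.
assert bytes(byte_ids).decode("utf-8") == text

# The tweet's numbers: 1.2e9 bytes / 330e6 subword tokens ≈ 3.6 bytes/token.
ratio = 1.2e9 / 330e6
```

The trade-off is sequence length: the same text needs ~3.6x more byte tokens than subword tokens, which is why byte-level models lean on longer contexts or architectural tricks.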
Edoardo Ponti (@pontiedoardo)

To appear at #NAACL2025 (2 orals, 1 poster)!

Coleman Haley: which classes of words are most grounded in (perceptual proxies of) meaning?
Uri Berger: how do image descriptions vary across languages and cultures?
Hanxu Hu: can LLMs follow sequential instructions?

🧵below

Piotr Nawrot (@p_nawrot)

Sparse attention is one of the most promising strategies to unlock long-context processing and long generation reasoning in LLMs.

We performed the most comprehensive study on training-free sparse attention to date.

Here is what we found:
Markus Frohmann (@frohmannm)

Wtpsplit, our text segmentation tool, just reached ⭐️1000 stars⭐️ on GitHub! Excited to see it is proving useful!

Check it out here: github.com/segment-any-te… 🎉
Piotr Nawrot (@p_nawrot)

We built sparse-frontier — a clean abstraction that lets you focus on your custom sparse attention implementation while automatically inheriting vLLM’s optimizations and model support.

As a PhD student, I've learned that sometimes the bottleneck in research isn't ideas — it's