Bram (@bramvanroy) 's Twitter Profile
Bram

@bramvanroy

@ku_leuven @ccl_kuleuven: Creative #NLG 🖋️
@ivdnt: Dutch #NLProc and #LLMs

Creator of Dutch LLMs 🤖

Fellow at @huggingface 🤗

Prev. @lt3ugent, @SignON

ID: 361306433

Link: https://bramvanroy.github.io/ · Joined: 24-08-2011 15:46:54

4.4K Tweets

1.1K Followers

814 Following

Dimitar Shterionov (@dshterionov) 's Twitter Profile Photo

Hey MT enthusiasts, 2nd Call for Workshop Proposals for EAMT 2026! Have an awesome idea related to new (and old) trends in MT you wish to discuss with amazing people? We are open to hosting your event! Submit by 05-11-2025: easychair.org/my/conference?… Info: eamt2026.org

Dimitar Shterionov (@dshterionov) 's Twitter Profile Photo

Dear MT enthusiasts, here comes the 2nd Call for Tutorial Proposals for EAMT 2026! Have something amazing you wish to demonstrate or show to the EAMT crowd? We are open to hosting your event! Check for more information: eamt2026.org/calls-for-tuto…

Guilherme Penedo (@gui_penedo) 's Twitter Profile Photo

New dataset release: 🌐FineWiki This is an updated and better extracted version of Wikipedia, covering 325+ languages. Unlike the old dataset from 2023, we kept all the math content, tables, properly rendered templates, and extracted key facts. Examples and highlights below.

Bram (@bramvanroy) 's Twitter Profile Photo

My tips for vLLM for offline, batched generation for things like "You are a rater. These are requirements [...]. Rate this text:": 1. tune max_num_batched_tokens/max seq; 2. use chunked prefill and prefix cache; 3. run one warmup with prefix; 4. sort by prompt length 🚀
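
The tips above can be sketched as follows. Tips 1–3 are vLLM engine settings, shown here as hedged comments since the exact values depend on your GPU and installed vLLM version; tip 4 (sort by prompt length) is implemented in plain Python so it works with any backend. The model name and token budget are placeholder assumptions, not from the tweet.

```python
# Shared rater prefix, as in the tweet's example prompt.
SHARED_PREFIX = "You are a rater. These are requirements [...]. Rate this text:\n"

def sort_for_batching(texts):
    """Sort texts by length so similarly sized prompts batch together (tip 4).

    Returns (sorted_texts, restore) where restore[i] is the position of the
    i-th original text in the sorted list, so results can be mapped back.
    """
    order = sorted(range(len(texts)), key=lambda i: len(texts[i]))
    restore = [0] * len(texts)
    for new_pos, old_pos in enumerate(order):
        restore[old_pos] = new_pos
    return [texts[i] for i in order], restore

# Hypothetical vLLM setup for tips 1-3; parameter names follow vLLM's
# public engine args, but verify them against your installed version:
#
#   from vllm import LLM, SamplingParams
#   llm = LLM(
#       model="...",                   # your rater model
#       max_num_batched_tokens=8192,   # tip 1: tune for your GPU
#       enable_chunked_prefill=True,   # tip 2: chunked prefill
#       enable_prefix_caching=True,    # tip 2: cache the shared rater prefix
#   )
#   llm.generate([SHARED_PREFIX + "warmup"], SamplingParams(max_tokens=1))  # tip 3
#
# Then generate over sorted prompts and use `restore` to recover order:
texts = ["a much longer text to be rated by the model", "short", "medium text"]
sorted_texts, restore = sort_for_batching(texts)
# outputs[restore[i]] would correspond to texts[i] after generation.
```

Sorting by length keeps each batch's padding/scheduling overhead low, which matters most when prompt lengths vary widely.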

Bram (@bramvanroy) 's Twitter Profile Photo

LinkedIn with the hot (toxic) takes again. Do not forget, everyone: you work to (pay to) live, not the other way around!

Shayne Longpre (@shayneredford) 's Twitter Profile Photo

📢Thrilled to introduce ATLAS 🗺️: scaling laws beyond English, for pretraining, finetuning, and the curse of multilinguality. The largest public, multilingual scaling study to-date—we ran 774 exps (10M-8B params, 400+ languages) to answer: 🌍Are scaling laws different by

Shayne Longpre (@shayneredford) 's Twitter Profile Photo

Q4: When should you pretrain from scratch vs finetune a multilingual checkpoint? 🌟Answer: We found compute-optimal crossover points for every model size. Rough rule of thumb: finetune if your compute budget C is < 10^10 x N ^1.54, otherwise pretrain. 8/

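
The rule of thumb above can be checked with a short worked example. The formula is taken directly from the tweet (C in FLOPs, N the parameter count); the 1B-parameter figure below is an illustrative assumption.

```python
def should_finetune(compute_budget_flops, n_params):
    """Rule of thumb from the ATLAS thread: finetune a multilingual
    checkpoint if C < 10^10 * N^1.54, otherwise pretrain from scratch."""
    crossover = 1e10 * n_params ** 1.54
    return compute_budget_flops < crossover

# Worked example: for a 1B-parameter model,
# crossover = 1e10 * (1e9)**1.54 ≈ 7.2e23 FLOPs.
print(should_finetune(1e20, 1e9))  # small budget -> finetune
print(should_finetune(1e24, 1e9))  # large budget -> pretrain
```
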
Bram (@bramvanroy) 's Twitter Profile Photo

Took an image I saw on X and asked #GPT5 whether the code was correct, biasing it slightly with a distractor hint. And yes - it starts out by saying "that's incorrect" but ends with "that's correct", without admitting fault. Very interesting to see this: chatgpt.com/s/t_690366ef38…

Bram (@bramvanroy) 's Twitter Profile Photo

These days, I am mostly using ~30B models in FP8 to fit well on a 48GB card. Shout out to Red Hat AI for pushing out many FP8 versions of popular models (and other quants)!
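
A quick back-of-the-envelope check of why ~30B FP8 models fit on a 48GB card. The numbers below are assumptions for illustration (FP8 stores one byte per parameter; KV cache and runtime overhead are not from the tweet).

```python
# FP8 = 1 byte per parameter, so weights alone for a 30B model:
params = 30e9
bytes_per_param = 1          # FP8
weight_gb = params * bytes_per_param / 1e9   # 30.0 GB of weights

vram_gb = 48
headroom_gb = vram_gb - weight_gb            # ~18 GB left for KV cache,
                                             # activations, and runtime overhead
print(weight_gb, headroom_gb)
```

By contrast, the same model in BF16 (2 bytes/param) would need ~60 GB for weights alone, which already exceeds the card.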

Bram (@bramvanroy) 's Twitter Profile Photo

Does anyone know anyone at the Mistral AI team? For some reason they do not have the chat template in the tokenizer config, which makes it tedious to use in typical pipelines. Chat template is in their Processor class but not in Tokenizer. huggingface.co/mistralai/Mist…
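
A minimal workaround sketch for the issue above: copy the chat template from the Processor onto the Tokenizer so pipelines that call `tokenizer.apply_chat_template` keep working. The helper itself is generic duck-typed Python; the commented transformers usage is an assumption (the mistralai repo name is truncated in the tweet and left as-is).

```python
def ensure_chat_template(tokenizer, processor):
    """Copy processor.chat_template onto the tokenizer if it is missing.

    Leaves an existing tokenizer.chat_template untouched.
    """
    if getattr(tokenizer, "chat_template", None) is None:
        tokenizer.chat_template = processor.chat_template
    return tokenizer

# With Hugging Face transformers this would look roughly like:
#   from transformers import AutoProcessor, AutoTokenizer
#   processor = AutoProcessor.from_pretrained("mistralai/Mist…")  # truncated in tweet
#   tokenizer = AutoTokenizer.from_pretrained("mistralai/Mist…")
#   ensure_chat_template(tokenizer, processor)
```
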

Dimitar Shterionov (@dshterionov) 's Twitter Profile Photo

!DEADLINE EXTENSION! We have decided to extend the EAMT 2026 workshop proposal submission deadline by a week! New deadline: 12 November 2025! You can find the call for papers and read more on our website: eamt2026.org Submission link: easychair.org/my/conference?…