Pierre Colombo (@pierrecolombo6)'s Twitter Profile
Pierre Colombo

@pierrecolombo6

Associate Professor at CentraleSupelec (Paris Saclay) - CSO equall.ai - NLP/Law

ID: 1316746180085403655

Link: https://pierrecolombo.github.io/
Joined: 15-10-2020 14:22:18

604 Tweets

513 Followers

1.1K Following

Duarte Alves (@duartemralves)'s Twitter Profile Photo


🚀 Excited to announce EuroBERT: a new multilingual encoder model family for European & global languages! 🌍

🔹 EuroBERT is trained on a massive 5 trillion-token dataset across 15 languages and includes recent architecture advances such as GQA, RoPE & RMSNorm.
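
Those architecture claims should be visible directly in the released config. A minimal sketch, assuming the checkpoints live under an EuroBERT org on the Hugging Face Hub and ship custom modelling code (hence `trust_remote_code=True`); the Llama-style field names are an assumption and may differ in the actual config:

```python
# Hedged sketch: inspect the config for GQA and RoPE settings.
# "EuroBERT/EuroBERT-210m" is an assumed Hub id, not confirmed by the thread.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("EuroBERT/EuroBERT-210m", trust_remote_code=True)

# GQA shows up as fewer key/value heads than attention (query) heads.
print(getattr(config, "num_attention_heads", None),
      getattr(config, "num_key_value_heads", None))
# RoPE is usually parameterised by a base frequency such as rope_theta.
print(getattr(config, "rope_theta", None))
```
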
Duarte Alves (@duartemralves)'s Twitter Profile Photo

🧵 (3/7) 🌍 EuroBERT is open-source:
👉 Models (210M, 610M, 2.1B params)
👉 Training snapshots
👉 Full training framework
Explore here: huggingface.co/EuroBERT
Code coming soon! github.com/Nicolas-BZRD/E…
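
Loading any of the three sizes is a one-liner; a hedged sketch, where the exact checkpoint names under huggingface.co/EuroBERT are assumptions:

```python
# Pull a checkpoint and verify the advertised parameter count.
# "EuroBERT/EuroBERT-210m" (and "...-610m", "...-2.1B") are assumed Hub ids.
from transformers import AutoModel, AutoTokenizer

name = "EuroBERT/EuroBERT-210m"
tokenizer = AutoTokenizer.from_pretrained(name, trust_remote_code=True)
model = AutoModel.from_pretrained(name, trust_remote_code=True)
print(f"{sum(p.numel() for p in model.parameters()):,} parameters")  # roughly 210M
```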

Duarte Alves (@duartemralves)'s Twitter Profile Photo

🧵 (7/7) 📖 Check out our blog post for more insights: huggingface.co/blog/EuroBERT/… 📄 Read more in our paper: arxiv.org/abs/2503.05500

Manuel Faysse (@manuelfaysse)'s Twitter Profile Photo

🚨 Introducing EuroBERT, a family of multilingual encoder models (210M to 2.1B params) trained on 5T tokens with an 8,192-token sequence length and all the modern bells and whistles! It's open source, and hopefully the perfect base model for training multilingual embeddings! (1/N)

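To illustrate the embedding use case the tweet hopes for, a minimal sketch: encode a small multilingual batch and mean-pool the last hidden state over non-padding tokens. The Hub id is an assumption, and mean pooling is a common default here, not the authors' prescription:

```python
# Hedged sketch: multilingual sentence embeddings from the base encoder.
import torch
from transformers import AutoModel, AutoTokenizer

name = "EuroBERT/EuroBERT-210m"  # assumed Hub id
tokenizer = AutoTokenizer.from_pretrained(name, trust_remote_code=True)
model = AutoModel.from_pretrained(name, trust_remote_code=True)

sentences = ["Ein Satz auf Deutsch.", "A sentence in English."]
batch = tokenizer(sentences, padding=True, return_tensors="pt")
with torch.no_grad():
    hidden = model(**batch).last_hidden_state       # (batch, seq, dim)
mask = batch["attention_mask"].unsqueeze(-1)        # ignore padding positions
embeddings = (hidden * mask).sum(1) / mask.sum(1)   # mean pooling
print(embeddings.shape)
```
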
Antoine Chaffin (@antoine_chaffin)'s Twitter Profile Photo

More encoder upgrades, and multilingual this time (so no excuse not to try it)! Great work from the team; I have been in some discussions regarding these models and was really looking forward to the release! 🚀 Congratulations Nicolas Boizard and Manuel Faysse!

Benjamin Clavié (@bclavie)'s Twitter Profile Photo

More BERTs for the Modern era. This is super exciting; encoders are no longer dead 😄 The coolest aspect: with so many new proofs that encoders are small-param-count beasts, this'll hopefully spark a lot more research into making them even better in creative ways...

tomaarsen (@tomaarsen)'s Twitter Profile Photo


An assembly of 18 European companies, labs, and universities has banded together to launch 🇪🇺 EuroBERT!

It's a state-of-the-art multilingual encoder for 15 European languages, designed to be finetuned for retrieval, classification, etc.

Details in 🧵
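
For the classification side, a hedged sketch, assuming the released modelling code registers a sequence-classification head usable through the Auto classes:

```python
# Hedged sketch: wrap the encoder with a classification head and finetune.
# Model id and label count are placeholders.
from transformers import AutoModelForSequenceClassification, AutoTokenizer

name = "EuroBERT/EuroBERT-210m"  # assumed Hub id
tokenizer = AutoTokenizer.from_pretrained(name, trust_remote_code=True)
model = AutoModelForSequenceClassification.from_pretrained(
    name, num_labels=3, trust_remote_code=True  # 3 labels as a placeholder
)
batch = tokenizer(["Ceci est un exemple."], return_tensors="pt")
print(model(**batch).logits.shape)  # (1, 3); train with your usual loop or Trainer
```
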
Fanny Jourdan (@fannyjrd_)'s Twitter Profile Photo


EuroBERT is out and it's insane! 🇪🇺
It's the most powerful multilingual encoder model family, at SOTA across a wide range of tasks: retrieval, classification, regression, maths, and code.
3 sizes: 210M, 610M, and 2.1B parameters, with support for sequence lengths up to 8,192 tokens. 📖 ⤵️
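
The 8,192-token context is easy to exercise directly; a sketch under the same (assumed) Hub ids:

```python
# Hedged sketch: encode a long document up to the advertised 8,192 tokens.
import torch
from transformers import AutoModel, AutoTokenizer

name = "EuroBERT/EuroBERT-210m"  # assumed Hub id
tokenizer = AutoTokenizer.from_pretrained(name, trust_remote_code=True)
model = AutoModel.from_pretrained(name, trust_remote_code=True)

long_text = " ".join(["word"] * 20000)  # stand-in for a long document
batch = tokenizer(long_text, truncation=True, max_length=8192, return_tensors="pt")
with torch.no_grad():
    out = model(**batch)
print(batch["input_ids"].shape, out.last_hidden_state.shape)
```
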
Igor Tica (@itica007)'s Twitter Profile Photo


[EuroBERT: Multilingual Embedding Model] 🔥

There is a new open embedding model in town: EuroBERT claims superior performance across a diverse set of benchmarks, spanning multilingual capabilities, mathematics, and coding.

It even outperforms ModernBERT on code and math.
Peter Sarlin (@petersarlin)'s Twitter Profile Photo


Not just yet another LLM 🇪🇺 EuroBERT provides open multilingual encoder models to power retrieval, classification & embeddings across 15 languages. But it is yet another model trained on AMD compute platforms. 🚀

European companies, labs, and universities have made this come…
Nicolas Boizard (@n1colais)'s Twitter Profile Photo


Great to see the community using EuroBERT! As hoped, it's proving to be an excellent foundation model, especially for information retrieval tasks across multiple languages after just one epoch of finetuning. Check it out: huggingface.co/Omartificial-I…
Eng.Omar (@Engomar_10)
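
The one-epoch retrieval finetuning mentioned here maps naturally onto sentence-transformers. A hedged sketch with placeholder training pairs (the linked dataset above is truncated, so nothing below refers to it):

```python
# Hedged sketch: one epoch of retrieval finetuning on (query, positive) pairs.
from sentence_transformers import InputExample, SentenceTransformer, losses
from torch.utils.data import DataLoader

model = SentenceTransformer("EuroBERT/EuroBERT-210m", trust_remote_code=True)  # assumed id

train_examples = [  # illustrative pairs; in practice, load a real dataset
    InputExample(texts=["what is the capital of France?", "Paris is the capital of France."]),
    InputExample(texts=["ما هي عاصمة مصر؟", "القاهرة هي عاصمة مصر."]),
]
loader = DataLoader(train_examples, shuffle=True, batch_size=2)
loss = losses.MultipleNegativesRankingLoss(model)  # in-batch negatives
model.fit(train_objectives=[(loader, loss)], epochs=1)  # single epoch, as above
```
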
Manuel Faysse (@manuelfaysse)'s Twitter Profile Photo

🚨 We are moving Visual Document Retrieval Evaluation to MTEB! Starting today, ViDoRe V1 and V2, but soon joined by many other benchmarks, will benefit from first-class support in MTEB, enabling adding models and tasks more easily and more collaboratively! More in 🧵 (1/N)

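For context, generic MTEB usage looks like this today; the filter below is a plain text-retrieval smoke test, not the ViDoRe visual-document workflow, and the model id is just an example embedding model:

```python
# Hedged sketch: run a single MTEB retrieval task as a smoke test.
import mteb
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("intfloat/multilingual-e5-small")  # any embedding model
tasks = mteb.get_tasks(task_types=["Retrieval"], languages=["eng"])
evaluation = mteb.MTEB(tasks=list(tasks)[:1])  # one task keeps the run short
results = evaluation.run(model, output_folder="results")
print(results)
```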