Belen Alastruey (@b_alastruey)'s Twitter Profile

PhD student @AIatMeta & @PSL_univ. Previously: @amazon Alexa, @apple MT, @mtupc1

ID: 1455258712961167361

Joined: 01-11-2021 19:41:45

85 Tweets

753 Followers

336 Following

Belen Alastruey (@b_alastruey):

Happy to share Linguini🍝, a benchmark to evaluate linguistic reasoning in LLMs without relying on prior language-specific knowledge. We show the task is still hard for SOTA models, which achieve below 25% accuracy. 📄: arxiv.org/pdf/2409.12126

João Maria Janeiro (@joaomjaneiro):

Last week we released the first paper of my PhD, "MEXMA: Token-level objectives improve sentence representations". We present a novel cross-lingual sentence encoder (CLSE) trained with both token and sentence-level objectives. Paper: arxiv.org/abs/2409.12737 1/n

Belen Alastruey (@b_alastruey):

Happy to share our team's work on Large Concept Models (LCMs), a new approach for language modeling that goes beyond standard token-based LLMs by operating in a multilingual and multimodal embedding space. Check out the full paper! 📄: ai.meta.com/research/publi…

Eduardo Sánchez (@eduardosg_ai):

Happy to see that Linguini, our benchmark for language-agnostic linguistic reasoning, has been included in DeepMind’s BIG-Bench Extra Hard (BBEH). Linguini remains challenging for reasoning models, being one of only two (hard) tasks where o3-mini doesn't show massive gains.