Acyr Locatelli (@acyr_l) 's Twitter Profile
Acyr Locatelli

@acyr_l

Lead pre-training @Cohere

ID: 294045100

Joined: 06-05-2011 12:47:58

84 Tweets

563 Followers

870 Following

Cohere Labs (@cohere_labs) 's Twitter Profile Photo

Introducing ✨Aya Expanse ✨ – an open-weights state-of-the-art family of models to help close the language gap with AI. Aya Expanse is both global and local. Driven by a multi-year commitment to multilingual research. cohere.com/research/aya

Max Bartolo (@max_nlp) 's Twitter Profile Photo

Our Command R+ model is one of TIME's 200 Best Inventions of 2024! 🚀 Try it out at coral.cohere.com 🌐 time.com/collection/bes…

Laura Ruis (@lauraruis) 's Twitter Profile Photo

How do LLMs learn to reason from data? Are they ~retrieving the answers from parametric knowledge🦜? In our new preprint, we look at the pretraining data and find evidence against this: Procedural knowledge in pretraining drives LLM reasoning ⚙️🔢 🧵⬇️

Cohere Labs (@cohere_labs) 's Twitter Profile Photo

A moment for @Cohere and Cohere For AI team appreciation. 💙 #NeurIPS2024 - stop by the booth to catch up with our team, or find us throughout the conference.

cohere (@cohere) 's Twitter Profile Photo

Introducing Command R7B: the smallest, fastest, and final model in our R series of enterprise-focused LLMs! It delivers a powerful combination of state-of-the-art performance in its class and efficiency to lower the cost of building AI applications. cohere.com/blog/command-r…

Acyr Locatelli (@acyr_l) 's Twitter Profile Photo

I'm hiring performance engineers for the pre-training team at Cohere. If you enjoy writing efficient kernels and working on hardware-aligned architecture design and optimisation, do reach out! Check out the live job posting here: jobs.ashbyhq.com/cohere/d42f5fd…

AK (@_akhaliq) 's Twitter Profile Photo

Cohere releases Command A on Hugging Face Command A is an open-weights 111B parameter model with a 256k context window focused on delivering great performance across agentic, multilingual, and coding use cases. Command A is on par or better than models like GPT-4o and Deepseek

lmarena.ai (formerly lmsys.org) (@lmarena_ai) 's Twitter Profile Photo

🚀 Big news: cohere's latest Command A now climbs to #13 on Arena!

Another organization joining the top-15 club - congrats to the Cohere team!

Highlights:
- open-weight model (111B)
- 256K context window
- $2.5/$10 input/output MTok

More analysis👇
Cohere Labs (@cohere_labs) 's Twitter Profile Photo

Excited to announce that @Cohere and Cohere Labs models are the first supported inference provider on Hugging Face Hub! 🔥 Looking forward to this new avenue for sharing and serving our models, including the Aya family and Command suite of models.

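For context, a minimal sketch of what calling a Cohere model through the Hugging Face Hub inference client might look like; the provider string and model ID below are illustrative assumptions, not details taken from the announcement.

# Sketch only: assumes huggingface_hub with inference-provider support installed
# and an HF token configured. Provider name and model ID are assumptions.
from huggingface_hub import InferenceClient

client = InferenceClient(provider="cohere")  # assumed provider identifier

response = client.chat_completion(
    model="CohereLabs/c4ai-command-a-03-2025",  # assumed Hub model ID for Command A
    messages=[{"role": "user", "content": "Summarise the Aya Expanse release in one sentence."}],
    max_tokens=128,
)
print(response.choices[0].message.content)
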
Nando de Freitas (@nandodf) 's Twitter Profile Photo

RL is not all you need, nor attention nor Bayesianism nor free energy minimisation, nor an age of first person experience. Such statements are propaganda. You need thousands of people working hard on data pipelines, scaling infrastructure, HPC, apps with feedback to drive

Sara Hooker (@sarahookr) 's Twitter Profile Photo

Very proud of this work, which is being presented at ICLR 2026 later today. While I will not be there, catch up with Viraat Aryabumi and Ahmet Üstün, who are both fantastic and can share more about our work at both Cohere Labs and cohere. 🔥✨
