Felipe Cruz-Salinas (@fffffelipec)'s Twitter Profile
Felipe Cruz-Salinas

@fffffelipec

Pre-training @cohere

ID: 1731942561298669568

Link: https://afcruzs.github.io/ | Joined: 05-12-2023 07:44:30

127 Tweets

192 Followers

490 Following

Cohere Labs (@cohere_labs)'s Twitter Profile Photo

Introducing ✨ Aya Vision ✨ - an open-weights model to connect our world through language and vision. Aya Vision adds breakthrough multimodal capabilities to our state-of-the-art multilingual 8B and 32B models. 🌿

Command A(idan) (@aidangomez)'s Twitter Profile Photo

Today cohere is very excited to introduce Command A, our new model succeeding Command R+. Command A is an open-weights 111B parameter model with a 256k context window focused on delivering great performance across agentic, multilingual, and coding use cases. 🧵

Felipe Cruz-Salinas (@fffffelipec)'s Twitter Profile Photo

This is what we've been hard at work on for the last few months :) Command A is great at long context (256k easily), multilinguality, and throughput overall. Pre-training the base model and all the work leading up to that was super rewarding. I'm very happy it's out now 😌

lmarena.ai (formerly lmsys.org) (@lmarena_ai)'s Twitter Profile Photo

🚀 Big news: cohere's latest Command A now climbs to #13 on Arena!

Another organization joining the top-15 club - congrats to the Cohere team!

Highlights:
- open-weight model (111B)
- 256K context window
- $2.5/$10 input/output MTok

More analysis👇
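Taking the listed rates at face value ($2.5 per million input tokens, $10 per million output tokens), here is a quick back-of-the-envelope cost check; the request size below is made up purely for illustration:

```python
# Rough cost estimate at the listed per-million-token rates.
input_rate, output_rate = 2.5, 10.0            # USD per 1M tokens (from the post above)
input_tokens, output_tokens = 100_000, 20_000  # hypothetical request size

cost = input_tokens / 1e6 * input_rate + output_tokens / 1e6 * output_rate
print(f"${cost:.2f}")  # -> $0.45
```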
cohere (@cohere)'s Twitter Profile Photo

We’re redefining what’s possible with AI. With the release of our latest model, Command A, optimized for real-world agentic and multilingual tasks, we’re demonstrating our commitment to bringing enterprises AI that goes beyond the ordinary and offers security & efficiency.

Marzieh Fadaee (@mziizm)'s Twitter Profile Photo

1/ Science is only as strong as the benchmarks it relies on.

So how fair—and scientifically rigorous—is today’s most widely used evaluation benchmark?

We took a deep dive into Chatbot Arena to find out. 🧵
Irem Ergün (@irombie)'s Twitter Profile Photo

I'm excited to share our new pre-print, ShiQ: Bringing back Bellman to LLMs! arxiv.org/abs/2505.11081

In this work, we propose a new, Q-learning-inspired RL algorithm for finetuning LLMs 🎉 (1/n)
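(Not from the paper: purely to fix ideas about what "bringing Bellman back" to LLM finetuning can look like, here is a generic, minimal TD-style objective over per-token value estimates. The function, shapes, and terminal-reward placement are illustrative assumptions, not the ShiQ algorithm itself.)

```python
import torch
import torch.nn.functional as F

# Generic, minimal sketch of a Bellman/TD-style loss over per-token value
# estimates for one sampled completion. NOT the ShiQ algorithm; names,
# shapes, and the terminal-reward placement are illustrative assumptions.
def td_loss(q_values: torch.Tensor, reward: float, gamma: float = 1.0) -> torch.Tensor:
    """q_values: (T,) value estimates, one per generated token."""
    targets = torch.empty_like(q_values)
    targets[:-1] = gamma * q_values[1:].detach()  # bootstrap from the next token's value
    targets[-1] = reward                          # sequence-level reward at the last token
    return F.mse_loss(q_values, targets)

q = torch.randn(8, requires_grad=True)  # toy per-token value estimates
loss = td_loss(q, reward=1.0)
loss.backward()
```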

Sander Land (@magikarp_tokens)'s Twitter Profile Photo

🔠 UTF-8 was never meant for language models.
Yet every major tokenizer still uses it, creating unfair "byte premiums".
Why should your native script cost more to tokenize? It's time for a change. 🧵👇
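To make the "byte premium" concrete (a toy illustration of my own, not from the thread): under UTF-8, comparable short phrases in different scripts encode to very different byte counts, so byte-based tokenizers charge some scripts more from the start.

```python
# Toy illustration of UTF-8 "byte premiums": similar short greetings cost
# very different numbers of bytes depending on the script.
greetings = {
    "English":  "hello",
    "Greek":    "γεια σου",
    "Hindi":    "नमस्ते",
    "Japanese": "こんにちは",
}

for lang, text in greetings.items():
    chars = len(text)
    nbytes = len(text.encode("utf-8"))
    print(f"{lang:9s} {chars} chars -> {nbytes} bytes ({nbytes / chars:.1f} bytes/char)")
```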
Cohere Labs (@cohere_labs)'s Twitter Profile Photo

How can we make language models more flexible to adapt to new languages after pretraining? 🌏

🧠 Our latest work investigates whether a tokenizer trained on more languages than the pretraining target can improve language plasticity without compromising pretraining performance.
Diana Abagyan (@dianaabagyan)'s Twitter Profile Photo

A huge thank you to all of my mentors and collaborators, especially Ahmet Üstün, Sara Hooker, Alejandro, and Marzieh Fadaee for their guidance and support ✨

📜 Check out our paper! arxiv.org/abs/2506.10766

Felipe Cruz-Salinas (@fffffelipec)'s Twitter Profile Photo

This is very cool. One of the reasons I think muP hasn't caught on is that it is not seamlessly integrated with torch. Optax can make some things annoying, but this one is nice :)
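For context on the torch-integration point: muP typically asks for width-dependent learning-rate (and init) scaling on hidden-layer weights, which in plain PyTorch usually means hand-building optimizer parameter groups. A rough sketch under simplified assumptions (the toy model, the "hidden" heuristic, and the single LR-scaling rule are all mine, not a full muP implementation):

```python
import torch

# Toy model for the sketch: treat the interior Linear layers as "hidden"
# (muP-scaled) and the embedding / output layers as "non-hidden".
model = torch.nn.Sequential(
    torch.nn.Embedding(1000, 256),
    torch.nn.Linear(256, 256),
    torch.nn.Linear(256, 256),
    torch.nn.Linear(256, 1000),
)

base_width, width = 64, 256
base_lr = 1e-3
width_mult = width / base_width   # how much wider the model is than the tuning proxy
hidden_lr = base_lr / width_mult  # simplified muP rule: hidden-layer LR shrinks with width

hidden, other = [], []
for name, p in model.named_parameters():
    # Crude heuristic for illustration: the interior layers' weight matrices are "hidden".
    (hidden if name in {"1.weight", "2.weight"} else other).append(p)

# Manual parameter groups: this per-group wiring is the kind of plumbing
# that makes muP feel bolted-on rather than built-in.
optimizer = torch.optim.AdamW([
    {"params": other,  "lr": base_lr},
    {"params": hidden, "lr": hidden_lr},
])
```

In Optax the analogous plumbing is to partition parameters by label and apply differently scaled transforms to each group, which is roughly the kind of integration the tweet is reacting to.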