AI21 Labs (@AI21Labs)'s Twitter Profile
AI21 Labs

@AI21Labs

AI21 Labs builds Foundation Models and AI Systems for the enterprise that accelerate the use of GenAI in production.

🥂Meet Jamba
https://t.co/xUBjKZHKVH

ID: 1166332664569368576

Link: http://www.ai21.com · Joined: 27-08-2019 12:52:51

259 Tweets

6.2K Followers

90 Following

NYSE 🏛 (@NYSE)

Ori Goshen, Co-Founder and Co-CEO of AI21 Labs, talks about growth opportunities for the company following a $208 million Series C funding round, and shares his perspective on the future of AI with Judy Khan Shaw.

AI21 Labs (@AI21Labs)

Building a RAG solution is easy. Building a great one is not.

In our guest blog on Streamlit, our team explores the intricacies of how AI21's Contextual Answers Task-Specific Model & our RAG Engine generate context-based answers grounded in your proprietary organizational data.…
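For a concrete feel of what a context-grounded call looks like, here is a minimal sketch of hitting a Contextual Answers style endpoint over plain HTTP. The endpoint path, request fields, and response shape are written from memory of AI21 Studio's public docs and should be treated as assumptions to verify against the current API reference; the document text and question are invented.

```python
# Hedged sketch: asking a question that must be answered only from the
# supplied context. URL and field names are assumptions, not verified.
import os

import requests

API_KEY = os.environ["AI21_API_KEY"]  # your AI21 Studio API key

context = (
    "Acme Corp travel policy: economy class is required for flights under "
    "six hours; business class may be booked for longer flights."
)
question = "Can I book business class for a four-hour flight?"

resp = requests.post(
    "https://api.ai21.com/studio/v1/answer",          # assumed endpoint path
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"context": context, "question": question},  # answer is grounded in `context`
    timeout=30,
)
resp.raise_for_status()
print(resp.json().get("answer"))  # expected to be empty/None if the context has no answer
```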

meng shao (@shao__meng)

The Jamba paper: A Hybrid Transformer-Mamba Language Model

Jamba is a novel architecture that combines Attention and Mamba layers with MoE modules, together with an open implementation, reaching state-of-the-art performance and supporting long contexts.

We showed how Jamba provides…

AK (@_akhaliq)

Jamba

A Hybrid Transformer-Mamba Language Model

We present Jamba, a new base large language model based on a novel hybrid Transformer-Mamba mixture-of-experts (MoE) architecture. Specifically, Jamba interleaves blocks of Transformer and Mamba layers, enjoying the benefits of…

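To make the interleaving concrete, here is a small, purely illustrative Python sketch of the layer layout such a hybrid stack implies. The ratios used (8 layers per block, 1 attention layer to 7 Mamba layers, MoE in place of the dense MLP on every other layer, 16 experts with top-2 routing) are the figures reported for the released Jamba checkpoint; the exact position of the attention layer inside a block is an assumption made here for illustration.

```python
# Illustrative layout plan only -- not AI21's implementation.
LAYERS_PER_BLOCK = 8   # layers in one Jamba block
NUM_BLOCKS = 4         # blocks stacked in the released model
ATTENTION_INDEX = 4    # assumed slot of the single attention layer per block
MOE_EVERY = 2          # MoE replaces the dense MLP on every other layer


def jamba_layer_plan():
    """Label every layer in the stack with its (mixer, mlp) combination."""
    plan = []
    for _ in range(NUM_BLOCKS):
        for i in range(LAYERS_PER_BLOCK):
            mixer = "attention" if i == ATTENTION_INDEX else "mamba"
            mlp = "moe(16 experts, top-2)" if i % MOE_EVERY == 1 else "dense-mlp"
            plan.append((mixer, mlp))
    return plan


if __name__ == "__main__":
    for idx, (mixer, mlp) in enumerate(jamba_layer_plan()):
        print(f"layer {idx:02d}: {mixer:9s} + {mlp}")
```

Printed out, the plan makes the key property visible: most layers are Mamba (constant-size state, so long contexts stay cheap), with an occasional attention layer and roughly half the MLPs swapped for sparse MoE capacity.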
Philipp Schmid (@_philschmid)

Last week, AI21 Labs released the production-scale Mamba implementation, and today they released their paper. 🧐
Jamba introduces a new hybrid Transformer-Mamba mixture-of-experts architecture offering state-of-the-art performance but with significant improvements on long…

Philipp Schmid (@_philschmid)

Yesterday AI21 Labs released Jamba, the first production-scale Mamba implementation, as a hybrid SSM-Transformer MoE 🐍 And today, you can already fine-tune it with Hugging Face TRL.

Alexander Doria shared a working QLoRA script using 4-bit quantization on an A100 GPU (for now,…

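As a rough sketch of what that TRL + QLoRA setup can look like, the snippet below loads the public ai21labs/Jamba-v0.1 checkpoint in 4-bit, attaches a small LoRA adapter, and runs TRL's SFTTrainer on a tiny stand-in dataset. The LoRA target modules, hyperparameters, and demo dataset are assumptions chosen for illustration, not the script referenced in the tweet, and the SFTTrainer keyword arguments follow the TRL API current at the time.

```python
# Hedged sketch of QLoRA fine-tuning Jamba with Hugging Face TRL.
# Assumptions: ~80GB of GPU memory (e.g. an A100), and that the listed
# target_modules match the attention projections in the HF Jamba port.
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from trl import SFTTrainer

model_id = "ai21labs/Jamba-v0.1"

# 4-bit NF4 quantization so the 52B-parameter MoE fits on a single card
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,  # needed before Jamba landed natively in transformers
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Small LoRA adapter on the attention projections (module names are an assumption)
peft_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

# Tiny public dataset used purely as a placeholder for your own SFT data
dataset = load_dataset("Abirate/english_quotes", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    peft_config=peft_config,
    dataset_text_field="quote",
    max_seq_length=512,
)
trainer.train()
```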
1LittleCoder💻 (@1littlecoder)

🥹 Jamba is truly amazing!

Everyone talks about long context. But so far it has mostly been useful for ingesting in-context learning examples.

Jamba seems to be the first model offering great throughput even at higher context lengths!

AI21 Labs (@AI21Labs)

This Jamba overview is a great way to quickly understand the novel features Jamba brings to the dev community. Thanks Ai Flux!

Maxime Labonne (@maximelabonne)

I played a little with Jamba: it looks like an amazing model.

In terms of architecture, the MoE implementation is very close to Mixtral's. What's great about it is that it hasn't been fine-tuned. Curious to see how much improvement we can get through SFT.

I made a little…

Aleksa Gordić 🍿🤖 (@gordic_aleksa)

Extremely cool new model release from AI21 Labs: Jamba. And it's not even a transformer! It's a hybrid model that combines Mamba (a structured state space model), transformer layers, and the MoE technique, and it's the first production-grade Mamba-based model!

* It's a 52B MoE with 12B…

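The "52B MoE with 12B…" line is the total-versus-active parameter distinction: with top-2 routing over 16 experts, only a fraction of the expert weights participate in any one forward pass. The toy arithmetic below uses a made-up split between shared and expert parameters purely to show the mechanism; it is not Jamba's real breakdown.

```python
# Toy illustration of total vs. active parameters in a top-2-of-16 MoE.
# The shared/expert split below is invented for the example.
TOTAL_EXPERTS = 16
ACTIVE_EXPERTS = 2            # top-2 routing

shared_params_b = 6.0         # hypothetical non-expert parameters (billions)
expert_params_b = 46.0        # hypothetical parameters across all experts (billions)

total_b = shared_params_b + expert_params_b
active_b = shared_params_b + expert_params_b * ACTIVE_EXPERTS / TOTAL_EXPERTS

print(f"total parameters:  {total_b:.1f}B")   # 52.0B in this toy split
print(f"active per token:  {active_b:.1f}B")  # 11.8B in this toy split
```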
swyx (@swyx)

incredibly impressed by AI21 Labs' Jamba today. This is the first legitimate Mixtral-killer we've seen and it came out of 'nowhere':

buttondown.email/ainews/archive…

They've helped me redefine my idea of a model 'weight class' from 'number of parameters' (increasingly outdated with MoEs…
