Ahmad Al-Dahle (@ahmad_al_dahle)'s Twitter Profile
Ahmad Al-Dahle

@ahmad_al_dahle

#Girldad of twins. Leading GenAI @ Meta (llama, imagine, meta ai and more)

ID: 1773086505684348928

Link: https://ai.meta.com
Joined: 27-03-2024 20:35:59

269 Tweets

19.19K Followers

85 Following

Ahmad Al-Dahle (@ahmad_al_dahle)

As of today, Llama 4 Maverick offers a best-in-class performance to cost ratio with an experimental chat version scoring ELO of 1417 on LMArena.

It's wild to think Llama was a research project a couple of years ago & amazing to see how much progress we've made in the last two

Xeophon (@thexeophon)

Llama 4 Maverick has big model smell, thank you so much AI at Meta 🙏🏼 I have some prompts for an upcoming eval and based on those I tested, it is on the level of other frontier models. Really happy :)

Philip Kiely (@philip_kiely)

Llama 4 (Maverick) easily one-shots my Brick Breaker vibe check Output speed for 700+ words felt on par with ChatGPT/Claude on a good day using vLLM, excited to see how much faster we can run it!

AK (@_akhaliq)

llama-4-scout-17b-16e-instruct prompt: write a p5.js script that shows a ball bouncing inside a spinning hexagon. The ball should be affected by gravity and friction, and it must bounce off the rotating walls realistically

xjdr (@_xjdr)

L4 Maverick feels very much like a smarter 4o to me. i feel pretty confident in saying that was an explicit goal. i can also confirm it works very well at 1M+ ctx len. i dont have any 10M+ ctx evals but i'll try to throw something together just to satisfy my own curiosity

Dmytro Dzhulgakov (@dzhulgakov)

Meta Llama, number four, Coming Saturday, explore! Zuck announces, proud and loud, Fans and devs, a buzzing crowd. Llama 4, it’s on the way, Fireworks AI scrambles—hey! Startups racing, GPUs hot, “Launch the model—wait we cannot!” Llama Llama, context long, Support is deep,

Terry Yue Zhuo (@terryyuezhuo)

Llama-4 Series on BigCodeBench-Hard
*Inference via NVIDIA NIM

Llama-4 Maverick Ranked 41st/192
Similar to Gemini-2.0-Flash-Thinking & GPT-4o-2024-05-13
29.1% Complete
25% Instruct

Llama-4-Scout Ranked 97th/192
16.9% Complete
16.9% Instruct

Also, new visuals on the leaderboard!

rohan anil (@_arohan_)

Maverick is very similar cost / better perf to Gemini 2.0 Flash, I think it delivers. Inference wise its an interleaved MoE: - I am a bit paranoid how various deployment choices are: 1. For moe: do not drop tokens & pad!! 2. If you quantize heavily please test against model

xjdr (@_xjdr)

my detailed personal benchmarks ran overnight. - Scout is best at summarization and function calling. exactly what you want from a cheap long ctx model. this is going to be a workhorse in coding flows and RAG applications. the single shot ICL recall is very very good. -

Ahmad Al-Dahle (@ahmad_al_dahle)

We're glad to start getting Llama 4 in all your hands. We're already hearing lots of great results people are getting with these models. That said, we're also hearing some reports of mixed quality across different services. Since we dropped the models as soon as they were

Hatice Ozen (@ozenhati)

llama 4 scout on @groqinc paired with ElevenLabs is incredible for multilingual voice agents. insanely smooth even switching between different languages thanks to low latency. and for those who have been asking about its turkish - i've been testing and it's pretty good. :)

m_ric (@aymericroucher)

Llama-4-Maverick is CRAZY GOOD to power agents 🤯

It's now the top open model on smolagents LLM leaderboard, beating the much larger DeepSeek-R1!
Congrats Thomas Scialom and team!

Artificial Analysis (@artificialanlys)

Llama 4 Intelligence Index Update: We have now replicated Meta’s claimed values for MMLU Pro and GPQA Diamond, pushing our Intelligence Index scores for both Scout and Maverick higher

Key update details:
➤ We noted in our first post 48 hours ago that we noticed discrepancies

Unsloth AI (@unslothai)

You can now run Llama 4 on your local device!🦙

We shrank Maverick (402B) from 400GB to 122GB (-70%). Scout: 115GB to 33.8GB (-75%)

Our Dynamic 1.78bit GGUFs ensures optimal accuracy by selectively quantizing layers

GGUFs: huggingface.co/collections/un…
Guide: docs.unsloth.ai/basics/tutoria…

Lysandre (@lysandrejik)

Since the initial Attention is All you Need, 300 architectures have been contributed to Transformers. See the rise and fall of these architectures over time; crazy to see how BERT remains on top, but Llama is catching up fast!