Afroz Mohiuddin (@afrozenator) 's Twitter Profile
Afroz Mohiuddin

@afrozenator

Llama at Meta🦙, ex-Google Brain 🧠. Interested in Science, Psychology, Investing and generally everything.

Good Thoughts, Good Words, Good Deeds.

ID: 32336786

Link: https://github.com/afrozenator · Joined: 17-04-2009 07:00:02

475 Tweets

1.1K Followers

4.4K Following

Anthropic (@anthropicai) 's Twitter Profile Photo

New Anthropic research: Adding Error Bars to Evals. AI model evaluations don’t usually include statistics or uncertainty. We think they should. Read the blog post here: anthropic.com/research/stati…
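
The linked post argues that eval scores should be reported with uncertainty, not as bare point estimates. As a minimal illustration of the idea (a sketch, not Anthropic's actual methodology), a binary-graded eval with independently sampled questions admits a normal-approximation confidence interval on mean accuracy; the `eval_accuracy_with_ci` helper below is hypothetical:

```python
import math

def eval_accuracy_with_ci(correct, total, z=1.96):
    """Mean accuracy with a normal-approximation confidence interval.

    For a binary-graded eval each question scores 0 or 1, so the
    standard error of the mean accuracy p is sqrt(p * (1 - p) / n).
    z = 1.96 gives an approximate 95% interval.
    """
    p = correct / total
    se = math.sqrt(p * (1 - p) / total)
    return p, (p - z * se, p + z * se)

# Example: 850 of 1000 questions answered correctly.
acc, (lo, hi) = eval_accuracy_with_ci(correct=850, total=1000)
print(f"accuracy = {acc:.3f}, 95% CI = [{lo:.3f}, {hi:.3f}]")
```

Even at 1000 questions the interval here spans roughly ±2.2 accuracy points, which is why small leaderboard gaps between models can be statistically meaningless.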

Lilly (@lillybilly299) 's Twitter Profile Photo

When my mother first moved to Japan, she tried to jaywalk while pushing a stroller on an empty residential street. She was immediately stopped by an old, well dressed Japanese man who solemnly told her in perfect English "the downfall of society begins with the individual"

Nikunj Kothari (@nikunj) 's Twitter Profile Photo

I feel a little bit for the Google DeepMind team... You build a world-changing model and everyone is posting Ghibli-fied pictures instead. But this is the core problem with Google - they can build the best models in the world but if they don’t focus on the consumer experience

Aston Zhang (@astonzhangaz) 's Twitter Profile Photo

Our Llama 4’s industry leading 10M+ multimodal context length (20+ hours of video) has been a wild ride. The iRoPE architecture I’d been working on helped a bit with the long-term infinite context goal toward AGI. Huge thanks to my incredible teammates!

🚀Llama 4 Scout
🔹17B
Sharan Narang (@sharan0909) 's Twitter Profile Photo

Very excited to share Llama 4 models with the world. The pre-training team has cooked over the past few months to launch Llama 4 Scout, Maverick, and Behemoth.

A 🧵about pretraining

Blog link:  ai.meta.com/blog/llama-4-m…
AI at Meta (@aiatmeta) 's Twitter Profile Photo

Today is the start of a new era of natively multimodal AI innovation.

Today, we’re introducing the first Llama 4 models: Llama 4 Scout and Llama 4 Maverick —  our most advanced models yet and the best in their class for multimodality.

Llama 4 Scout
• 17B-active-parameter model
Afroz Mohiuddin (@afrozenator) 's Twitter Profile Photo

Extremely proud to have pioneered large scale distillation for Maverick and really delighted to be working alongside an extremely talented team. We truly hope the OSS community enjoys the fruits of our labour.

Dieuwke Hupkes (@_dieuwke_) 's Twitter Profile Photo

So happy our new multilingual benchmark MultiLoKo is finally out (after some sweat and tears!)

arxiv.org/abs/2504.10356

Multilingual eval for LLMs... could be better, and I hope MultiLoKo will help fill some gaps in it + help study design choices in benchmark design

AI at Meta
Lukasz Kaiser (@lukaszkaiser) 's Twitter Profile Photo

o3: Leibniz wanted a single calculus, every thought settled by calculation. An LLM does just that: it turns all words into numbers and learns the patterns that link them. Its CoT is the arithmetic of reasoning. The old universal logic lives now in silicon, humming behind each reply

Afroz Mohiuddin (@afrozenator) 's Twitter Profile Photo

“The test of a first-rate intelligence is the ability to hold two opposing ideas in mind at the same time and still retain the ability to function. One should, for example, be able to see that things are hopeless yet be determined to make them otherwise.”

F. Scott Fitzgerald
Afroz Mohiuddin (@afrozenator) 's Twitter Profile Photo

"Raffiniert ist der Herrgott, aber boshaft ist er nicht"
(God is subtle*, but malicious he is not.)

— Albert Einstein

* Also translated as: tricky, crafty, shrewd, sophisticated

yung macro 宏观年少传奇 (@apralky) 's Twitter Profile Photo

this dostoevsky quote is a massive whitepill if your intuitions are statistically mature btw

if he perma grinded like a good boy instead of "idling around" and "making errors" the probability that he'd have died as a noname normie journalist rounds to 1 

if you pay attention
Jacob Austin (@jacobaustin132) 's Twitter Profile Photo

Today we're putting out an update to the JAX TPU book, this time on GPUs. How do GPUs work, especially compared to TPUs? How are they networked? And how does this affect LLM training? 1/n
