Sam Skjonsberg (@codeviking)'s Twitter Profile
Sam Skjonsberg

@codeviking

Mostly code. Always coffee. Sometimes bikes. Views are my own.

ID: 119916149

Link: http://codeviking.net · Joined: 04-03-2010 23:10:59

1.1K Tweets

290 Followers

417 Following

Sam Skjonsberg (@codeviking)'s Twitter Profile Photo

I'm hiring. Come help us build SLURM 2.0 and (try to) make GPUs behave: job-boards.greenhouse.io/thealleninstit… Ai2 is a one-of-a-kind place and my team is doing great work. Join us!

Nathan Lambert (@natolambert)'s Twitter Profile Photo

Ai2 released OLMoE today. It's our best model to date.
- 1.3B active, 6.9B total parameters, 64 experts per layer
- Trained on 5T tokens from DCLM baseline + Dolma
- New preview of the Tulu 3 post-training recipe
- Fully open source
- Actually SOTA for ~1B active params

I'm most …
Ai2 (@allen_ai)'s Twitter Profile Photo

We're hiring! Come help us redefine how researchers and engineers use state-of-the-art GPU clusters: job-boards.greenhouse.io/thealleninstit…

Sam Skjonsberg (@codeviking)'s Twitter Profile Photo

Come join my team! We work on really interesting systems that brilliant practitioners use to build and run frontier AI workloads.

Sam Skjonsberg (@codeviking)'s Twitter Profile Photo

Yet another big step forward for fully open models. It’s been really exciting to work on the underlying compute infrastructure. …and if that sounds fun / interesting (which it is) — join my team: job-boards.greenhouse.io/thealleninstit…

Kyle Lo (@kylelostat)'s Twitter Profile Photo

kicking off 2025 with our OLMo 2 tech report while payin homage to the sequelest of sequels 🫡

🚗 2 OLMo 2 Furious 🔥 is everythin we learned since OLMo 1, with deep dives into:

🚖 stable pretrain
🚔 lr anneal 🤝 data curricula 🤝 soups
🚘 tulu post-train
🚜 compute infra

👇🧵
Ai2 (@allen_ai)'s Twitter Profile Photo

Can AI really help with literature reviews? 🧐

Meet Ai2 ScholarQA, an experimental solution that allows you to ask questions that require multiple scientific papers to answer. It gives more in-depth, detailed, and contextual answers with table comparisons, expandable sections …
Ai2 (@allen_ai)'s Twitter Profile Photo

Here is Tülu 3 405B 🐫 our open-source post-training model that surpasses the performance of DeepSeek-V3! The last member of the Tülu 3 family demonstrates that our recipe, which includes Reinforcement Learning with Verifiable Rewards (RLVR), scales to 405B - with performance on …
Sam Skjonsberg (@codeviking)'s Twitter Profile Photo

This is really cool. A SoTA model on your phone. Now I can chat with our models while I'm riding the ferry (though the group of regulars I sit w/ is also pretty fun). Yet another important release from Ai2 that both demonstrates what's possible and makes AI more accessible.

Luca Soldaini ✈️ ICLR 25 (@soldni)'s Twitter Profile Photo

So many tokens in PDFs 📜 yet so hard to extract them 🔎

Not anymore! olmOCR gives you a plain-text version of any doc you can think of: science papers, old scans, brochures with weird layouts, even handwriting ✍️

Try it today 👇
Ai2 (@allen_ai)'s Twitter Profile Photo

We’re excited to share some updates to Ai2 ScholarQA:
🗂️ You can now sign in via Google to save your query history across devices and browsers.
📚 We added 108M+ paper abstracts to our corpus - expect to get even better responses!
✨ The backbone model has been updated to the …
Ai2 (@allen_ai)'s Twitter Profile Photo

Announcing OLMo 2 32B: the first fully open model to beat GPT-3.5 & GPT-4o mini on a suite of popular, multi-skill benchmarks.

Comparable to the best open-weight models, but with a fraction of the training compute. When you have a good recipe, ✨ magical things happen when you scale it up!
Sam Skjonsberg (@codeviking)'s Twitter Profile Photo

My team had fun getting these up quickly. It definitely took a bit of elbow grease/GPU sorcery b/c the hardware is so new. Onward 🚀!

Ai2 (@allen_ai)'s Twitter Profile Photo

For years it’s been an open question — how much is a language model learning and synthesizing information, and how much is it just memorizing and reciting? Introducing OLMoTrace, a new feature in the Ai2 Playground that begins to shed some light. 🔦

Ai2 (@allen_ai)'s Twitter Profile Photo

We're excited to round out the OLMo 2 family with its smallest member, OLMo 2 1B, surpassing peer models like Gemma 3 1B or Llama 3.2 1B. The 1B model should enable rapid iteration for researchers, more local development, and a more complete picture of how our recipe scales.
Ai2 (@allen_ai)'s Twitter Profile Photo

Have questions? We’re an open book!

We’re excited to host an AMA to answer your Qs about OLMo, our family of open language models. 

🗓️ When: May 8, 8-10 am PT
🌐 Where: r/huggingface
🧠 Why: Gain insights from our expert researchers

Chat soon!
Ai2 (@allen_ai)'s Twitter Profile Photo

We’re live on Reddit! Ask us Anything about our OLMo family of models. We have six of our researchers on hand to answer all your questions.
Nathan Lambert (@natolambert)'s Twitter Profile Photo

My latest post: The American DeepSeek Project

Build fully open models in the US in the next two years to enable a flourishing, global scientific AI ecosystem to balance China's surge in open source and to offer an alternative to building products on top of leading closed models.
Ai2 (@allen_ai)'s Twitter Profile Photo

With fresh support of $75M from U.S. National Science Foundation and $77M from @NVIDIA, we’re set to scale our open model ecosystem, bolster the infrastructure behind it, and fast‑track reproducible AI research to unlock the next wave of scientific discovery. 💡
