Sam Skjonsberg (@codeviking)'s Twitter Profile
Sam Skjonsberg

@codeviking

Mostly code. Always coffee. Sometimes bikes. Views are my own.

ID: 119916149

Link: http://codeviking.net · Joined: 04-03-2010 23:10:59

1.1K Tweets

290 Followers

417 Following

Sam Skjonsberg (@codeviking)'s Twitter Profile Photo

I'm hiring. Come help us build SLURM 2.0 and (try to) make GPUs behave: job-boards.greenhouse.io/thealleninstit… Ai2 is a one-of-a-kind place and my team is doing great work. Join us!

Nathan Lambert (@natolambert)'s Twitter Profile Photo

Ai2 released OLMoE today. It's our best model to date.
- 1.3B active, 6.9B total parameters, 64 experts per layer
- Trained on 5T tokens from DCLM baseline + Dolma
- New preview of the Tulu 3 post-training recipe
- Fully open source
- Actually SOTA for ~1B active params

I'm most …
Ai2 (@allen_ai)'s Twitter Profile Photo

We're hiring! Come help us redefine how researchers and engineers use state-of-the-art GPU clusters: job-boards.greenhouse.io/thealleninstit…

Sam Skjonsberg (@codeviking)'s Twitter Profile Photo

Come join my team! We work on really interesting systems that brilliant practitioners use to build and run frontier AI workloads.

Sam Skjonsberg (@codeviking)'s Twitter Profile Photo

Yet another big step forward for fully open models. It’s been really exciting to work on the underlying compute infrastructure. …and if that sounds fun / interesting (which it is) — join my team: job-boards.greenhouse.io/thealleninstit…

Kyle Lo (@kylelostat)'s Twitter Profile Photo

kicking off 2025 with our OLMo 2 tech report while payin homage to the sequelest of sequels 🫡

🚗 2 OLMo 2 Furious 🔥 is everythin we learned since OLMo 1, with deep dives into:

🚖 stable pretrain
🚔 lr anneal 🤝 data curricula 🤝 soups
🚘 tulu post-train
🚜 compute infra

👇🧵
Ai2 (@allen_ai)'s Twitter Profile Photo

Can AI really help with literature reviews? 🧐

Meet Ai2 ScholarQA, an experimental solution that allows you to ask questions that require multiple scientific papers to answer. It gives more in-depth, detailed, and contextual answers with table comparisons, expandable sections …
Ai2 (@allen_ai)'s Twitter Profile Photo

Here is Tülu 3 405B 🐫 our open-source post-training model that surpasses the performance of DeepSeek-V3! The last member of the Tülu 3 family demonstrates that our recipe, which includes Reinforcement Learning with Verifiable Rewards (RLVR), scales to 405B - with performance on …
Sam Skjonsberg (@codeviking)'s Twitter Profile Photo

This is really cool. A SoTA model on your phone. Now I can chat with our models while I'm riding the ferry (though the group of regulars I sit w/ is also pretty fun). Yet another important release from Ai2 that both demonstrates what's possible and makes AI more accessible.

Luca Soldaini ✈️ ICLR 25 (@soldni)'s Twitter Profile Photo

So many tokens in PDFs 📜 yet so hard to extract them 🔎

Not anymore! olmOCR gives you a plain-text version of any doc you can think of: science papers, old scans, brochures with weird layouts, even handwriting ✍️

Try it today 👇
Ai2 (@allen_ai)'s Twitter Profile Photo

We’re excited to share some updates to Ai2 ScholarQA:
🗂️ You can now sign in via Google to save your query history across devices and browsers.
📚 We added 108M+ paper abstracts to our corpus - expect to get even better responses!
✨ The backbone model has been updated to the …
Ai2 (@allen_ai)'s Twitter Profile Photo

Announcing OLMo 2 32B: the first fully open model to beat GPT-3.5 & GPT-4o mini on a suite of popular, multi-skill benchmarks.

Comparable to the best open-weight models, but with a fraction of the training compute. When you have a good recipe, ✨ magical things happen when you scale it up!
Sam Skjonsberg (@codeviking)'s Twitter Profile Photo

My team had fun getting these up quickly. It definitely took a bit of elbow grease/GPU sorcery b/c the hardware is so new. Onward 🚀!

Ai2 (@allen_ai)'s Twitter Profile Photo

For years it’s been an open question — how much is a language model learning and synthesizing information, and how much is it just memorizing and reciting? Introducing OLMoTrace, a new feature in the Ai2 Playground that begins to shed some light. 🔦

Ai2 (@allen_ai)'s Twitter Profile Photo

We're excited to round out the OLMo 2 family with its smallest member, OLMo 2 1B, surpassing peer models like Gemma 3 1B or Llama 3.2 1B. The 1B model should enable rapid iteration for researchers, more local development, and a more complete picture of how our recipe scales.
Ai2 (@allen_ai)'s Twitter Profile Photo

Have questions? We’re an open book!

We’re excited to host an AMA to answer your Qs about OLMo, our family of open language models. 

🗓️ When: May 8, 8-10 am PT
🌐 Where: r/huggingface
🧠 Why: Gain insights from our expert researchers

Chat soon!
Ai2 (@allen_ai)'s Twitter Profile Photo

We’re live on Reddit! Ask us Anything about our OLMo family of models. We have six of our researchers on hand to answer all your questions.
Nathan Lambert (@natolambert)'s Twitter Profile Photo

My latest post: The American DeepSeek Project

Build fully open models in the US in the next two years to enable a flourishing, global scientific AI ecosystem to balance China's surge in open source and to offer an alternative to building products on top of leading closed models.
Ai2 (@allen_ai)'s Twitter Profile Photo

With fresh support of $75M from U.S. National Science Foundation and $77M from @NVIDIA, we’re set to scale our open model ecosystem, bolster the infrastructure behind it, and fast‑track reproducible AI research to unlock the next wave of scientific discovery. 💡
