Pradeep Dasigi (@pdasigi)'s Twitter Profile
Pradeep Dasigi

@pdasigi

Senior Research Scientist @allen_ai; #NLProc, Post-training for OLMo

ID: 20038834

Website: https://pdasigi.github.io/ | Joined: 04-02-2009 09:00:52

442 Tweets

1.1K Followers

505 Following

Costa Huang (@vwxyzjn):

😆 So happy OLMo 2 is out! We applied the same Tülu 3 RLVR recipe and it worked very nicely for our final 13B instruct model. Here are the gains/losses of allenai/OLMo-2-1124-13B-Instruct (RLVR's checkpoint) over hf-allenai_OLMo-2-1124-13B-DPO. More to share soon!

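For context on what that RLVR recipe optimizes: the policy gets its reward from a programmatic checker rather than a learned reward model. Below is a minimal sketch of such a verifiable reward, assuming a simple match-the-final-answer check; the helper names are illustrative and not the actual Tülu 3 implementation.

```python
# Minimal sketch of a verifiable reward in the spirit of RLVR
# (Reinforcement Learning with Verifiable Rewards). Names are
# illustrative assumptions, not the actual Tulu 3 code.
import re

def extract_final_answer(completion: str) -> str | None:
    """Pull the last number out of a model completion (toy heuristic)."""
    matches = re.findall(r"-?\d+(?:\.\d+)?", completion)
    return matches[-1] if matches else None

def verifiable_reward(completion: str, ground_truth: str) -> float:
    """Binary reward: 1.0 if the extracted answer matches the reference."""
    answer = extract_final_answer(completion)
    return 1.0 if answer == ground_truth.strip() else 0.0

if __name__ == "__main__":
    print(verifiable_reward("... so the total is 42.", "42"))  # 1.0
    print(verifiable_reward("I think it's 41.", "42"))         # 0.0
```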
Ai2 (@allen_ai):

Calling all predoctoral candidates: our OLMo team is hiring! Apply to be a Predoctoral Young Investigator today at the link in-thread 🧵

Interconnects (@interconnectsai):

OpenAI's o1 using "search" was a PSYOP. How to understand OpenAI's o1 models as really just one wacky, wonderful, long chain of thought. interconnects.ai/p/openais-o1-u…

Pradeep Dasigi (@pdasigi):

Our team at Ai2 (OLMo) is looking for a predoctoral researcher. You get to work on exciting research in building open LMs while preparing for a PhD. Apply here: job-boards.greenhouse.io/thealleninstit…

Ai2 (@allen_ai):

Remember Molmo? The full recipe is finally out! Training code, data, and everything you need to reproduce our models. Oh, and we have updated our tech report too! Links in thread 👇

Faeze Brahman (@faeze_brh):

Just arrived in 🇨🇦 to attend NeurIPS 2024! Excited to connect and chat about AI reliability and safety, resource-efficient approaches to AI alignment, inference-time scaling and anything in between! You can drop me a message/email ([email protected]) or find me at the

Pradeep Dasigi (@pdasigi):

Here's a significant update to Tülu 3: we scaled up the post-training recipe to Llama 3.1 405B. Tülu 3 405B beats Llama's 405B instruct model and also Deepseek V3. You can now access the model and the entire post-training pipeline. Huge shoutout to Hamish Ivison and Costa Huang who

Hamish Ivison (@hamishivi):

One additional thing in the updated Tulu 3 paper that I'd like to highlight is that Pradeep Dasigi went back and re-evaluated our mid-stage checkpoints on our held-out evals (Section 7.4). This lets us see what decisions generalized beyond the exact test sets we used! I think this is

Hanna Hajishirzi (@hannahajishirzi):

Excited to drive innovation and push the boundaries of open, scientific AI research & development! 🚀 Join us at Ai2 to shape the future of OLMo, Molmo, Tulu, and more. We’re hiring at all levels—apply now! 👇 #AI #Hiring Research Engineer job-boards.greenhouse.io/thealleninstit… Research

Ai2 (@allen_ai):

Introducing olmOCR, our open-source tool to extract clean plain text from PDFs! Built for scale, olmOCR handles many document types with high throughput. Run it on your own GPU for free—at over 3000 token/s, equivalent to $190 per million pages, or 1/32 the cost of GPT-4o!
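
As a rough sanity check on that pricing claim, the arithmetic below shows how a cost-per-million-pages figure falls out of throughput, tokens per page, and GPU price. The tokens-per-page and GPU-cost values are hypothetical assumptions chosen for illustration, not numbers from the olmOCR release.

```python
# Back-of-the-envelope cost of self-hosted OCR throughput. The only figure
# taken from the announcement is ~3000 tokens/s; tokens_per_page and the
# GPU price are hypothetical assumptions for illustration.
def cost_per_million_pages(tokens_per_sec: float,
                           tokens_per_page: float,
                           gpu_dollars_per_hour: float) -> float:
    pages_per_hour = tokens_per_sec * 3600 / tokens_per_page
    return 1_000_000 / pages_per_hour * gpu_dollars_per_hour

# With ~1000 output tokens per page and a ~$2/hr GPU, the estimate lands
# near the quoted figure.
print(round(cost_per_million_pages(3000, 1000, 2.0)))  # ~185
```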

Pradeep Dasigi (@pdasigi):

How to curate instruction tuning datasets while targeting specific skills? This is a common question developers face while post-training LMs. In this work led by Hamish Ivison, we found that simple embedding based methods scale much better than fancier computationally intensive
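
For a rough picture of what an embedding-based selection method looks like in practice, here is a minimal sketch: embed a handful of examples of the target skill, embed the candidate pool, and keep the candidates closest in cosine similarity. The embedding model and selection size are assumptions for illustration, not the exact setup from the paper.

```python
# Minimal sketch of embedding-based selection of instruction-tuning data
# targeting a specific skill. Model choice and k are illustrative.
import numpy as np
from sentence_transformers import SentenceTransformer  # assumed dependency

def select_for_skill(skill_examples: list[str],
                     candidate_pool: list[str],
                     k: int = 1000) -> list[str]:
    model = SentenceTransformer("all-MiniLM-L6-v2")
    skill_emb = model.encode(skill_examples, normalize_embeddings=True)
    pool_emb = model.encode(candidate_pool, normalize_embeddings=True)
    sims = pool_emb @ skill_emb.T      # cosine similarity, shape (pool, skill)
    scores = sims.max(axis=1)          # closeness to the nearest skill example
    top = np.argsort(-scores)[:k]
    return [candidate_pool[i] for i in top]
```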

Ai2 (@allen_ai):

Announcing OLMo 2 32B: the first fully open model to beat GPT 3.5 & GPT-4o mini on a suite of popular, multi-skill benchmarks. Comparable to best open-weight models, but a fraction of training compute. When you have a good recipe, ✨ magical things happen when you scale it up!

Nathan Lambert (@natolambert):

A very exciting day for open-source AI! We're releasing our biggest open source model yet -- OLMo 2 32B -- and it beats the latest GPT 3.5, GPT 4o mini, and leading open weight models like Qwen and Mistral. As usual, all data, weights, code, etc. are available. For a long time,

Ai2 (@allen_ai):

We're excited to round out the OLMo 2 family with its smallest member, OLMo 2 1B, surpassing peer models like Gemma 3 1B or Llama 3.2 1B. The 1B model should enable rapid iteration for researchers, more local development, and a more complete picture of how our recipe scales.

Jesse Dodge (@jessedodge):

Percy Liang EleutherAI nice! we also recently trained a set of models on 25 different pretraining corpora, each corpus having 14 model sizes trained (4M to 1B), to 5x Chinchilla. We released 30,000+ checkpoints! x.com/allen_ai/statu… arxiv.org/pdf/2504.11393
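
As a quick note on scale, "5x Chinchilla" means training on roughly five times the Chinchilla-optimal token budget, commonly approximated as ~20 tokens per parameter. The sketch below applies that rule of thumb to the model-size endpoints mentioned above; the 20x constant is an approximation, and the exact budgets used in that work may differ.

```python
# "5x Chinchilla" under the common ~20-tokens-per-parameter heuristic.
# The 20x constant is an approximation, not the exact budget used here.
def training_tokens(params: float, chinchilla_multiple: float = 5.0) -> float:
    return 20 * params * chinchilla_multiple

for params in (4e6, 1e9):
    print(f"{params:.0e} params -> {training_tokens(params):.2e} tokens")
# 4e+06 params -> 4.00e+08 tokens  (0.4B)
# 1e+09 params -> 1.00e+11 tokens  (100B)
```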

Valentina Pyatkin (@valentina__py):

💡Beyond math/code, instruction following with verifiable constraints is suitable to be learned with RLVR. But the set of constraints and verifier functions is limited and most models overfit on IFEval. We introduce IFBench to measure model generalization to unseen constraints.

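To make "verifiable constraints" concrete, here is a minimal sketch of the kind of programmatic checker that can double as a binary reward; the specific constraints and function names are illustrative assumptions, not IFBench's actual verifier set.

```python
# Toy constraint verifiers in the spirit of verifiable instruction
# following. Each returns True/False, so it can serve as a binary reward.
# Constraint choices are illustrative, not IFBench's actual set.
import re

def verify_word_count(response: str, min_words: int, max_words: int) -> bool:
    """Constraint: response length must fall within [min_words, max_words]."""
    return min_words <= len(response.split()) <= max_words

def verify_num_bullets(response: str, expected: int) -> bool:
    """Constraint: response must contain exactly `expected` bullet points."""
    return len(re.findall(r"^\s*[-*] ", response, flags=re.MULTILINE)) == expected

if __name__ == "__main__":
    text = "- first point\n- second point\n- third point"
    print(verify_num_bullets(text, 3))     # True
    print(verify_word_count(text, 2, 10))  # True
```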