Pradeep Dasigi
@pdasigi
Senior Research Scientist @allen_ai; #NLProc, Post-training for OLMo
ID: 20038834
https://pdasigi.github.io/ · Joined 04-02-2009 09:00:52
442 Tweets
1.1K Followers
505 Following
Just arrived in 🇨🇦 to attend NeurIPS 2024! Excited to connect and chat about AI reliability and safety, resource-efficient approaches to AI alignment, inference-time scaling and anything in between! You can drop me a message/email ([email protected]) or find me at the
Here's a significant update to Tülu 3: we scaled up the post-training recipe to Llama 3.1 405B. Tülu 3 405B beats Llama's 405B instruct model and also DeepSeek V3. You can now access the model and the entire post-training pipeline. Huge shoutout to Hamish Ivison and Costa Huang who
One additional thing in the updated Tulu 3 paper that I'd like to highlight is that Pradeep Dasigi went back and re-evaluated our mid-stage checkpoints on our held-out evals (Section 7.4). This lets us see what decisions generalized beyond the exact test sets we used! I think this is
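The idea behind that analysis is simple: if a recipe decision made using the development evals also ranks checkpoints the same way on unseen evals, the decision likely generalizes. Below is a minimal sketch of that kind of check, assuming each mid-stage checkpoint has an average score on both suites; the checkpoint names and scores are placeholders, not numbers from the paper.

```python
# Minimal sketch (not the Tulu 3 codebase): does the ranking of mid-stage
# checkpoints on the development evals hold up on the held-out evals?
# Checkpoint names and scores below are illustrative placeholders.
from scipy.stats import spearmanr

dev_scores = {       # average score on the development eval suite
    "sft_mix_a": 61.2,
    "sft_mix_b": 63.5,
    "dpo_mix_a": 66.1,
    "dpo_mix_b": 65.4,
}
heldout_scores = {   # average score on the unseen (held-out) eval suite
    "sft_mix_a": 58.9,
    "sft_mix_b": 60.7,
    "dpo_mix_a": 63.2,
    "dpo_mix_b": 63.0,
}

checkpoints = list(dev_scores)
rho, _ = spearmanr(
    [dev_scores[c] for c in checkpoints],
    [heldout_scores[c] for c in checkpoints],
)
# High rank correlation suggests the recipe decisions made on the development
# suite generalize beyond those exact test sets.
print(f"Spearman rho between dev and held-out rankings: {rho:.2f}")
```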
Excited to drive innovation and push the boundaries of open, scientific AI research & development! 🚀 Join us at Ai2 to shape the future of OLMo, Molmo, Tulu, and more. We’re hiring at all levels—apply now! 👇 #AI #Hiring Research Engineer job-boards.greenhouse.io/thealleninstit… Research
How do you curate instruction-tuning datasets while targeting specific skills? This is a common question developers face while post-training LMs. In this work led by Hamish Ivison, we found that simple embedding-based methods scale much better than fancier, computationally intensive
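For a rough picture of what "embedding-based" selection looks like, here is a minimal sketch: score each candidate instruction example by its cosine similarity to a handful of examples of the target skill, then keep the top-k. The embed() function and the data are stand-ins, and this illustrates the general idea rather than the exact method in the paper.

```python
# Minimal sketch of embedding-based data selection for a target skill.
# embed() is a placeholder for whatever encoder/representation you use;
# the candidate pool and target examples are illustrative.
import numpy as np

def embed(texts):
    # Stand-in encoder: replace with your embedding model of choice.
    rng = np.random.default_rng(0)
    return rng.normal(size=(len(texts), 768))

pool = ["instruction example 1 ...", "instruction example 2 ...", "instruction example 3 ..."]
target = ["a few examples of the skill you want to improve ..."]

def normalize(x):
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

pool_emb = normalize(embed(pool))
target_emb = normalize(embed(target).mean(axis=0, keepdims=True))

# Cosine similarity of every pool example to the mean target embedding.
scores = (pool_emb @ target_emb.T).ravel()
top_k = np.argsort(-scores)[:2]      # keep the k most similar examples
selected = [pool[i] for i in top_k]
print(selected)
```

The appeal is that the expensive part is a single embedding pass over the pool, which scales to millions of candidates far more cheaply than methods that require extra forward/backward passes per example.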
Percy Liang EleutherAI nice! We also recently trained a set of models on 25 different pretraining corpora, with 14 model sizes (4M to 1B) trained per corpus, to 5x Chinchilla. We released 30,000+ checkpoints! x.com/allen_ai/statu… arxiv.org/pdf/2504.11393
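For context on the token budgets implied by "5x Chinchilla", here is a quick back-of-the-envelope calculation assuming the common ~20 tokens-per-parameter Chinchilla-optimal rule of thumb; the exact ratio used in the release may differ.

```python
# Rough token budgets implied by "5x Chinchilla", assuming the common
# ~20 tokens-per-parameter Chinchilla-optimal rule of thumb.
CHINCHILLA_TOKENS_PER_PARAM = 20
MULTIPLIER = 5

for params in (4e6, 1e9):   # the 4M and 1B ends of the model-size range
    tokens = params * CHINCHILLA_TOKENS_PER_PARAM * MULTIPLIER
    print(f"{params:,.0f} params -> ~{tokens / 1e9:.1f}B tokens")
# e.g. a 1B-parameter model trained to 5x Chinchilla sees roughly 100B tokens.
```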