Iz Beltagy (@i_beltagy)'s Twitter Profile
Iz Beltagy

@i_beltagy

Cofounder @SpiffyAI, Research Lead building OLMo at @allenai_org, formerly @UTCompSci PhD.

ID: 417436602

Joined: 20-11-2011 23:02:37

140 Tweets

1.1K Followers

425 Following

Ai2 (@allen_ai)'s Twitter Profile Photo

In a special episode of AllenNLP's #NLPHighlights podcast, Pradeep Dasigi invited Iz Beltagy and Mechanical Dirk, the research and engineering leads of the OLMo project, to discuss building and open-sourcing language models: soundcloud.com/nlp-highlights…

Iz Beltagy (@i_beltagy)'s Twitter Profile Photo

This was a fun discussion. One thing we debated is compute cost and how it will change in the future. My guess is that training cost will continue to increase, not to train larger models, but to train longer on more data and more modalities. On the other hand, inference cost …

Iz Beltagy (@i_beltagy)'s Twitter Profile Photo

Heading to #ACL2023NLP in a few hours. Looking forward to chatting and catching up with everyone. I am particularly curious to hear people's thoughts about OLMo: what new ideas we should try, what model features you want to see, etc. I am also curious about startups, and …

Luca Soldaini 🎀 (@soldni)'s Twitter Profile Photo

Announcing Dolma, the dataset for Ai2's LLM, OLMo. It's 3+ trillion tokens (web/papers/code/books/wiki). We hope it will facilitate study of LLMs & their behavior! Released on Hugging Face w ImpACT license huggingface.co/datasets/allen… Overview/datasheet blog.allenai.org/dolma-3-trilli…

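Since the corpus is public on Hugging Face, here is a minimal sketch of how one might stream it with the `datasets` library. The dataset id `allenai/dolma`, the config handling, and the `text` field are assumptions inferred from the truncated links above, not confirmed details; check the dataset card for exact config names.

```python
# Minimal sketch: streaming Dolma from the Hugging Face Hub.
# Assumptions: the dataset id is "allenai/dolma" and each record
# exposes the raw document under a "text" field; the exact config
# name may differ, so consult the dataset card.
from datasets import load_dataset

# Stream instead of downloading: the corpus is 3+ trillion tokens.
dolma = load_dataset("allenai/dolma", split="train", streaming=True)

for i, doc in enumerate(dolma):
    print(doc["text"][:200])  # peek at the first 200 characters
    if i == 2:  # stop after a few documents
        break
```
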
Luca Soldaini 🎀 (@soldni)'s Twitter Profile Photo

Just released v0.9.0 of the Dolma toolkit 🍇 Lots of goodies (dataset tokenization support, new taggers, data analysis, etc), but the one I'm most proud of is that we now have.... ✨ proper documentation 💫 check it out at github.com/allenai/dolma/…, or `pip install dolma` 😊

Iz Beltagy (@i_beltagy)'s Twitter Profile Photo

Excited to give a talk today at the ELLIS LLM symposium about OLMo and open language models: sites.google.com/view/ellisfms2…

Iz Beltagy (@i_beltagy)'s Twitter Profile Photo

Thrilled to see OLMo featured in the NYT! Proud of my role alongside Mechanical Dirk in inspiring AI2 leadership to start this project. Got questions about OLMo? Happy to answer! nytimes.com/2023/10/19/tec…

Iz Beltagy (@i_beltagy)'s Twitter Profile Photo

DPO vs. PPO reminds me of the GloVe vs. word2vec narrative. Both GloVe and DPO distill the core insights of word2vec and PPO and simplify the math, which leads to a more accessible training approach that delivers results on par, if not superior, and is far simpler to implement.
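
For reference, the objective DPO optimizes (from Rafailov et al., 2023) is a single classification-style loss over preference pairs, which is where the simplicity comes from:

$$
\mathcal{L}_{\mathrm{DPO}}(\pi_\theta; \pi_{\mathrm{ref}}) = -\,\mathbb{E}_{(x,\,y_w,\,y_l) \sim \mathcal{D}}\left[\log \sigma\!\left(\beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)} - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}\right)\right]
$$

Here $y_w$ and $y_l$ are the preferred and rejected responses for prompt $x$, $\pi_{\mathrm{ref}}$ is a frozen reference policy, and $\beta$ controls how far $\pi_\theta$ may drift from it; unlike PPO, no separate reward model or RL loop is needed.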

Hamish Ivison (@hamishivi)'s Twitter Profile Photo

Check out the Tulu 2 suite 🐪, a set of Llama-2 models finetuned+DPO-trained on a mixture of publicly available datasets! Our best-performing models are competitive with SoTA open models on a range of benchmarks incl. AlpacaEval and MT-Bench. 📜Paper: arxiv.org/abs/2311.10702

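A hedged sketch of trying one of these checkpoints with `transformers`; the model id `allenai/tulu-2-dpo-7b` and the chat template below are assumptions based on the announcement, so check the model card for the exact format.

```python
# Sketch: sampling from a Tulu 2 DPO checkpoint via transformers.
# The model id and prompt template are assumptions (see model card).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allenai/tulu-2-dpo-7b"  # assumed checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Tulu-style chat prompt (assumed format).
prompt = "<|user|>\nExplain DPO in one sentence.\n<|assistant|>\n"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
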
Iz Beltagy (@i_beltagy)'s Twitter Profile Photo

OLMo-7b is finally out 🎉, and we are releasing everything: weights, intermediate checkpoints, training code and logs, training data and toolkit, evaluation and adaptation code and data. Most of it has been released, and the rest is coming soon. OLMo-65b and Adapted OLMo-7b are …

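For readers who want to try the weights, a minimal sketch with `transformers`, assuming the checkpoint id is `allenai/OLMo-7B`; the initial release shipped with custom modeling code, so `trust_remote_code=True` (backed by the `ai2-olmo` package) may be required.

```python
# Sketch: loading the released OLMo-7B weights.
# Assumptions: checkpoint id "allenai/OLMo-7B"; early releases
# needed `pip install ai2-olmo` plus trust_remote_code=True.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allenai/OLMo-7B"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

inputs = tokenizer("Language models are", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
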
Mechanical Dirk (@mechanicaldirk)'s Twitter Profile Photo

blog.allenai.org/hello-olmo-a-t… Putting this together has been one hell of a ride, starting as a little proposal to train a few more tokens into the BLOOM model and growing into the complete LLM training suite it is now. Can't wait to do it again for the 65B!

Kyle Lo (@kylelostat)'s Twitter Profile Photo

storytime🦉📖 OLMo was actually a bottom-up effort: less than a year ago, Mechanical Dirk and Iz Beltagy were still talking about what this project might look like and recruiting folks like me & Luca Soldaini 🎀 to take the lead on various tasks like data; our first brainstorming session was Feb 2nd

Ai2 (@allen_ai)'s Twitter Profile Photo

🎉We're celebrating our OLMo team's big win last night for Innovation of the Year at the GeekWire Awards! Thank you to the panel for recognizing our efforts towards open science, and to the other nominees for pushing us all towards better AI technology! geekwire.com/2024/geekwire-…
