Iz Beltagy (@i_beltagy)'s Twitter Profile
Iz Beltagy

@i_beltagy

Cofounder @SpiffyAI, Research Lead building OLMo at @allenai_org, formerly @UTCompSci PhD.

ID: 417436602

Joined: 20-11-2011 23:02:37

140 Tweets

1.1K Followers

425 Following

Ai2 (@allen_ai)'s Twitter Profile Photo

In a special episode of AllenNLP's #NLPHighlights podcast, Pradeep Dasigi invited Iz Beltagy and Mechanical Dirk, the research and engineering leads of the OLMo project, to discuss building and open-sourcing language models: soundcloud.com/nlp-highlights…

Iz Beltagy (@i_beltagy)'s Twitter Profile Photo

This was a fun discussion. One thing we debated is compute cost and how it will change in the future. My guess is that training cost will continue to increase, not to train larger models, but to train longer on more data and more modalities. On the other hand, inference cost …

Iz Beltagy (@i_beltagy)'s Twitter Profile Photo

Heading to #ACL2023NLP in a few hours. Looking forward to chatting and catching up with everyone. I am particularly curious to hear people's thoughts about OLMo: what new ideas we should try, what model features you want to see, etc. I am also curious about startups, and …

Luca Soldaini 🎀 (@soldni)'s Twitter Profile Photo

Announcing Dolma, the dataset for Ai2's LLM, OLMo. It's 3+ trillion tokens (web/papers/code/books/wiki). We hope it will facilitate study of LLMs & their behavior! Released on Hugging Face w ImpACT license huggingface.co/datasets/allen… Overview/datasheet blog.allenai.org/dolma-3-trilli…

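Since the corpus is public on Hugging Face, here is a minimal sketch of how one might stream it with the `datasets` library. The dataset id `allenai/dolma`, the config handling, and the `text` field are assumptions inferred from the truncated links above, not confirmed details; check the dataset card for exact config names.

```python
# Minimal sketch: streaming Dolma from the Hugging Face Hub.
# Assumptions: the dataset id is "allenai/dolma" and each record
# exposes the raw document under a "text" field; the exact config
# name may differ, so consult the dataset card.
from datasets import load_dataset

# Stream instead of downloading: the corpus is 3+ trillion tokens.
dolma = load_dataset("allenai/dolma", split="train", streaming=True)

for i, doc in enumerate(dolma):
    print(doc["text"][:200])  # peek at the first 200 characters
    if i == 2:  # stop after a few documents
        break
```
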
Luca Soldaini 🎀 (@soldni)'s Twitter Profile Photo

Just released v0.9.0 of the Dolma toolkit 🍇 Lots of goodies (dataset tokenization support, new taggers, data analysis, etc), but the one I'm most proud of is that we now have.... ✨ proper documentation 💫 check it out at github.com/allenai/dolma/…, or `pip install dolma` 😊

Iz Beltagy (@i_beltagy)'s Twitter Profile Photo

Excited to give a talk today at the ELLIS LLM symposium about OLMo and open language models: sites.google.com/view/ellisfms2…

Iz Beltagy (@i_beltagy)'s Twitter Profile Photo

Thrilled to see OLMo featured in the NYT! Proud of my role alongside Mechanical Dirk in inspiring AI2 leadership to start this project. Got questions about OLMo? Happy to answer! nytimes.com/2023/10/19/tec…

Iz Beltagy (@i_beltagy)'s Twitter Profile Photo

DPO vs. PPO reminds me of the GloVe vs. word2vec narrative. Both GloVe and DPO distill the core insights of word2vec and PPO and simplify the math, which leads to a more accessible training approach that delivers results on par, if not superior, and is far simpler to implement.
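
For reference, the objective DPO optimizes (from Rafailov et al., 2023) is a single classification-style loss over preference pairs, which is where the simplicity comes from:

$$
\mathcal{L}_{\mathrm{DPO}}(\pi_\theta; \pi_{\mathrm{ref}}) = -\,\mathbb{E}_{(x,\,y_w,\,y_l) \sim \mathcal{D}}\left[\log \sigma\!\left(\beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)} - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}\right)\right]
$$

Here $y_w$ and $y_l$ are the preferred and rejected responses for prompt $x$, $\pi_{\mathrm{ref}}$ is a frozen reference policy, and $\beta$ controls how far $\pi_\theta$ may drift from it; unlike PPO, no separate reward model or RL loop is needed.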

Hamish Ivison (@hamishivi)'s Twitter Profile Photo

Check out the Tulu 2 suite 🐪, a set of Llama-2 models finetuned+DPO-trained on a mixture of publicly available datasets! Our best-performing models are competitive with SoTA open models on a range of benchmarks incl. AlpacaEval and MT-Bench. 📜Paper: arxiv.org/abs/2311.10702

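A hedged sketch of trying one of these checkpoints with `transformers`; the model id `allenai/tulu-2-dpo-7b` and the chat template below are assumptions based on the announcement, so check the model card for the exact format.

```python
# Sketch: sampling from a Tulu 2 DPO checkpoint via transformers.
# The model id and prompt template are assumptions (see model card).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allenai/tulu-2-dpo-7b"  # assumed checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Tulu-style chat prompt (assumed format).
prompt = "<|user|>\nExplain DPO in one sentence.\n<|assistant|>\n"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
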
Iz Beltagy (@i_beltagy)'s Twitter Profile Photo

OLMo-7b is finally out 🎉, and we are releasing everything: weights, intermediate checkpoints, training code and logs, training data and toolkit, evaluation and adaptation code and data. Most of it has been released, and the rest is coming soon. OLMo-65b and Adapted OLMo-7b are …

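For readers who want to try the weights, a minimal sketch with `transformers`, assuming the checkpoint id is `allenai/OLMo-7B`; the initial release shipped with custom modeling code, so `trust_remote_code=True` (backed by the `ai2-olmo` package) may be required.

```python
# Sketch: loading the released OLMo-7B weights.
# Assumptions: checkpoint id "allenai/OLMo-7B"; early releases
# needed `pip install ai2-olmo` plus trust_remote_code=True.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allenai/OLMo-7B"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

inputs = tokenizer("Language models are", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
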
Mechanical Dirk (@mechanicaldirk)'s Twitter Profile Photo

blog.allenai.org/hello-olmo-a-t… Putting this together has been one hell of a ride, starting as a little proposal to train a few more tokens into the BLOOM model and growing into the complete LLM training suite it is now. Can't wait to do it again for the 65B!

Kyle Lo (@kylelostat)'s Twitter Profile Photo

storytime🦉📖 OLMo was actually a bottom-up effort: less than a year ago, Mechanical Dirk and Iz Beltagy were still talking about what this project might look like and recruiting folks like me & Luca Soldaini 🎀 to take the lead on various tasks like data; our first brainstorming session was Feb 2nd

Ai2 (@allen_ai)'s Twitter Profile Photo

🎉We're celebrating our OLMo team's big win last night for Innovation of the Year at the GeekWire Awards! Thank you to the panel for recognizing our efforts towards open science, and to the other nominees for pushing us all towards better AI technology! geekwire.com/2024/geekwire-…
