BigScience Large Model Training (@bigsciencellm)'s Twitter Profile
BigScience Large Model Training

@bigsciencellm

Follow the training of "BLOOM 🌸", the @BigScienceW multilingual 176B parameter open-science open-access language model, a research tool for the AI community.

ID: 1502036410081202180

Link: https://bigscience.notion.site/BigScience-176B-Model-Training-ad073ca07cdf479398d5f95d88e218c4
Joined: 10-03-2022 21:40:30

129 Tweets

8.8K Followers

1 Following

Saulnier Lucile (@lucilesaulnier)'s Twitter Profile Photo

🌸 BigScience Research Workshop BLOOM's intermediate checkpoints have already shown some very cool capabilities!

What's great about BLOOM is that you can ask it to generate the rest of a text - even though it is not yet fully trained! 👶

🧵 A thread with some examples
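A minimal sketch of what "asking BLOOM to generate the rest of a text" can look like with the Hugging Face transformers library; the small bigscience/bloom-560m checkpoint and the prompt are illustrative stand-ins, not the intermediate 176B checkpoints from the thread:

```python
# Illustrative sketch: continuing a prompt with a BLOOM checkpoint via transformers.
# "bigscience/bloom-560m" is a small stand-in (assumption) for the intermediate
# 176B checkpoints, which need multi-GPU hardware to load.
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "bigscience/bloom-560m"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

prompt = "The BigScience workshop trained a multilingual language model because"
inputs = tokenizer(prompt, return_tensors="pt")

# Ask the model to generate the rest of the text.
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```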
BigScience Large Model Training (@bigsciencellm)'s Twitter Profile Photo

For 111 days, we've enjoyed world-class hardware stability and throughput thanks to the hard work of our friends at Genci, @INS2I_CNRS, Megatron & DeepSpeed. Having reached our objective earlier than expected, we'll keep training for a few more days. Stay tuned, more soon ;)

BigScience Research Workshop (@bigsciencew)'s Twitter Profile Photo

BLOOM is here. The largest open-access multilingual language model ever. Read more about it or get it at
bigscience.huggingface.co/blog/bloom
hf.co/bigscience/blo…
Hugging Face (@huggingface)'s Twitter Profile Photo

The Technology Behind BLOOM Training 🌸 Discover how BigScience Research Workshop used Microsoft Research DeepSpeed + NVIDIA Megatron-LM technologies to train the World's Largest Open Multilingual Language Model (BLOOM): huggingface.co/blog/bloom-meg…
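To give a rough sense of the DeepSpeed half of that stack, here is a minimal sketch of wrapping a toy PyTorch model in a DeepSpeed engine; the config values are illustrative assumptions, not the actual Megatron-DeepSpeed setup used for the 176B run (which combined tensor, pipeline, and data parallelism across hundreds of GPUs):

```python
# Illustrative DeepSpeed usage with assumed config values (not the BLOOM config).
# Normally launched with the `deepspeed` CLI rather than plain `python`.
import deepspeed
import torch.nn as nn

model = nn.Sequential(nn.Linear(1024, 4096), nn.GELU(), nn.Linear(4096, 1024))

ds_config = {
    "train_micro_batch_size_per_gpu": 2,
    "gradient_accumulation_steps": 16,
    "bf16": {"enabled": True},
    "zero_optimization": {"stage": 1},  # ZeRO shards optimizer state across data-parallel ranks
    "optimizer": {"type": "AdamW", "params": {"lr": 6e-5}},
}

# deepspeed.initialize returns (engine, optimizer, dataloader, lr_scheduler).
engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)
```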

clem 🤗 (@clementdelangue)'s Twitter Profile Photo

What do Stability AI Emad #stablediffusion & BigScience Research Workshop Bloom - aka the coolest new models ;) - have in common?

They both use a new gen of ML licenses aimed at making ML more open & inclusive while keeping it harder to do harm with them. So cool!

huggingface.co/blog/open_rail
Niklas Muennighoff (@muennighoff)'s Twitter Profile Photo

Crosslingual Generalization through Multitask Finetuning 🌸

Demo: huggingface.co/bigscience/blo…
📜 arxiv.org/abs/2211.01786
💻 github.com/bigscience-wor…

We present BLOOMZ & mT0, a family of models w/ up to 176B params that follow human instructions in >100 languages zero-shot. 1/7
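To make the ">100 languages zero-shot" claim concrete, here is a minimal sketch of prompting a small BLOOMZ checkpoint with an instruction in French; the model id and prompt are illustrative assumptions, not taken from the thread:

```python
# Illustrative sketch of zero-shot instruction following with a BLOOMZ checkpoint.
# "bigscience/bloomz-560m" is a small stand-in (assumption) for the larger models
# in the BLOOMZ / mT0 family.
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "bigscience/bloomz-560m"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

# An instruction written in French ("Translate to English: ..."); BLOOMZ models
# are finetuned to follow prompts like this zero-shot across many languages.
prompt = "Traduis en anglais : « La recherche ouverte profite à tout le monde. »"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```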
clem 🤗 (@clementdelangue)'s Twitter Profile Photo

The Bloom paper is out. Looks like it's doing worse than current GPT3 API in zero-shot generation tasks in English but better than other open-source LLMs & better than all in zs multi-lingual (which was the main goal). Proud of the work from the community! arxiv.org/abs/2211.05100
