BigScience Large Model Training (@bigsciencellm)'s Twitter Profile
BigScience Large Model Training

@bigsciencellm

Follow the training of "BLOOM 🌸", the @BigScienceW multilingual 176B parameter open-science open-access language model, a research tool for the AI community.

ID: 1502036410081202180

Link: https://bigscience.notion.site/BigScience-176B-Model-Training-ad073ca07cdf479398d5f95d88e218c4 | Joined: 10-03-2022 21:40:30

129 Tweets

8.8K Followers

1 Following

Saulnier Lucile (@lucilesaulnier)'s Twitter Profile Photo

🌸 BigScience Research Workshop BLOOM's intermediate checkpoints have already shown some very cool capabilities! What's great about BLOOM is that you can ask it to generate the rest of a text, even if it is not yet fully trained! 👶 🧵 A thread with some examples

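A minimal sketch of the kind of prompt continuation described above, assuming the Hugging Face `transformers` library; the small `bigscience/bloom-560m` checkpoint is used as a stand-in for an intermediate checkpoint of the full model:

```python
# Minimal sketch: asking a BLOOM checkpoint to continue a text.
# The checkpoint below is a small released variant used as a stand-in;
# any BLOOM checkpoint follows the same pattern.
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "bigscience/bloom-560m"  # stand-in; swap in another checkpoint if desired

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

prompt = "The BigScience workshop is training a multilingual language model because"
inputs = tokenizer(prompt, return_tensors="pt")

# Sample a continuation of the prompt
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
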
BigScience Large Model Training (@bigsciencellm)'s Twitter Profile Photo

For 111 days, we've enjoyed world-class hardware stability and throughput thanks to the hard work of our friends at Genci, @INS2I_CNRS, Megatron & DeepSpeed. Having reached our objective earlier than expected, we'll keep training for a few more days. Stay tuned, more soon ;)

BigScience Research Workshop (@bigsciencew)'s Twitter Profile Photo

BLOOM is here. The largest open-access multilingual language model ever. Read more about it or get it at bigscience.huggingface.co/blog/bloom hf.co/bigscience/bloโ€ฆ

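For reference, a hedged sketch of one way to pull the released weights from the Hub with `transformers` (the full 176B model needs several hundred GB of memory; the smaller released variants use the same call):

```python
# Sketch: downloading BLOOM weights from the Hugging Face Hub.
# Loading the full 176B model requires very large amounts of memory;
# smaller variants (e.g. bloom-1b7, bloom-7b1) follow the same pattern.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "bigscience/bloom"  # or a smaller variant such as "bigscience/bloom-7b1"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",           # shard layers across available devices (needs `accelerate`)
    torch_dtype=torch.bfloat16,  # match the released weight precision
)
```
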
Hugging Face (@huggingface)'s Twitter Profile Photo

The Technology Behind BLOOM Training 🌸 Discover how BigScience Research Workshop used Microsoft Research DeepSpeed + NVIDIA Megatron-LM technologies to train the World's Largest Open Multilingual Language Model (BLOOM): huggingface.co/blog/bloom-megโ€ฆ
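The linked post covers the actual Megatron-DeepSpeed setup; purely as an illustration of the DeepSpeed side, a training script wraps the model with a small config like the placeholder below (values are illustrative, not BLOOM's real settings):

```python
# Rough sketch of wrapping a model with DeepSpeed; run under the `deepspeed`
# launcher. Config values are placeholders for illustration only; BLOOM's
# actual training used the Megatron-DeepSpeed fork with 3D parallelism.
import deepspeed
import torch

model = torch.nn.Linear(1024, 1024)  # stand-in for a real transformer

ds_config = {
    "train_micro_batch_size_per_gpu": 2,
    "gradient_accumulation_steps": 8,
    "bf16": {"enabled": True},
    "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
    "zero_optimization": {"stage": 1},  # ZeRO stage 1: shard optimizer states
}

# Returns the wrapped engine plus optimizer / dataloader / scheduler handles
model_engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)
```
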

clem 🤗 (@clementdelangue)'s Twitter Profile Photo

What do Stability AI Emad #stablediffusion & BigScience Research Workshop Bloom - aka the coolest new models ;) - have in common? They both use a new gen of ML licenses aimed at making ML more open & inclusive while keeping it harder to do harm with them. So cool! huggingface.co/blog/open_rail

Niklas Muennighoff (@muennighoff)'s Twitter Profile Photo

Crosslingual Generalization through Multitask Finetuning 🌸 Demo: huggingface.co/bigscience/bloโ€ฆ 📜 arxiv.org/abs/2211.01786 💻 github.com/bigscience-worโ€ฆ We present BLOOMZ & mT0, a family of models w/ up to 176B params that follow human instructions in >100 languages zero-shot. 1/7

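A hedged sketch of the zero-shot instruction following described in the thread, assuming the `transformers` text-generation pipeline and one of the smaller released BLOOMZ checkpoints:

```python
# Sketch: zero-shot instruction following with a small BLOOMZ checkpoint.
# Prompts can be written in any of the supported languages; the checkpoint
# name below is one of the smaller released variants, used for illustration.
from transformers import pipeline

generator = pipeline("text-generation", model="bigscience/bloomz-560m")

prompt = "Translate to Spanish: I love open science."
print(generator(prompt, max_new_tokens=20)[0]["generated_text"])
```
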
clem 🤗 (@clementdelangue)'s Twitter Profile Photo

The Bloom paper is out. Looks like it's doing worse than the current GPT-3 API on zero-shot generation tasks in English, but better than other open-source LLMs & better than all in zero-shot multilingual (which was the main goal). Proud of the work from the community! arxiv.org/abs/2211.05100
