Carlos Muñoz Ferrandis (@carlos_mferr)'s Twitter Profile
Carlos Muñoz Ferrandis

@carlos_mferr

Tech Lawyer | PhD Researcher | Views my own

ID: 1368606866016722944

Link: http://linkedin.com/in/carlos-muñoz-ferrandis-a22592105

Joined: 07-03-2021 16:58:17

298 Tweets

521 Followers

692 Following

Loubna Ben Allal (@loubnabenallal1)

🧵 Here's how we're tackling text de-identification to remove personal information from code datasets at BigCode: - An annotated benchmark 📑 - A pipeline for PII detection and anonymization 🚀 - A demo to visualize anonymized samples 🔍 (1/n)

BigCode (@bigcodeproject)

Between now and Christmas🎄 we are running a series of experiments to figure out what the best pre-processing is for code datasets such as The Stack. We'll share the W&B dashboards of these 🎅-models so if you are interested you can follow along!

BigCode (@bigcodeproject)

Today we are releasing The Stack v1.1! 🚀 We added more data, included more programming languages, and extended the list of permissive licenses used. huggingface.co/datasets/bigco… Also the first batch of opt-out requests was removed from the dataset.

JJ (@josephjacks_)

Sufficiently advanced Neural Networks are fundamentally a new species of computation. They are not source code. Thus, they cannot be licensed as open source (which was designed for source code)…

Yacine Jernite (@yjernite)

We're releasing our second Hugging Face Ethics&Society newsletter, it's a Big One 🤗 The Winter edition talks bias in ML, tools from the 🤗 team to address it, and the role of collaboration across the ML development chain in mitigating risks 🧑‍🤝‍🧑 hf.co/blog/ethics-so… 1/5

clem 🤗 (@clementdelangue)

We crossed 100,000 public AI models on the Hugging Face hub available for free to all. Thank you to the whole community of contributors. Proud to make ML more open & collaborative!

BigCode (@bigcodeproject)

Announcing a holiday gift: 🎅SantaCoder - a 1.1B multilingual LM for code that outperforms much larger open-source models on both left-to-right generation and infilling! Demo: hf.co/spaces/bigcode… Paper: hf.co/datasets/bigco… Attribution: hf.co/spaces/bigcode… A🧵:

BigCode (@bigcodeproject)

Finally, we summarized our findings in a technical report with a wonderful group of collaborators: Paper: hf.co/datasets/bigco… So what's next? Scaling to larger models and training more languages early next year! 🚀

Hugging Face (@huggingface)

It's been an exciting year for 🤗Transformers. We tripled the number of weekly active users over 2022, with over 1M users most weeks now and 300k daily pip installs on average🤯

RAIL (@responsibleail)

We are now inviting participation from the community to help shape the future of Responsible AI Licenses (RAIL)! RAIL has recently been adopted by many new AI materials (e.g. BigCode)...