XetHub (@xetdata)'s Twitter Profile
XetHub

@xetdata

XetHub enables ML teams to collaborate effectively on massive datasets.

Now part of @HuggingFace 🤗!

ID: 1499078781390110721

https://huggingface.co/xet-team

Joined 02-03-2022 17:47:04

62 Tweets

368 Followers

13 Following

Ann Huang (@annintweetd)

It's the end of an era. We just shut down our XetHub servers after 658 days in production. 🪦 What did we learn? xethub.com/blog/shutting-…

Ann Huang (@annintweetd)

Deduplicating evolving datasets is a no-brainer - store differences instead of full versions of each one. But format matters! Here's how appends, modifications, and deletes on Apache Parquet files (~20% of what's stored on Hugging Face Hub) deduplicate. 🧵

Gradio (@gradio)

Welcome, Gradio 5 👋 We’ve been hard at work over the past few months, and we are excited to announce today the stable release of Gradio 5! With more than 2 million users every month (and >470,000 apps on Hugging Face Spaces), Gradio has become the default way to build,

Ann Huang (@annintweetd)

Did you know that @Huggingface Hub holds over 29 PB of Git LFS files across datasets, models, and spaces? 📈 That's the equivalent of 64 Common Crawl Foundation downloads - and it's growing every day. So what's inside? 🧵

Ann Huang (@annintweetd)

Sweet visualization of S3 PUT requests to @HuggingFace Hub over a 24-hour period, showing upload density across the globe. 🌏

Ann Huang (@annintweetd)

We're turning Hugging Face Hub's files into content-defined chunks to speed up your workflows! ⚡️ This means:

- 🧠 We store your file as deduplicated chunks
- ⏩ You only upload changed chunks when iterating!
- 🚀 Pulling changes? Only download changed chunks!
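
For readers curious what "content-defined chunks" might look like in practice, here is a minimal Python sketch. It is not the Hub's actual implementation: the names `chunk` and `chunks_to_upload`, the toy size limits, and the use of a per-window BLAKE2 hash (production systems typically use a much faster rolling hash) are all assumptions made for illustration. The idea it shows is that chunk boundaries are picked from the bytes themselves, so an edit in the middle of a file only disturbs the chunks near it.

```python
import hashlib
import random

# Toy parameters (assumed for illustration; real systems use far larger chunks).
WINDOW = 16            # bytes examined when deciding on a boundary
MIN_CHUNK = 16         # never cut before this many bytes (must be >= WINDOW)
MAX_CHUNK = 256        # always cut once a chunk reaches this size
MASK = (1 << 6) - 1    # cut when the low 6 bits of the window hash are zero


def chunk(data: bytes) -> list[bytes]:
    """Split data at boundaries chosen from the content itself."""
    chunks = []
    start = 0
    for i in range(len(data)):
        length = i - start + 1
        if length < MIN_CHUNK:
            continue
        # Hash the last WINDOW bytes; a stand-in for a rolling hash.
        window = data[i - WINDOW + 1 : i + 1]
        h = int.from_bytes(hashlib.blake2b(window, digest_size=4).digest(), "big")
        if (h & MASK) == 0 or length >= MAX_CHUNK:
            chunks.append(data[start : i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])  # trailing partial chunk
    return chunks


def chunks_to_upload(old: bytes, new: bytes) -> list[bytes]:
    """Return the chunks of `new` whose hashes are not already stored for `old`."""
    stored = {hashlib.sha256(c).digest() for c in chunk(old)}
    return [c for c in chunk(new) if hashlib.sha256(c).digest() not in stored]


# Demo: insert a few bytes in the middle of a pseudo-random "file".
rng = random.Random(0)
old_file = bytes(rng.getrandbits(8) for _ in range(20_000))
new_file = old_file[:5_000] + b"0123456789" + old_file[5_000:]

total = len(chunk(new_file))
changed = len(chunks_to_upload(old_file, new_file))
print(f"{changed} of {total} chunks would need uploading")
```

With fixed-size blocks, the same ten-byte insertion would shift every later block and make all of them look new. Because the boundaries here depend on content, the chunker falls back into step with the old boundaries shortly after the edit.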