Hu Xu (@hu_hsu)'s Twitter Profile
Hu Xu

@hu_hsu

Facebook AI Research, MetaCLIP, Data Research

ID: 2874803443

Link: https://howardhsu.github.io/ · Joined: 24-10-2014 07:43:07

157 Tweets

510 Followers

596 Following

Hu Xu (@hu_hsu)

Great to see the MetaCLIP algorithm (arxiv.org/abs/2309.16671) desaturate the SSL training distribution as SSL 2.0. What's next in SSL or pre-training? From our data research perspective, it's likely about how to automatically desaturate a training distribution.

Hu Xu (@hu_hsu)

Heading to #ICML2025 (first time). Excited to meet new friends and old friends and chat about foundational data research and co-design with training (MetaCLIP), SelfCite arxiv.org/abs/2502.09604 with Yung-Sung Chuang, and LongVU arxiv.org/abs/2410.17434 with Xiaoqian Shen.

Hu Xu (@hu_hsu)

Thanks for the invited talk; happy to share our industrial insights on "scaling data alignment" from MetaCLIP (its wide adoption and what's next) at the DataWorld workshop at #ICML2025. Happy to chat offline about data research.

Zeyuan Allen-Zhu, Sc.D. (@zeyuanallenzhu)

Phase 1 of Physics of Language Models code release
✅ Our Part 3.1 + 4.1 = all you need to pretrain a strong 8B base model in 42k GPU-hours
✅ Canon layers = strong, scalable gains
✅ Real open source (data/train/weights)
✅ Apache 2.0 license (commercial OK!)
🔗 github.com/facebookresear…

Hu Xu (@hu_hsu)

Great to see an intern project grow into a major project that is making a broad impact. Thanks for the hard work along the way.

Hu Xu (@hu_hsu)

As LLM research transitions into large-scale production and intense competition, momentum in areas less directly related to LLMs (like CLIP) has slowed and lost focus. We hope these fields can endure and prove essential for long-term scientific progress.

Hu Xu (@hu_hsu)

Genie 3 by Google DeepMind looks impressive. Extending Sora/Veo-style text-to-video generation with multi-round 'camera prompts' is an exciting direction. I believe the action space in world models goes far beyond human interaction through camera prompts—it should encompass much

Tim Rocktäschel (@_rockt)

"We don't just passively perceive the world; we actively generate it. The real world drives our perceptions, but the brain is always making best guesses. Perception is a controlled hallucination, constrained by sensory signals from the outside world" — Anil Seth

Hu Xu (@hu_hsu)

Truly appreciate the authors of Molmo (from Ai2 and the University of Washington) for promoting open research and adopting MetaCLIP. There are many forms of openness today, such as open APIs, open weights, and open source for reproducibility. I view MetaCLIP and Molmo's research
