Zangwei Zheng (@zangweizheng)'s Twitter Profile
Zangwei Zheng

@zangweizheng

Ph.D. Candidate @NUSingapore | B.S. @NanjingUnivers1 | Optimizer & Efficient ML | AAAI Distinguished & ACL Outstanding Paper Award Winner

ID: 892590471742017537

Link: http://zhengzangw.github.io | Joined: 02-08-2017 03:38:46

50 Tweets

257 Followers

173 Following

Fuzhao Xue (Frio) (@xuefz)'s Twitter Profile Photo

Super thrilled to announce that I've been awarded the 2023 Google PhD Fellowship! Enormous gratitude to my wonderful mentors/advisors who championed my application: Mostafa Dehghani, Yang You, Aixin Sun 孙爱欣, and to all my incredible collaborators. A heartfelt thanks to Google AI and

HPC-AI Lab (@hpcailab)'s Twitter Profile Photo

📢 Join us for the HPC-AI Lab Public Seminar!

🔗 Registration: forms.gle/4anywqoXtSumHu…
🗓️ Date/Time: 29 Nov. 2023 (Wednesday), 1 PM to 2 PM 
📍 Online via Zoom
Fuzhao Xue (Frio) (@xuefz)'s Twitter Profile Photo

After a few amazing days at #EMNLP2023! Really nice to meet so many new and old friends! I’m going to attend #NeurIPS2023 to present #TokenCrisis paper. DM me! Looking forward to chatting about: - Foundation model scaling - Adaptive Computation - ML system - and anything fun!

Zangwei Zheng (@zangweizheng)'s Twitter Profile Photo

Found a repo collecting optimizers for neural networks. More than 200 optimizers have been proposed, but only a few (Adam, Adafactor, etc.) are still popular today. The field of optimizers is truly survival of the fittest. github.com/zoq/Awesome-Op…
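For context, the Adam update that came out on top of this survival contest can be sketched in a few lines (a minimal NumPy illustration with the common default hyperparameters; not tied to any implementation in the linked repo):

```python
import numpy as np

def adam_step(param, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update step with bias-corrected moment estimates."""
    m = beta1 * m + (1 - beta1) * grad          # first moment (EMA of gradients)
    v = beta2 * v + (1 - beta2) * grad ** 2     # second moment (EMA of squared gradients)
    m_hat = m / (1 - beta1 ** t)                # bias correction (t starts at 1)
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v

# Toy run: minimize f(p) = p^2 starting from p = 1
p, m, v = np.array([1.0]), np.zeros(1), np.zeros(1)
for t in range(1, 1001):
    p, m, v = adam_step(p, 2 * p, m, v, t, lr=0.01)
```

The bias correction terms matter early in training, when the moment EMAs are still warming up from zero; without them the effective step size would be far too small for the first iterations.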

Zangwei Zheng (@zangweizheng)'s Twitter Profile Photo

After a few amazing days at #EMNLP2023! Really nice to meet so many new and old friends! I’m going to attend #NeurIPS2023 to present #SequenceScheduling paper. DM me! Looking forward to chatting about: - Optimizer Design - Efficient Training & Inference for LLM - & anything fun!

Yang You (@yangyou1991)'s Twitter Profile Photo

I am happy to share that our paper has been accepted by ICLR as an ORAL paper (1.2% acceptance rate).

InfoBatch: Lossless Training Speed Up by Unbiased Dynamic Data Pruning

arxiv.org/abs/2303.04947

InfoBatch randomly prunes a portion of less informative samples based on the
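The pruning scheme the tweet describes can be sketched roughly as follows (a simplified illustration, not the authors' code; the below-mean-loss threshold and the 1/(1 - r) loss rescaling are assumptions based on the tweet's description of unbiased pruning):

```python
import numpy as np

def infobatch_prune(losses, prune_ratio=0.5, rng=None):
    """Sketch of unbiased dynamic data pruning: samples with below-mean
    loss ("well learned") are dropped with probability prune_ratio; the
    survivors among them get their loss weight rescaled by
    1 / (1 - prune_ratio), keeping the expected gradient unchanged."""
    rng = np.random.default_rng(0) if rng is None else rng
    losses = np.asarray(losses, dtype=float)
    low = losses < losses.mean()                            # pruning candidates
    drop = low & (rng.random(losses.shape) < prune_ratio)   # random subset dropped
    keep = ~drop
    weights = np.ones_like(losses)
    weights[low & keep] = 1.0 / (1.0 - prune_ratio)         # rescale kept candidates
    return keep, weights
```

The rescaling is what makes the pruning "lossless" in expectation: each low-loss sample contributes with probability 1 - r but at weight 1 / (1 - r), so its expected contribution to the gradient is unchanged.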
Elie Bursztein (@elie)'s Twitter Profile Photo

[Weekend read] InfoBatch: Lossless Training Speed Up by Unbiased Dynamic Data Pruning - arxiv.org/abs/2303.04947 by Zangwei Zheng ,Yang You et al. This new plug-and-play technique that makes model training ~20% faster by pruning non informative examples. #AI #deeplearning

Fuzhao Xue (Frio) (@xuefz)'s Twitter Profile Photo

(1/5)🚀 Our OpenMoE Paper is out! 📄 Including:

🔍ALL Checkpoints
📊 In-depth MoE routing analysis
🤯Learning from mistakes & solutions

Three important findings:
(1) Context-Independent Specialization;
(2) Early Routing Learning;
(3) Drop-towards-the-End.

Paper Link:
AK (@_akhaliq)'s Twitter Profile Photo

Neural Network Diffusion

Diffusion models have achieved remarkable success in image and video generation. In this work, we demonstrate that diffusion models can also generate high-performing neural network parameters. Our approach is simple, utilizing an autoencoder and a
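The "parameters as data" idea behind this tweet can be illustrated with a toy sketch: flatten trained-network weights into vectors, compress them, and model the compressed space generatively. Here a linear (truncated-SVD) autoencoder stands in for the paper's autoencoder, and the diffusion stage is omitted entirely; none of this reproduces the actual method.

```python
import numpy as np

def fit_linear_autoencoder(param_vectors, latent_dim):
    """Fit a linear autoencoder (truncated SVD) to flattened parameter
    vectors; returns encode (params -> latent) and decode (latent -> params)."""
    X = np.asarray(param_vectors, dtype=float)
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    basis = vt[:latent_dim]                   # top principal directions

    def encode(x):
        return (x - mean) @ basis.T           # params -> latent

    def decode(z):
        return z @ basis + mean               # latent -> params

    return encode, decode
```

In the paper's setting, a generative model trained on such latents could then produce new latent codes that decode into new, usable parameter vectors.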

Yang You (@yangyou1991)'s Twitter Profile Photo

Exciting News from Open-Sora! 🚀 They've just made the ENTIRE suite of their video-generation model open source! Dive into the world of cutting-edge AI with access to model weights, comprehensive training source code, and detailed architecture insights. Start building your dream

Victor.Kai Wang (@victorkaiwang1)'s Twitter Profile Photo

🚀 Call for participation! Excited to announce the first Dataset Distillation Challenge at ECCV-2024 in Milan, Italy! Join us for an incredible event. 🌍 Details: dd-challenge.com 🧵(1/3)

Fuzhao Xue (Frio) (@xuefz)'s Twitter Profile Photo

Pretraining will end as data is the fossil fuel of AI. —Ilya Yes, around 20 months ago, we also tried to share that the Token-Crisis was happening (arxiv.org/abs/2305.13230). People didn’t really believe it at the time hahaha. Token-crisis would happen if you have enough compute,

Yang Luo (@yangl_7)'s Twitter Profile Photo

Training-free Video Enhancement: Achieved 🎉 Nice work with Xuanlei Zhao, Wenqi Shaw, Victor.Kai Wang, @VitaGroupUT, Yang You et al. Non-trivial enhancement, training-free, and plug-and-play 🥳 Blog: oahzxl.github.io/Enhance_A_Vide… (🧵1/6)

Victor.Kai Wang (@victorkaiwang1)'s Twitter Profile Photo

Generating ~200 million parameters in just minutes! 🥳 Excited to share our work with Doven Tang, ZHAO WANGBO, and Yang You: 'Recurrent Diffusion for Large-Scale Parameter Generation' (RPG for short). Example: Obtain customized models using prompts (see below). (🧵1/8)

Yang You (@yangyou1991)'s Twitter Profile Photo

🚀 Introducing Open-Sora 2.0 — the open-source SOTA-level video generation model trained for just $200K!

🎬 11B model achieves on-par performance with HunyuanVideo & 30B Step-Video on 📐VBench & 📊Human Preference.
⚡ The training cost is significantly reduced. MovieGen 6144
Victor.Kai Wang (@victorkaiwang1)'s Twitter Profile Photo

Customizing Your LLMs in seconds using prompts🥳! Excited to share our latest work with HPC-AI Lab, VITA Group, Konstantin Schürholt, Yang You, Michael Bronstein, Damian Borth: Drag-and-Drop LLMs (DnD). 2 features: tuning-free, comparable or even better than full-shot tuning. (🧵1/8)