Enze Xie (@xieenze_jr) 's Twitter Profile
Enze Xie

@xieenze_jr

Senior Research Scientist at NVIDIA, working on GenAI. CS PhD from HKU MMLab; previously interned at NVIDIA.

ID: 1723702194380427264

Website: https://xieenze.github.io/ · Joined: 12-11-2023 14:00:10

49 Tweets

769 Followers

116 Following

Sayak Paul (@risingsayak) 's Twitter Profile Photo


The best few-step sampling model across the speed-memory frontier? 😱

Introducing SANA-Sprint in collaboration with the great SANA team!

Beyond the results, perhaps more importantly, the work is about the recipe of SANA-Sprint. Code & model will be open ❤️

Let's go ⬇️
Enze Xie (@xieenze_jr) 's Twitter Profile Photo

👀 The SANA-1.5 4.8B checkpoint is released! 🎉🥳 It is much better than SANA-1.0 1.6B. We also release more training code, e.g. FSDP support, a webdataset loader, and a multi-scale image sampler. Start training with one click, and feel free to try! github.com/NVlabs/Sana?ta… We will release Sprint soon 😎

Enze Xie (@xieenze_jr) 's Twitter Profile Photo


🚀 SANA 1.5 Update: Inference Scaling Now Open-Source! 🎉

📈 Breakthrough on GenEval benchmark:
• SANA 1.5 + Inference Scaling: 0.81 → 0.96 (!!) 🎯
• SD 1.5 + Inference Scaling: 0.42 → 0.87 ⬆️

💫 The secret sauce:
1. Generate n candidates 🎨
2. Pick top k with NVILA
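
The two-step recipe above is a classic best-of-n selection loop. The sketch below is illustrative only: `generate` and `score` are stand-ins (the tweet uses SANA 1.5 as the generator and NVILA as the judge; neither API is shown here), and the toy values exist just to make the flow runnable.

```python
# Hedged sketch of best-of-n inference scaling: generate n candidates,
# score each with a judge model, keep the top k. All names here are
# illustrative stand-ins, not the actual SANA/NVILA API.
import heapq
from typing import Callable, List, Tuple

def best_of_n(generate: Callable[[int], object],
              score: Callable[[object], float],
              n: int, k: int) -> List[Tuple[float, object]]:
    """Generate n candidates from n seeds, return the k highest-scoring."""
    candidates = [generate(seed) for seed in range(n)]
    scored = [(score(c), c) for c in candidates]
    # nlargest returns results in descending score order
    return heapq.nlargest(k, scored, key=lambda t: t[0])

# Toy demo: "images" are ints, the "judge" prefers larger values.
top = best_of_n(generate=lambda seed: (seed * 37) % 100,
                score=lambda img: float(img),
                n=16, k=2)
```

The benchmark gain reported above comes entirely from this selection step: spending more inference compute (n samples) and filtering with a strong judge, with no change to the generator's weights.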
Enze Xie (@xieenze_jr) 's Twitter Profile Photo

It is interesting to see the idea of NoPE (no Position Encoding) used in Llama-4! By the way, our work SANA was also inspired by Meta's original NoPE paper and is the first work to remove positional encoding in diffusion transformers (DiT) 😊

Enze Xie (@xieenze_jr) 's Twitter Profile Photo

I'll be attending ICLR from April 24th to 28th in Singapore—hope to see you there! I'd be delighted to grab coffee and chat about topics such as #GenerativeAI and #EfficientAI

Enze Xie (@xieenze_jr) 's Twitter Profile Photo

I gave an oral presentation on our image generation foundation model, SANA, at ICLR 2025 in Singapore. Feel free to check it out! We are also developing the SANA Video model, so stay tuned! ICLR oral replay: youtube.com/watch?v=rrKFyY…

Sayak Paul (@risingsayak) 's Twitter Profile Photo


I know you have always secretly craved a cool distillation script that actually gets results. That time has come 🤯

In collaboration w/ Junsong_Chen & Shuchen Xue, we present a Diffusers-compatible training script for SANA Sprint 🏃

Links ⬇️
Enze Xie (@xieenze_jr) 's Twitter Profile Photo


🚀 Fast-dLLM: 27.6× Faster Diffusion LLMs with KV Cache & Parallel Decoding 💥

Key Features 🌟
- Block-Wise KV Cache
  Reuses 90%+ of attention activations via bidirectional caching (prefix/suffix), enabling 8.1×–27.6× throughput gains with <2% accuracy loss 🔄
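
To make the caching idea concrete, here is a minimal toy simulation of block-wise decoding, under my own assumptions rather than the actual Fast-dLLM code: tokens are generated block by block, and the key/value states of the finalized prefix are computed once per block and reused across every denoising step of that block, instead of being recomputed at each step.

```python
# Toy sketch of a block-wise KV cache for diffusion-LLM decoding.
# compute_kv is a stand-in for the transformer's K/V projection; the
# real Fast-dLLM mechanism (bidirectional prefix/suffix caching) is
# more involved than this illustration.
from typing import Dict, List, Tuple

def compute_kv(tokens: List[int]) -> List[int]:
    # Stand-in for the expensive key/value computation over the prefix.
    return [t * 2 for t in tokens]

def decode_blockwise(prompt: List[int], num_blocks: int,
                     block_size: int, steps_per_block: int
                     ) -> Tuple[List[int], int]:
    seq = list(prompt)
    cache: Dict[int, List[int]] = {}  # prefix length -> cached prefix KV
    kv_calls = 0                      # how often prefix KV is (re)computed
    for _ in range(num_blocks):
        prefix_len = len(seq)
        if prefix_len not in cache:   # compute prefix KV once per block...
            cache[prefix_len] = compute_kv(seq)
            kv_calls += 1
        for _ in range(steps_per_block):
            prefix_kv = cache[prefix_len]  # ...and reuse it every step
            assert prefix_kv is cache[prefix_len]
        seq.extend(range(block_size))  # finalize the block (toy tokens)
    return seq, kv_calls

seq, kv_calls = decode_blockwise(prompt=[1, 2, 3], num_blocks=4,
                                 block_size=8, steps_per_block=16)
```

In this toy run the prefix KV is computed 4 times (once per block) rather than 64 times (once per denoising step), which is the kind of recomputation the cache eliminates; the reported 8.1×–27.6× figures come from the paper's full system, not from this sketch.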
Enze Xie (@xieenze_jr) 's Twitter Profile Photo

🚀The code for Fast-dLLM is now open-source! 💥 Fast-dLLM achieves a 27.6× end-to-end speedup on 1024-token sequences with less than 2% accuracy drop. Check out the code here: github.com/NVlabs/Fast-dL…

Enze Xie (@xieenze_jr) 's Twitter Profile Photo

I will be attending CVPR from June 11-15. Welcome to meet me for coffee! 👀 I will also share our team's research on Efficient Image Generation (including SANA, DC-AE, VILA-U, HART) in two workshops on Jun 12. coop-intelligence.github.io