Enze Xie (@xieenze_jr) 's Twitter Profile
Enze Xie

@xieenze_jr

Senior Research Scientist at NVIDIA, working on GenAI. CS PhD from HKU MMLab; previously interned at NVIDIA.

ID: 1723702194380427264

Website: https://xieenze.github.io/ · Joined: 12-11-2023 14:00:10

49 Tweets

769 Followers

116 Following

Sayak Paul (@risingsayak) 's Twitter Profile Photo


The best few-step sampling model across the speed-memory frontier? 😱

Introducing SANA-Sprint in collaboration with the great SANA team!

Beyond the results, perhaps more importantly, the work is about the recipe of SANA-Sprint. Code & model will be open ❤️

Let's go ⬇️
Enze Xie (@xieenze_jr) 's Twitter Profile Photo

👀 The SANA-1.5 4.8B checkpoint is released! 🎉🥳 It is much better than SANA-1.0 1.6B. We also release more training code, e.g. FSDP support, a webdataset loader, and a multi-scale image sampler. Start training with one click, and feel free to try! github.com/NVlabs/Sana?ta… We will release Sprint soon 😎

Enze Xie (@xieenze_jr) 's Twitter Profile Photo


🚀 SANA 1.5 Update: Inference Scaling Now Open-Source! 🎉

📈 Breakthrough on GenEval benchmark:
• SANA 1.5 + Inference Scaling: 0.81 → 0.96 (!!) 🎯
• SD 1.5 + Inference Scaling: 0.42 → 0.87 ⬆️

💫 The secret sauce:
1. Generate n candidates 🎨
2. Pick top k with NVILA
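
The two-step recipe above is a classic best-of-n selection loop. The sketch below is illustrative only: `generate` and `score` are stand-ins (the tweet uses SANA 1.5 as the generator and NVILA as the judge; neither API is shown here), and the toy values exist just to make the flow runnable.

```python
# Hedged sketch of best-of-n inference scaling: generate n candidates,
# score each with a judge model, keep the top k. All names here are
# illustrative stand-ins, not the actual SANA/NVILA API.
import heapq
from typing import Callable, List, Tuple

def best_of_n(generate: Callable[[int], object],
              score: Callable[[object], float],
              n: int, k: int) -> List[Tuple[float, object]]:
    """Generate n candidates from n seeds, return the k highest-scoring."""
    candidates = [generate(seed) for seed in range(n)]
    scored = [(score(c), c) for c in candidates]
    # nlargest returns results in descending score order
    return heapq.nlargest(k, scored, key=lambda t: t[0])

# Toy demo: "images" are ints, the "judge" prefers larger values.
top = best_of_n(generate=lambda seed: (seed * 37) % 100,
                score=lambda img: float(img),
                n=16, k=2)
```

The benchmark gain reported above comes entirely from this selection step: spending more inference compute (n samples) and filtering with a strong judge, with no change to the generator's weights.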
Enze Xie (@xieenze_jr) 's Twitter Profile Photo

It is interesting to see the idea of NoPE (no Position Encoding) used in Llama-4! By the way, our work SANA was also inspired by Meta's original NoPE paper and is the first work to remove positional encoding in diffusion transformers (DiT) 😊

Enze Xie (@xieenze_jr) 's Twitter Profile Photo

I'll be attending ICLR from April 24th to 28th in Singapore—hope to see you there! I'd be delighted to grab coffee and chat about topics such as #GenerativeAI and #EfficientAI

Enze Xie (@xieenze_jr) 's Twitter Profile Photo

I gave an oral presentation on our image generation foundation model, SANA, at ICLR 2025 in Singapore. Feel free to check it out! We are also developing the SANA Video model, so stay tuned! ICLR oral replay: youtube.com/watch?v=rrKFyY…

Sayak Paul (@risingsayak) 's Twitter Profile Photo


I know you have always secretly craved a cool distillation script that actually gets results. That time has come 🤯

In collaboration w/ Junsong_Chen & Shuchen Xue, we present a Diffusers-compatible training script for SANA Sprint 🏃

Links ⬇️
Enze Xie (@xieenze_jr) 's Twitter Profile Photo


🚀 Fast-dLLM: 27.6× Faster Diffusion LLMs with KV Cache & Parallel Decoding 💥

Key Features 🌟
- Block-Wise KV Cache
  Reuses 90%+ of attention activations via bidirectional caching (prefix/suffix), enabling 8.1×–27.6× throughput gains with <2% accuracy loss 🔄
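
To make the caching idea concrete, here is a minimal toy simulation of block-wise decoding, under my own assumptions rather than the actual Fast-dLLM code: tokens are generated block by block, and the key/value states of the finalized prefix are computed once per block and reused across every denoising step of that block, instead of being recomputed at each step.

```python
# Toy sketch of a block-wise KV cache for diffusion-LLM decoding.
# compute_kv is a stand-in for the transformer's K/V projection; the
# real Fast-dLLM mechanism (bidirectional prefix/suffix caching) is
# more involved than this illustration.
from typing import Dict, List, Tuple

def compute_kv(tokens: List[int]) -> List[int]:
    # Stand-in for the expensive key/value computation over the prefix.
    return [t * 2 for t in tokens]

def decode_blockwise(prompt: List[int], num_blocks: int,
                     block_size: int, steps_per_block: int
                     ) -> Tuple[List[int], int]:
    seq = list(prompt)
    cache: Dict[int, List[int]] = {}  # prefix length -> cached prefix KV
    kv_calls = 0                      # how often prefix KV is (re)computed
    for _ in range(num_blocks):
        prefix_len = len(seq)
        if prefix_len not in cache:   # compute prefix KV once per block...
            cache[prefix_len] = compute_kv(seq)
            kv_calls += 1
        for _ in range(steps_per_block):
            prefix_kv = cache[prefix_len]  # ...and reuse it every step
            assert prefix_kv is cache[prefix_len]
        seq.extend(range(block_size))  # finalize the block (toy tokens)
    return seq, kv_calls

seq, kv_calls = decode_blockwise(prompt=[1, 2, 3], num_blocks=4,
                                 block_size=8, steps_per_block=16)
```

In this toy run the prefix KV is computed 4 times (once per block) rather than 64 times (once per denoising step), which is the kind of recomputation the cache eliminates; the reported 8.1×–27.6× figures come from the paper's full system, not from this sketch.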
Enze Xie (@xieenze_jr) 's Twitter Profile Photo

🚀The code for Fast-dLLM is now open-source! 💥 Fast-dLLM achieves a 27.6× end-to-end speedup on 1024-token sequences with less than 2% accuracy drop. Check out the code here: github.com/NVlabs/Fast-dL…

Enze Xie (@xieenze_jr) 's Twitter Profile Photo

I will be attending CVPR from June 11-15. Welcome to meet me for coffee! 👀 I will also share our team's research on Efficient Image Generation (including SANA, DC-AE, VILA-U, HART) in two workshops on Jun 12. coop-intelligence.github.io