Huayu Chen (@chenhuay17)'s Twitter Profile
Huayu Chen

@chenhuay17

PhD candidate at Tsinghua SAIL

ID: 1847135162636947456

Link: http://chendrag.github.io
Joined: 18-10-2024 04:38:48

7 Tweets

22 Followers

57 Following

Huayu Chen (@chenhuay17):

Introducing our recent work:

Toward Guidance-Free AR Visual Generation via Condition Contrastive Alignment

arxiv.org/abs/2410.09347

CCA vastly improves the FID/IS performance of pretrained autoregressive visual models with only 1 epoch of fine-tuning, and it requires only the pretraining data.
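For a rough sense of what condition-contrastive fine-tuning can look like, here is a minimal sketch assuming a DPO/NCA-style objective that contrasts an image's likelihood under its true condition against a mismatched (in-batch shuffled) condition, relative to the frozen pretrained model. The `model(tokens, condition)` interface, the negative construction, and `beta` are illustrative assumptions, not the exact CCA loss from the paper.

```python
# Minimal sketch only (assumed interfaces): a DPO/NCA-style condition-contrastive
# objective. `model(tokens, condition)` is assumed to return next-token logits of
# an autoregressive image-token model; the exact CCA loss in the paper may differ.
import torch
import torch.nn.functional as F

def sequence_logprob(model, image_tokens, condition):
    """Summed log-probability of image_tokens under the model, given a condition."""
    logits = model(image_tokens[:, :-1], condition)            # (B, T-1, V)
    logp = F.log_softmax(logits, dim=-1)
    targets = image_tokens[:, 1:].unsqueeze(-1)                # next-token targets
    return logp.gather(-1, targets).squeeze(-1).sum(dim=-1)    # (B,)

def cca_style_loss(model, ref_model, image_tokens, condition, beta=0.02):
    # Positive pair: an image with its own condition.
    # Negative pair: the same image with a mismatched (shuffled) condition.
    neg_condition = condition[torch.randperm(condition.size(0))]
    logp_pos = sequence_logprob(model, image_tokens, condition)
    logp_neg = sequence_logprob(model, image_tokens, neg_condition)
    with torch.no_grad():                                      # frozen pretrained reference
        ref_pos = sequence_logprob(ref_model, image_tokens, condition)
        ref_neg = sequence_logprob(ref_model, image_tokens, neg_condition)
    # Prefer matched conditions over mismatched ones, relative to the reference model.
    margin = beta * ((logp_pos - ref_pos) - (logp_neg - ref_neg))
    return -F.logsigmoid(margin).mean()
```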
Huayu Chen (@chenhuay17):

Guidance-Free Training

Add <10 lines of code to achieve guidance-free visual sampling.

Same or even better performance compared with CFG on 5 distinct models across diffusion/AR/masked generation. Supports from-scratch training. Requires no extra GPU memory.

Check out arxiv.org/abs/2501.15420
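To make concrete what guidance-free sampling removes, here is a minimal sketch contrasting a CFG denoising step (two forward passes plus extrapolation) with the plain conditional step a guidance-free model can use at inference. The `model(x_t, t, cond)` denoiser interface is an assumption; the GFT training recipe itself is in the paper and is not reproduced here.

```python
# Minimal sketch (assumed `model(x_t, t, cond)` denoiser interface) contrasting a
# CFG sampling step with the single-pass step a guidance-free model can use.
import torch

@torch.no_grad()
def cfg_step(model, x_t, t, cond, null_cond, guidance_scale=4.0):
    # Classifier-free guidance: two forward passes per step, then extrapolation.
    eps_cond = model(x_t, t, cond)
    eps_uncond = model(x_t, t, null_cond)
    return eps_uncond + guidance_scale * (eps_cond - eps_uncond)

@torch.no_grad()
def guidance_free_step(model, x_t, t, cond):
    # After guidance-free training, a single conditional pass suffices,
    # roughly halving per-step compute and removing the guidance-scale knob.
    return model(x_t, t, cond)
```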
Yin Cui (@yincuicv):

Is self-improvement exclusive to RL?

Can we use supervised learning to match LLMs trained with SOTA RL algorithms?

In Negative-aware Fine-Tuning (NFT), we introduce a purely supervised learning method to enhance LLMs' math reasoning with no external teachers. NFT matches or …
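As a rough illustration of the negative-aware idea, the sketch below splits the model's own sampled solutions by answer correctness and combines a standard SFT likelihood on correct ones with an unlikelihood-style penalty on incorrect ones. This is a crude stand-in, not NFT's actual objective (which is built around an implicit negative policy); the HuggingFace-style `model(input_ids).logits` interface and `neg_weight` are assumptions.

```python
# Crude stand-in only: correct/incorrect self-generated solutions enter a weighted
# likelihood objective. NFT's real objective (built on an implicit negative policy)
# is defined in the paper; the interfaces below follow HuggingFace conventions.
import torch
import torch.nn.functional as F

def response_logprob(model, input_ids, labels):
    """Summed log-prob of response tokens; positions with label -100 are ignored."""
    logits = model(input_ids).logits[:, :-1]                   # (B, T-1, V)
    labels = labels[:, 1:]                                     # shift targets
    logp = F.log_softmax(logits, dim=-1)
    mask = labels.ne(-100).float()
    picked = logp.gather(-1, labels.clamp(min=0).unsqueeze(-1)).squeeze(-1)
    return (picked * mask).sum(dim=-1)                         # (B,)

def negative_aware_loss(model, pos_batch, neg_batch, neg_weight=0.25):
    # Maximum likelihood on self-generated *correct* solutions (plain SFT) ...
    pos_nll = -response_logprob(model, pos_batch["input_ids"], pos_batch["labels"]).mean()
    # ... plus an unlikelihood-style term that lowers the probability of *incorrect*
    # solutions (a simplification of how negative data is exploited).
    neg_logp = response_logprob(model, neg_batch["input_ids"], neg_batch["labels"])
    return pos_nll + neg_weight * neg_logp.mean()
```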
Haoxiang Wang (@haoxiang__wang):

🚀 Excited to share our new LLM math reasoning work! 🔥 Supervised learning (as a replacement for RL) can reach SoTA performance on LLM math reasoning! 📊

Ganqu Cui (@charlesfornlp):

So many works talk about entropy, but what is the **mechanism** of entropy in RL for LLMs? 🤔

Our work gives a principled understanding, as well as two tricks that keep entropy **controlled** 🧵
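For context, the quantity under discussion is the policy's token-level entropy during RL training. A generic way to measure it from the logits (this only pins down the quantity, not the control tricks proposed in the paper):

```python
# Generic definition of per-token policy entropy from LLM logits; this only defines
# the quantity being discussed, not the entropy-control tricks from the paper.
import torch
import torch.nn.functional as F

def mean_token_entropy(logits: torch.Tensor, response_mask: torch.Tensor) -> torch.Tensor:
    """logits: (B, T, V); response_mask: (B, T), 1 on generated (response) tokens."""
    logp = F.log_softmax(logits, dim=-1)
    entropy = -(logp.exp() * logp).sum(dim=-1)        # H_t = -sum_v p_v * log p_v
    mask = response_mask.float()
    return (entropy * mask).sum() / mask.sum()        # average over response tokens
```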
Aran Komatsuzaki (@arankomatsuzaki):

DiffusionNFT: RL for diffusion models via the forward process

• Contrastive fine-tuning: positives vs negatives → implicit policy improvement
• Works with any solver, no CFG, no trajectory storage
• 25× more efficient than FlowGRPO
• Boosts SD3.5-M: GenEval 0.24 → 0.98 in …
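As a simpler stand-in from the same family, the sketch below shows reward-weighted flow matching on the forward (noising) process: the model's own generations are re-trained with a standard flow-matching loss weighted by reward, using no CFG and no stored sampling trajectories. DiffusionNFT's actual loss is a positive-vs-negative contrastive objective with implicit policy improvement; see the paper for that. The model interface, the linear noising path, and the `beta` temperature here are assumptions.

```python
# Simpler stand-in (reward-weighted flow matching on the forward process); the
# model interface, linear noising path, and `beta` temperature are assumptions.
import torch

def flow_matching_loss(model, x0, cond, weights):
    """Per-sample rectified-flow loss on clean images x0 (B, C, H, W), weighted."""
    t = torch.rand(x0.size(0), device=x0.device).view(-1, 1, 1, 1)
    noise = torch.randn_like(x0)
    x_t = (1 - t) * x0 + t * noise                    # forward (noising) interpolation
    target_v = noise - x0                             # velocity target
    pred_v = model(x_t, t.flatten(), cond)
    per_sample = ((pred_v - target_v) ** 2).mean(dim=(1, 2, 3))
    return (weights * per_sample).mean()

def reward_weighted_step(model, images, cond, rewards, beta=1.0):
    # `images` are the model's own generations scored by a reward model; only the
    # final images and their rewards are needed (no sampling trajectories, no CFG).
    adv = (rewards - rewards.mean()) / (rewards.std() + 1e-8)
    weights = torch.exp(adv / beta).clamp(max=10.0)   # up-weight high-reward samples
    return flow_matching_loss(model, images, cond, weights)
```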
Ming-Yu Liu (@liu_mingyu):

Looking for an RL algorithm for improving your diffusion models? DiffusionNFT might be able to help. Check it out.
github.com/NVlabs/Diffusi…

25x more efficient than FlowGRPO and gives you SOTA results on various benchmarks.

with Huayu Chen (@chenhuay17), Kaiwen Zheng (@zkwthu), Qinsheng Zhang (@qsh_zh), and Haoxiang Wang (@Haoxiang__Wang)
Huayu Chen (@chenhuay17):

Check out Haotian Ye @ NeurIPS25's work on regularized video diffusion RL. It is amazing how simple data regularization turns out to be so effective at preventing reward hacking and boosting quality.
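A common way to implement such data regularization in RL fine-tuning is to mix a supervised (pretraining-style) loss on real data into the reward-driven objective, anchoring the policy to the data distribution. The sketch below is a generic illustration of this pattern, not the specific recipe of the cited work; `rl_loss`, `diffusion_loss`, and `reg_weight` are placeholders.

```python
# Generic pattern only: mix a supervised loss on real data into the RL objective.
# `rl_loss` and `diffusion_loss` are placeholder callables, not a specific API.
def regularized_rl_step(model, rl_batch, data_batch, rl_loss, diffusion_loss, reg_weight=0.1):
    loss_rl = rl_loss(model, rl_batch)                # reward-maximizing term
    loss_data = diffusion_loss(model, data_batch)     # anchor to real / pretraining data
    return loss_rl + reg_weight * loss_data
```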