Zhen Liu (@itsthezhen) Twitter Tweets • TwiCopy

Zhen Liu

@itsthezhen

Assistant Prof @ CUHK-SZ. PhD at @Mila_Quebec & @UMontreal, MS & BS at @GeorgiaTech and past visitor at @MPI_IS. Deep Learning / 3D vision / 3D Content Gen.

ID: 4157036115

Link: http://itszhen.com

Joined: 10-11-2015 14:43:10

84 Tweets

329 Followers

301 Following

Dinghuai Zhang 张鼎怀

@zdhnarsil

4 months ago

Your verl & vLLM setup is secretly giving you off-policy data (when using quantized rollout), and you should treat it as an off-policy problem! How? As probabilistic guys, we should say (truncated) importance sampling 😀 Check Feng Yao's tweet here!

38 Likes

0 Replies

5 Retweets
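The (truncated) importance sampling mentioned in the tweet above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the verl or vLLM implementation: the function names `truncated_is_weights` and `weighted_pg_loss`, the cap `clip_max`, and the toy log-probabilities are all hypothetical. The idea is to reweight each sampled token by the ratio of the training policy's probability to the rollout policy's probability, capped from above to limit variance.

```python
import math

def truncated_is_weights(logp_train, logp_rollout, clip_max=2.0):
    """Per-token truncated importance weights: min(pi_train / pi_rollout, clip_max).

    logp_train   : log-probs of the sampled tokens under the training policy
    logp_rollout : log-probs of the same tokens under the (e.g. quantized) rollout policy
    clip_max     : hypothetical upper cap on the ratio, chosen for illustration
    """
    weights = []
    for lp_t, lp_r in zip(logp_train, logp_rollout):
        ratio = math.exp(lp_t - lp_r)        # pi_train / pi_rollout
        weights.append(min(ratio, clip_max))  # truncate from above
    return weights

def weighted_pg_loss(logp_train, logp_rollout, advantages, clip_max=2.0):
    """Policy-gradient surrogate: mean of -w * A * log pi_train.

    The weights w are treated as constants here (no gradient through them).
    """
    w = truncated_is_weights(logp_train, logp_rollout, clip_max)
    terms = [-wi * a * lp for wi, a, lp in zip(w, advantages, logp_train)]
    return sum(terms) / len(terms)

# Toy data (assumed values): three sampled tokens.
logp_train = [math.log(0.5), math.log(0.2), math.log(0.9)]
logp_rollout = [math.log(0.5), math.log(0.05), math.log(0.6)]
advantages = [1.0, -0.5, 0.3]

weights = truncated_is_weights(logp_train, logp_rollout)
loss = weighted_pg_loss(logp_train, logp_rollout, advantages)
```

In this toy case the second token has an untruncated ratio of about 4, so the cap is what keeps a single mismatched token from dominating the update.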

Zhen Liu

@itsthezhen

2 months ago

TL;DR: Meet BesiegeField—a playground where LLMs build, test, and refine machines from standard parts in real time. We tested agentic workflows and RLVR with top LLMs: even the strongest still show limits in compositional machine design. 🔗 besiegefield.github.io 🧵 below

7 Likes

0 Replies

3 Retweets

Weiyang Liu

@besteuler

a month ago

🤯 Merging many finetuned LLMs into one model, effectively? Introducing Functional Dual Anchor (FDA), a new framework for model merging. 🚀 Current merging works poorly due to the underlying parameter conflicts. FDA shifts knowledge integration to the input-representation space

216 Likes

3 Replies

39 Retweets

Weiyang Liu

@besteuler

21 days ago

🤩 This is awesome. When we were doing the agentic design project (besiegefield.github.io) using the Besiege game environment, we had to hack the game to get as much feedback as possible to do RL and stuff. However, I am starting to think differently after seeing the Genshin agent.

36 Likes

0 Replies

9 Retweets