Kaiqing Zhang (@kaiqingzhang)'s Twitter Profile
Kaiqing Zhang

@kaiqingzhang

Assistant Prof. @UofMaryland; Prev. {@MIT, @SimonsInstitute, @ECEILLINOIS, @Tsinghua_Uni}; Interested in Control, Game Theory, Machine Learning, and Robotics

ID: 1259001201141710848

Link: https://kzhang66.github.io/ · Joined: 09-05-2020 06:04:24

159 Tweets

1.1K Followers

298 Following

Kaiqing Zhang (@kaiqingzhang)

Max has been a fantastic collaborator and colleague, and has inspired me to think a lot more about bridging practice and relevant theory. More importantly, his enthusiasm and strong work ethic have made him an amazing mentor for young students! I am sure it will be a blast working with him!

Mingyang Liu (@liumy2010)

We propose Unified Fine-Tuning (UFT), a novel post-training framework that unifies Supervised Fine-Tuning (SFT) and Reinforcement Fine-Tuning (RFT), and outperforms both. 📈 Superior performance across model sizes and tasks. 📚 Theory-backed: Achieves exponential improvement in

Danfei Xu (@danfei_xu)

Russ's recent talk at Stanford has to be my favorite in the past couple of years. I have asked everyone in my lab to watch it. youtube.com/watch?v=TN1M6v… IMO our community has accrued a huge amount of "research debt" (analogous to "technical debt") through flashy demos and

Russ Tedrake (@russtedrake)

TRI's latest Large Behavior Model (LBM) paper landed on arxiv last night! Check out our project website: toyotaresearchinstitute.github.io/lbm1/ One of our main goals for this paper was to put out a very careful and thorough study on the topic to help people understand the state of the

Kaiqing Zhang (@kaiqingzhang)

Super fun work on post-"co"-training multiple LLM agents using RL! Check Chanwoo Park's thread below 🔽. Just accepted to the ACL Main Conference; check it out if you're interested!