Kaiqing Zhang (@kaiqingzhang) Twitter Tweets • TwiCopy

Kaiqing Zhang

@kaiqingzhang


Assistant Prof. @UofMaryland; Prev. {@MIT, @SimonsInstitute, @ECEILLINOIS, @Tsinghua_Uni}; Interested in Control, Game Theory, Machine Learning, and Robotics

ID: 1259001201141710848

Link: https://kzhang66.github.io/
Joined: 09-05-2020 06:04:24

159 Tweets

1.1K Followers

298 Following

Kaiqing Zhang

@kaiqingzhang

a year ago

For those interested, you may find my tutorial slides (that keep updating) at kzhang66.github.io/slides_Kaiqing…

8 likes, 0 replies, 2 retweets

Kaiqing Zhang

@kaiqingzhang

a year ago

Check out Soheil Feizi’s new RELAI agents! Looks super cool

5 likes, 1 reply, 2 retweets

Max has been a fantastic collaborator and colleague, inspired me to think a lot more about bridging practice and relevant theory. More importantly, his enthusiasm and strong work ethic made him an amazing mentor for young students! I am sure it will be a blast working with him!

8 likes, 0 replies, 0 retweets

Kaiqing Zhang

@kaiqingzhang

10 months ago

A bit of good news 😉 Happy New Year everyone!

277 likes, 29 replies, 1 retweet

Mingyang Liu

@liumy2010

5 months ago

We propose Unified Fine-Tuning (UFT) — a novel post-training framework that unifies Supervised Fine-Tuning (SFT) and Reinforcement Fine-Tuning (RFT), and outperforms both. 📈 Superior performance across model sizes and tasks. 📚 Theory-backed: Achieves exponential improvement in

16 likes, 7 replies, 3 retweets

Danfei Xu

@danfei_xu

4 months ago

Russ's recent talk at Stanford has to be my favorite in the past couple of years. I have asked everyone in my lab to watch it. youtube.com/watch?v=TN1M6v… IMO our community has accrued a huge amount of "research debt" (analogous to "technical debt") through flashy demos and

252 likes, 3 replies, 33 retweets

Russ Tedrake

@russtedrake

3 months ago

TRI's latest Large Behavior Model (LBM) paper landed on arxiv last night! Check out our project website: toyotaresearchinstitute.github.io/lbm1/ One of our main goals for this paper was to put out a very careful and thorough study on the topic to help people understand the state of the

334 likes, 2 replies, 75 retweets

Kaiqing Zhang

@kaiqingzhang

3 months ago

Super fun work on post-"co"-training multiple LLM agents using RL! Check Chanwoo Park's thread below 🔽. Just accepted as ACL Main Conference lately, check it out if u are interested!

4 likes, 0 replies, 0 retweets

Kaiqing Zhang

@kaiqingzhang

2 months ago

Thanks Chi Jin! Looking forward as well :)!

3 likes, 0 replies, 0 retweets