Paul Zhou (@zhiyuan_zhou_)'s Twitter Profile
Paul Zhou

@zhiyuan_zhou_

CS Ph.D. student at UC Berkeley

ID: 1712920268845522944

Website: http://zhouzypaul.github.io · Joined: 13-10-2023 19:56:38

80 Tweets

345 Followers

221 Following

Arthur Allshire (@arthurallshire) 's Twitter Profile Photo

our new system trains humanoid robots using data from cell phone videos, enabling skills such as climbing stairs and sitting on chairs in a single policy (w/ Hongsuk Benjamin Choi, Junyi Zhang, David McAllister)

RoboPapers (@robopapers) 's Twitter Profile Photo

Full episode dropping soon! Geeking out with Paul Zhou on AutoEval: Autonomous Evaluation of Generalist Robot Manipulation Policies in the Real World (auto-eval.github.io). Co-hosted by Chris Paxton & Michael Cho - Rbt/Acc

Paul Zhou (@zhiyuan_zhou_) 's Twitter Profile Photo

This was fun, thanks for having me Chris Paxton & Michael Cho - Rbt/Acc! See the podcast for a livestream of the robot in real time and me evaluating a policy live! Or check it out for yourself at auto-eval.github.io and eval your policy in the real world without breaking a sweat

Paul Zhou (@zhiyuan_zhou_) 's Twitter Profile Photo

Yes! Let's build a network of distributed eval stations together 🦾 With our open-sourced framework it now takes only 3-5 hours to set up a new AutoEval station! We have released a detailed step-by-step guide.

Chris Paxton (@chris_j_paxton) 's Twitter Profile Photo

One of the biggest challenges in robot learning is that we don't have a solution for comparable, reproducible evaluation of different methods. Enter AutoEval, which allows you to (1) test methods on known problems via a web API, and (2) get all the tools you need to set up
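
To make (1) concrete, here is a minimal sketch of what querying a remote eval station over HTTP might look like. The endpoint paths, payload fields, and the idea of pointing the station at a self-hosted policy server are illustrative assumptions for this sketch, not the actual AutoEval interface (see auto-eval.github.io for that).

```python
# Illustrative sketch only -- endpoint names and payload fields are assumptions,
# not the real AutoEval API. The pattern: submit an eval job that points the
# station at a policy server you host, then poll until the rollouts finish.
import time
import requests

STATION_URL = "http://example-eval-station.org"    # hypothetical station address
POLICY_URL = "http://my-lab-server.org:8000/act"   # hypothetical policy endpoint

def submit_and_wait(task: str, poll_s: int = 30) -> dict:
    # Ask the station to run `task` against the policy served at POLICY_URL.
    job = requests.post(
        f"{STATION_URL}/jobs",
        json={"task": task, "policy_url": POLICY_URL},
    ).json()
    # Poll until the station reports the rollout batch has finished.
    while True:
        status = requests.get(f"{STATION_URL}/jobs/{job['id']}").json()
        if status["state"] == "done":
            return status["results"]   # e.g. success rate over the rollout batch
        time.sleep(poll_s)

if __name__ == "__main__":
    print(submit_and_wait("open the drawer"))
```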

Michael Cho - Rbt/Acc (@micoolcho) 's Twitter Profile Photo

In my view, real-world eval is an even bigger bottleneck than the lack of data in robotics. We need more attempts like AutoEval that think creatively about how to scale evals/benchmarking. Great fun chatting with Paul Zhou & Chris Paxton

Ville 🤖 (@villekuosmanen) 's Twitter Profile Photo

This is a very cool project and I can see why access to safe, autonomous, and robust API-first eval cells around the world could be a useful commercial product!

Yifei Zhou (@yifeizhou02) 's Twitter Profile Photo

With previous research in multimodal models and agents, I believe the only truly useful multimodal agent before 2027 will be multimodal co-creation in structured formats. Sharing my first blogpost, because I don't quite see this point of view around, but it can be quite impactful for society.

Kevin Frans (@kvfrans) 's Twitter Profile Photo

Over the past year, I've been compiling some "alchemist's notes" on deep learning. Right now it covers basic optimization, architectures, and generative models.

Focus is on learnability -- each page has nice graphics and an end-to-end implementation.

notes.kvfrans.com
Kevin Frans (@kvfrans) 's Twitter Profile Photo

Stare at policy improvement and diffusion guidance, and you may notice a suspicious similarity...

We lay out an equivalence between the two, formalizing a simple technique (CFGRL) that improves performance across the board when training diffusion policies.

arxiv.org/abs/2505.23458
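
To make the connection concrete, here is a minimal sketch (illustrative names, not the paper's code) of classifier-free guidance applied at a diffusion policy's denoising step: the unconditional and conditional noise predictions are combined with a guidance weight w, and pushing w above zero is the policy-improvement direction the equivalence formalizes.

```python
def guided_noise_prediction(model, noisy_action, obs, improve_token, t, w=1.5):
    """Classifier-free guidance for a diffusion policy (illustrative sketch).

    `model` predicts the noise added to an action sample; `improve_token` is the
    conditioning signal (e.g. an 'optimal' indicator), dropped for the
    unconditional pass. w = 0 recovers plain behavior cloning; w > 0 steers
    samples toward actions the conditional model prefers.
    """
    eps_cond = model(noisy_action, obs, improve_token, t)   # conditional pass
    eps_uncond = model(noisy_action, obs, None, t)          # unconditional pass
    return eps_uncond + w * (eps_cond - eps_uncond)
```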
Seohong Park (@seohong_park) 's Twitter Profile Photo

We found a way to do RL *only* with BC policies.

The idea is simple:

1. Train a BC policy π(a|s)
2. Train a conditional BC policy π(a|s, z)
3. Amplify(!) the difference between π(a|s, z) and π(a|s) using CFG

Here, z can be anything (e.g., goals for goal-conditioned RL).

🧵↓
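
A minimal sketch of what step 3's amplification can look like for policies with explicit (log-)probabilities; the function and variable names are illustrative, not the paper's code. The amplified distribution is log π_w(a|s,z) ∝ log π(a|s) + w·(log π(a|s,z) − log π(a|s)), so w = 1 recovers the conditional BC policy and larger w extrapolates further toward the behavior implied by z.

```python
import torch

def amplified_logits(logits_uncond: torch.Tensor,
                     logits_cond: torch.Tensor,
                     w: float = 3.0) -> torch.Tensor:
    # CFG-style amplification in log space (illustrative sketch):
    #   log pi_w(a|s,z) ∝ log pi(a|s) + w * (log pi(a|s,z) - log pi(a|s))
    # Logits differ from log-probs only by a per-state constant, which the
    # softmax inside Categorical normalizes away, so logits can be combined directly.
    return logits_uncond + w * (logits_cond - logits_uncond)

# Hypothetical usage with two trained BC heads over a discrete action set:
#   logits_uncond = bc_policy(state)               # pi(a|s)
#   logits_cond = conditional_bc_policy(state, z)  # pi(a|s,z), z e.g. a goal
#   dist = torch.distributions.Categorical(
#       logits=amplified_logits(logits_uncond, logits_cond))
#   action = dist.sample()
```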
Paul Zhou (@zhiyuan_zhou_) 's Twitter Profile Photo

Traveling to #RSS2025 tomorrow and looking forward to catching up with old friends and meeting new ones! I'll be presenting AutoEval at the Robot Evaluation Workshop on Wednesday, and I'm honored to receive the oral and the workshop award! Oral: Wed 11AM · Poster: Wed 3:30-4:30