Zhecheng Yuan (@fancy_yzc)'s Twitter Profile
Zhecheng Yuan

@fancy_yzc

PhD @Tsinghua University, IIIS. Interested in reinforcement learning, representation learning, robotics.

ID: 1419249161816317954

Link: https://gemcollector.github.io/
Joined: 25-07-2021 10:52:31

98 Tweets

431 Followers

475 Following

Tianming Wei (@still_wtm)

In our weekly meetings, our advisor Huazhe kept asking: "What's the real point of simulation for manipulation?" HERMES offers a potential answer: our framework converts diverse human motion data into real robot behaviors via sim training, all without task-specific reward design.
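
The thread does not spell out the unified reward, so here is a hedged sketch of one common task-agnostic alternative to per-task reward engineering: reward the policy for tracking a retargeted human reference trajectory. Every name and weight below is illustrative, not HERMES's actual design.

```python
import numpy as np

def tracking_reward(robot_qpos, ref_qpos, obj_pos, ref_obj_pos,
                    w_joint=0.5, w_obj=0.5, alpha=5.0):
    """Illustrative unified reward: instead of hand-designing a reward
    per task, score how closely the robot and the object track a
    retargeted human reference. Weights are made up for this sketch."""
    joint_err = np.linalg.norm(robot_qpos - ref_qpos)
    obj_err = np.linalg.norm(obj_pos - ref_obj_pos)
    # Exponentiated errors keep each term bounded in (0, 1].
    return w_joint * np.exp(-alpha * joint_err) + w_obj * np.exp(-alpha * obj_err)

# Example: perfectly tracking the reference yields the maximum reward of 1.0.
r = tracking_reward(np.zeros(24), np.zeros(24), np.ones(3), np.ones(3))
```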

Kun Lei (@kunlei15)

Very impressive! Focused on sim2real mobile dexterous manipulation, it leverages human data as guidance for reinforcement learning.

Jason Ma (@jasonma2020)

We just did the world's first on-stage autonomous demo of a long-horizon dexterous VLA 🚨 No training. No setup. Performance out of the box. Live demos are hard and unpredictable, but we felt great about our model's generalization, and it went pretty well! 💯 Zero-shot. 100% success.

Zhecheng Yuan (@fancy_yzc)

Nice work! Our recent project HERMES also leverages a similar pipeline for bimanual dexterous manipulation — using a unified reward and egocentric depth-based sim2real to deploy in diverse in-the-wild scenarios for a variety of tasks. gemcollector.github.io/HERMES/
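
The egocentric depth-based sim2real detail is worth unpacking: depth tends to transfer better than RGB, especially when simulated depth is corrupted during training to look like real sensor output. Below is a minimal sketch of that style of augmentation, with made-up parameter values rather than the paper's.

```python
import numpy as np

def corrupt_depth(depth, rng, noise_std=0.01, dropout_p=0.02, max_range=3.0):
    """Make a rendered egocentric depth image look sensor-like:
    Gaussian noise, missing-pixel holes, and range clipping.
    All values are illustrative."""
    noisy = depth + rng.normal(0.0, noise_std, depth.shape)  # sensor noise
    holes = rng.random(depth.shape) < dropout_p              # stereo dropout
    noisy[holes] = 0.0
    return np.clip(noisy, 0.0, max_range)

rng = np.random.default_rng(0)
sim_depth = np.full((120, 160), 1.5)       # stand-in rendered depth, meters
real_like = corrupt_depth(sim_depth, rng)  # what the policy trains on
```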

Zhen Wu (@zhenkirito123)

Our grand finale: A complex, long-horizon dynamic sequence, all driven by a proprioceptive-only policy (no vision/LIDAR)! In this task, the robot carries a chair to a platform, uses it as a step to climb up, then leaps off and performs a parkour-style roll to absorb the landing.

Qianzhong Chen (@qianzhongchen)

🚀 Introducing SARM: Stage-Aware Reward Modeling for Long-Horizon Robot Manipulation. Robots struggle with tasks like folding a crumpled T-shirt: long, contact-rich, and hard to label. We propose a scalable reward-modeling framework to fix that. 1/n
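
The tweet only names the idea, so here is one hedged reading of stage-aware reward modeling: a model classifies which stage of the long-horizon task the robot is in and estimates within-stage progress, and the dense reward is completed stages plus fractional progress. The module below is a hypothetical sketch, not SARM's actual architecture.

```python
import torch
import torch.nn as nn

class StageAwareReward(nn.Module):
    """Hypothetical sketch: predict a stage distribution and a progress
    scalar from an observation embedding; reward = expected stage + progress."""
    def __init__(self, obs_dim=512, n_stages=4):
        super().__init__()
        self.stage_head = nn.Linear(obs_dim, n_stages)
        self.progress_head = nn.Sequential(nn.Linear(obs_dim, 1), nn.Sigmoid())
        self.n_stages = n_stages

    def forward(self, obs_emb):
        stage_probs = self.stage_head(obs_emb).softmax(dim=-1)
        stages = torch.arange(self.n_stages, dtype=obs_emb.dtype)
        expected_stage = (stage_probs * stages).sum(dim=-1)
        progress = self.progress_head(obs_emb).squeeze(-1)
        # Dense reward in [0, 1] that grows monotonically through the stages.
        return (expected_stage + progress) / self.n_stages

rm = StageAwareReward()
reward = rm(torch.randn(8, 512))  # one scalar per frame in a batch
```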

Kevin Zakka (@kevin_zakka)

We open-sourced the full pipeline! Data conversion from MimicKit, training recipe, pretrained checkpoint, and deployment instructions. Train your own spin kick with mjlab: github.com/mujocolab/g1_s…

Saining Xie (@sainingxie)

three years ago, DiT replaced the legacy unet with a transformer-based denoising backbone. we knew the bulky VAEs would be the next to go -- we just waited until we could do it right. today, we introduce Representation Autoencoders (RAE). >> Retire VAEs. Use RAEs. 👇(1/n)
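
The thread's core claim is architectural: drop the VAE and pair a frozen pretrained representation encoder with a trained decoder, so diffusion runs in the representation space. Below is a hedged toy sketch of that structure; the real RAE encoder would be something like DINOv2, and its decoder is far more capable than this stand-in.

```python
import torch
import torch.nn as nn

class RAESketch(nn.Module):
    """Toy sketch of the RAE idea: freeze a pretrained encoder and train
    only a pixel decoder, leaving a fixed representation space for the
    diffusion model. Not the paper's actual architecture."""
    def __init__(self, encoder, latent_dim=768, img_size=64):
        super().__init__()
        self.encoder = encoder.eval()
        for p in self.encoder.parameters():
            p.requires_grad_(False)          # representation stays frozen
        self.decoder = nn.Sequential(        # stand-in pixel decoder
            nn.Linear(latent_dim, 3 * img_size * img_size),
            nn.Unflatten(1, (3, img_size, img_size)),
        )

    def forward(self, images):
        with torch.no_grad():
            z = self.encoder(images)         # (B, latent_dim) features
        return self.decoder(z), z            # reconstruction + latents

# Dummy encoder so the sketch runs offline; swap in a real ViT in practice.
dummy_enc = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 768))
recon, z = RAESketch(dummy_enc)(torch.randn(2, 3, 64, 64))
```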

Zhecheng Yuan (@fancy_yzc)

Imitation learning provides a solid prior, and online learning further refines the policy for better performance. In the lab, I've been watching Kun's policy grow stronger over time. 😆
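
One standard instantiation of "imitation prior + online refinement" is behavior cloning followed by RL fine-tuning from the cloned weights. A minimal hypothetical sketch of the first step, with stand-in data and dimensions:

```python
import torch
import torch.nn as nn

# 1) Imitation prior: behavior-clone a policy on demonstrations.
policy = nn.Sequential(nn.Linear(39, 256), nn.ReLU(), nn.Linear(256, 12))
opt = torch.optim.Adam(policy.parameters(), lr=3e-4)

demo_obs = torch.randn(1024, 39)   # stand-in demonstration observations
demo_act = torch.randn(1024, 12)   # stand-in demonstrated actions

for step in range(100):
    loss = nn.functional.mse_loss(policy(demo_obs), demo_act)
    opt.zero_grad()
    loss.backward()
    opt.step()

# 2) Online refinement: keep training `policy` against environment reward
#    (e.g., PPO initialized from these weights); omitted for brevity.
```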

Galaxea Dynamics (@galaxeadynamics)

Introducing Huazhe Harry Xu's HERMES, a unified human-to-robot learning framework built on the Galaxea mobile base and A1 dual-arm platform. With high-fidelity simulation and dexterous hands, HERMES enables robust sim2real transfer for complex mobile manipulation tasks. #Robotics

Guangqi Jiang (@luccachiang)

Ever want to enjoy all the privileged information in sim while seamlessly transferring to the real world? How can we correct policy mistakes after deployment? 👉 Introducing GSWorld, a real2sim2real photo-realistic simulator with interaction physics and fully open-sourced code.

Yongyuan Liang (@cheryyun_l)

Unified multimodal models can generate text and images, but can they truly reason across modalities? 🎨 Introducing ROVER, the first benchmark that evaluates reciprocal cross-modal reasoning in unified models, the next frontier of omnimodal intelligence. 🌐 Project:

Bingyi Kang (@bingyikang)

After a year of teamwork, we're thrilled to introduce Depth Anything 3 (DA3)! 🚀 Aiming for human-like spatial perception, DA3 extends monocular depth estimation to any-view scenarios, including single images, multi-view images, and video. In pursuit of minimal modeling, DA3
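
DA3's own interface isn't shown in the thread; as a stand-in, here is monocular depth inference with the earlier Depth Anything V2 release through the Hugging Face depth-estimation pipeline (the checkpoint name assumes the published Hub repo):

```python
from PIL import Image
from transformers import pipeline

# Stand-in for DA3: Depth Anything V2 via the HF depth-estimation pipeline.
depth_estimator = pipeline(
    task="depth-estimation",
    model="depth-anything/Depth-Anything-V2-Small-hf",
)
result = depth_estimator(Image.open("scene.jpg"))  # any RGB image
result["depth"].save("scene_depth.png")            # PIL image of predicted depth
```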

Elgce (@benqingwei)

Introducing Gallant: Voxel Grid-based Humanoid Locomotion and Local-navigation across 3D Constrained Terrains 🤖 Project page: gallantloco.github.io arXiv: arxiv.org/abs/2511.14625 Gallant is, to our knowledge, the first system to run a single policy that handles full-space
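
To make "voxel grid-based" concrete: a hedged sketch of turning a terrain point cloud into the kind of egocentric binary occupancy grid a locomotion policy could consume. Shapes and sizes below are invented for illustration.

```python
import numpy as np

def voxelize(points, origin, grid_shape=(32, 32, 16), voxel_size=0.1):
    """Bin a terrain point cloud (N, 3) into a binary occupancy grid
    anchored at `origin`. Grid dimensions here are illustrative."""
    idx = np.floor((points - origin) / voxel_size).astype(int)
    inside = np.all((idx >= 0) & (idx < np.array(grid_shape)), axis=1)
    grid = np.zeros(grid_shape, dtype=np.float32)
    grid[tuple(idx[inside].T)] = 1.0    # mark occupied cells
    return grid

cloud = np.random.uniform(0.0, 3.0, size=(5000, 3))  # fake terrain scan
obs = voxelize(cloud, origin=np.zeros(3))            # (32, 32, 16) policy input
```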

Tony Z. Zhao (@tonyzzhao)

Today, we present a step-change in robotic AI at Sunday. Introducing ACT-1: a frontier robot foundation model trained on zero robot data.
- Ultra long-horizon tasks
- Zero-shot generalization
- Advanced dexterity
🧵->