Junyao Shi (@junyaoshi) 's Twitter Profile
Junyao Shi

@junyaoshi

CS PhD @Penn | Robot Learning
junyaoshi.github.io

ID: 827505440913244160

Joined: 03-02-2017 13:14:06

48 Tweets

247 Followers

623 Following

Peter Yichen Chen (@peterchencyc) 's Twitter Profile Photo

People say there’s a huge gap between simulation and reality—especially when you run the simulation for a long time. That’s generally true… but we’re excited to share that we’ve taken a solid step toward closing that gap. We can now accurately simulate a robot folding boxes in…

jpan (@jenpan_) 's Twitter Profile Photo

Robots need memory to handle complex, multi-step tasks. Can we design an effective method for this? We propose MemER, a hierarchical VLA policy that learns what visual frames to remember across multiple long-horizon tasks, enabling memory-aware manipulation. (1/5)

Mateo Guaman Castro (@mateoguaman) 's Twitter Profile Photo

How can we create a single navigation policy that works for different robots in diverse environments AND can reach navigation goals with high precision? Happy to share our new paper, "VAMOS: A Hierarchical Vision-Language-Action Model for Capability-Modulated and Steerable…"

Guangqi Jiang (@luccachiang) 's Twitter Profile Photo

Ever want to enjoy all the privileged information in sim while seamlessly transferring to the real world? How can we correct policy mistakes after deployment? 👉Introducing GSWorld, a real2sim2real photo-realistic simulator with interaction physics and fully open-sourced code.

Wenli Xiao (@_wenlixiao) 's Twitter Profile Photo

What if robots could improve themselves by learning from their own failures in the real world? Introducing 𝗣𝗟𝗗 (𝗣𝗿𝗼𝗯𝗲, 𝗟𝗲𝗮𝗿𝗻, 𝗗𝗶𝘀𝘁𝗶𝗹𝗹) — a recipe that enables Vision-Language-Action (VLA) models to self-improve for high-precision manipulation tasks. PLD…

Ken Goldberg (@ken_goldberg) 's Twitter Profile Photo

Looking fwd to speaking and discussions on “Good Old Fashioned Engineering Can Close the 100,000 Year 'Data Gap' in Robotics,” this Tues 3:30pm Penn, then Columbia University on Wed, Boston Dynamics on Thurs, and Massachusetts Institute of Technology (MIT) on Friday: cis.upenn.edu/events/

Generalist (@generalistai_) 's Twitter Profile Photo

Introducing GEN-0, our latest 10B+ foundation model for robots ⏱️ built on Harmonic Reasoning, new architecture that can think & act seamlessly 📈 strong scaling laws: more pretraining & model size = better 🌍 unprecedented corpus of 270,000+ hrs of dexterous data Read more 👇
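The "strong scaling laws" claim refers to the empirical pattern where loss falls predictably as a power law in model size and data. As a hedged illustration with synthetic numbers (not Generalist's actual measurements), a power law L(N) = a · N^(-b) is linear in log-log space, so two measured points suffice to fit it and extrapolate:

```python
import math

# Hypothetical (model_size, validation_loss) pairs -- illustrative only,
# generated here from L(N) = 2.5 * N^-0.07; not real GEN-0 data.
points = [(1e8, 2.5 * 1e8 ** -0.07), (1e10, 2.5 * 1e10 ** -0.07)]

# For L(N) = a * N^-b, log L is linear in log N, so two points
# determine the exponent b and the prefactor a.
(n1, l1), (n2, l2) = points
b = -(math.log(l2) - math.log(l1)) / (math.log(n2) - math.log(n1))
a = l1 * n1 ** b
print(f"b={b:.3f}, a={a:.2f}")

# A *predictive* scaling law lets you estimate loss at scales
# you haven't trained yet, e.g. a 3x larger model:
loss_30b = a * (3e10) ** -b
```

In practice one fits many (size, loss) points with least squares rather than two, but the extrapolation step is exactly why Ted Xiao's tweet below calls predictive scaling laws a potential watershed for robotics.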

Ted Xiao (@xiao_ted) 's Twitter Profile Photo

Compelling advances in scaling laws for robotics from Generalist! Scaling laws are without a doubt one of the key components that enabled the rapid hyperscaling of language model pre-training over the past years. Establishing predictive scaling laws would be a watershed moment…

Unitree (@unitreerobotics) 's Twitter Profile Photo

Embodied Avatar: Full-body Teleoperation Platform 🥳 Everyone has fantasized about having an embodied avatar! The full-body teleoperation and full-body data-acquisition platform is waiting for you to try it out!

Kaifeng Zhang (@kaiwynd) 's Twitter Profile Photo

🧵 Evaluating robot policies in the real world is slow, expensive, and hard to scale. During my internship at SceniX AI this summer, we had many discussions around two key questions: how accurate must a simulator be for evaluation to be meaningful, and how do we get there?

Eddy Xu (@eddybuild) 's Twitter Profile Photo

today, we’re open sourcing the largest egocentric dataset in history.
- 10,000 hours
- 2,153 factory workers
- 1,080,000,000 frames
the era of data scaling in robotics is here. (thread)
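The quoted frame count is consistent with the stated hours at a standard 30 fps capture rate (an assumption; the tweet does not state the frame rate):

```python
# Sanity check on the announced dataset figures.
hours = 10_000
fps = 30  # assumed capture rate; not stated in the announcement

frames = hours * 3600 * fps  # 3600 seconds per hour
print(f"{frames:,}")  # 1,080,000,000 -- matches the quoted figure
```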

Junyao Shi (@junyaoshi) 's Twitter Profile Photo

Sometimes I wonder: isn’t spatial intelligence a byproduct of learning to interact with the world, rather than achieved by generating the world? When I play tennis, I’m not generating dynamically consistent Gaussian splats of the moving ball at 50hz, I learn some implicit…