Joel Jang (@jang_yoel)'s Twitter Profile
Joel Jang

@jang_yoel

Research Scientist @nvidiaai GEAR Lab. CS PhD student @uwcse.

ID: 1370193064887681029

Link: https://joeljang.github.io/ · Joined: 12-03-2021 02:01:04

324 Tweets

1.1K Followers

366 Following

Soroush Nasiriany (@snasiriany)'s Twitter Profile Photo

It’s not a matter of if, it’s a matter of when: video models and world models are going to be a central tool for building robot foundation models.

The Humanoid Hub (@thehumanoidhub)'s Twitter Profile Photo

NVIDIA has published a paper on DREAMGEN – a powerful 4-step pipeline for generating synthetic data for humanoids that enables task and environment generalization.
- Step 1: Fine-tune a video generation model using a small number of human teleoperation videos
- Step 2: Prompt …
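
The tweet is cut off after step 2, but the GR00T Dreams post further down this feed names the remaining steps (extract actions with an IDM, then train a visuomotor policy), so the full loop looks roughly like the sketch below. Every function here is a hypothetical placeholder, not the released implementation.

```python
def dreamgen_pipeline(teleop_videos, video_model, idm, prompts, train_policy):
    """Toy outline of the four-step DreamGen-style pipeline described above."""
    # Step 1: fine-tune the video generation model on a small teleop set
    video_model.finetune(teleop_videos)
    # Step 2: prompt it to generate novel robot videos ("dreams")
    dreams = [video_model.generate(p) for p in prompts]
    # Step 3: recover pseudo-action labels with an inverse dynamics model (IDM)
    trajectories = [(dream, idm.extract_actions(dream)) for dream in dreams]
    # Step 4: train a visuomotor policy on the synthetic trajectories
    return train_policy(trajectories)
```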

Brett Adcock (@adcock_brett)'s Twitter Profile Photo

Nvidia also announced DreamGen, a new engine that scales robot learning with digital dreams. It produces large volumes of photorealistic robot videos (using video models) paired with motor action labels, and unlocks generalization to new environments.

Ruijie Zheng (@ruijie_zheng12)'s Twitter Profile Photo

Representation also matters for VLA models! Introducing FLARE: Robot Learning with Implicit World Modeling. With a future latent alignment objective, FLARE significantly improves policy performance on multitask imitation learning and unlocks learning from egocentric human videos.

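FLARE's exact recipe is in the paper, but the gist of a future latent alignment objective can be sketched in a few lines: the policy predicts latents of future observations, and an auxiliary loss pulls them toward the embeddings a frozen encoder assigns to the real future frames. The API below (a policy returning both actions and predicted future latents) is an assumption for illustration, not FLARE's actual code.

```python
import torch
import torch.nn.functional as F

def future_latent_alignment_loss(policy, target_encoder, obs, actions,
                                 future_obs, lam=0.2):
    # Hypothetical policy API: returns predicted actions plus predicted
    # latents for the future observations.
    pred_actions, pred_future_latents = policy(obs)
    with torch.no_grad():
        # Frozen target encoder embeds the real future frames.
        target_latents = target_encoder(future_obs)
    bc = F.mse_loss(pred_actions, actions)  # standard imitation term
    # Alignment term: pull predicted future latents toward the real ones.
    align = 1.0 - F.cosine_similarity(
        pred_future_latents, target_latents, dim=-1).mean()
    return bc + lam * align
```
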
Joel Jang (@jang_yoel)'s Twitter Profile Photo

Giving a talk about GR00T N1, GR00T N1.5, and GR00T Dreams at NVIDIA GTC Paris, 06.11, 2PM–2:45PM CEST. If you are at Vivatech in Paris, please stop by the "An Introduction to Humanoid Robotics" session!

Yiyang Zhou (@aiyiyangz)'s Twitter Profile Photo

🔥 ReAgent-V Released! 🔥

A unified video framework with reflection and reward-driven optimization.

✨ Real-time self-correction.
✨ Triple-view reflection.
✨ Auto-selects high-reward samples for training.
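
Read as a loop, the recipe above is roughly: predict, score with a reward model, reflect on low-reward outputs, and keep only high-reward samples for later training. A toy sketch, with `model`, `reward_fn`, and `reflect` as hypothetical stand-ins for the released components:

```python
def collect_training_samples(model, reward_fn, reflect, videos, threshold=0.8):
    """Reward-driven sample selection with one self-correction pass."""
    kept = []
    for video in videos:
        answer = model(video)                       # initial prediction
        reward = reward_fn(video, answer)
        if reward < threshold:                      # reflect and retry once
            answer = reflect(model, video, answer)
            reward = reward_fn(video, answer)
        if reward >= threshold:                     # auto-select high-reward samples
            kept.append((video, answer, reward))
    return kept
```
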
Chris Paxton (@chris_j_paxton)'s Twitter Profile Photo

Assuming that we need ~2 trillion tokens to get to a robot GPT, how can we get there? I went through a few scenarios, looking at how we can combine simulation data and human video data, and at the size of existing robot fleets.

Some assumptions:
- We probably need some real …
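
The thread is cut off here, but the style of back-of-envelope it describes is easy to reproduce. Every number below is a placeholder assumption, not a figure from the original thread:

```python
TARGET_TOKENS = 2e12      # ~2 trillion tokens for a "robot GPT"
TOKENS_PER_HOUR = 1e6     # assumed tokens per robot-hour of collected data
ROBOTS = 10_000           # assumed fleet size
HOURS_PER_DAY = 8         # assumed daily operating time per robot

hours_needed = TARGET_TOKENS / TOKENS_PER_HOUR    # 2e6 robot-hours
days = hours_needed / (ROBOTS * HOURS_PER_DAY)    # 25 days for this fleet
print(f"{hours_needed:.1e} robot-hours ≈ {days:.0f} days of fleet time")
# Gaps at smaller fleet sizes are why the thread mixes in simulation
# and human video data.
```
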
Qinsheng Zhang (@qsh_zh)'s Twitter Profile Photo

🚀 Introducing Cosmos-Predict2! Our most powerful open video foundation model for Physical AI. Cosmos-Predict2 significantly improves upon Predict1 in visual quality, prompt alignment, and motion dynamics—outperforming popular open-source video foundation models. It’s openly …

youliang (@youliangtan)'s Twitter Profile Photo

How do we improve VLA generalization? 🤔 Last week we upgraded #NVIDIA GR00T N1.5 with minor VLM tweaks, FLARE, and richer data mixtures (DreamGen, etc.) ✨. N1.5 yields better language following — post-trained on unseen Unitree G1 with 1K trajectories, it follows commands on …

Joel Jang (@jang_yoel)'s Twitter Profile Photo

🚀 GR00T Dreams code is live! NVIDIA GEAR Lab's open-source solution for robotics data via video world models. Fine-tune on any robot, generate 'dreams', extract actions with IDM, and train visuomotor policies with LeRobot datasets (GR00T N1.5, SmolVLA). github.com/NVIDIA/GR00T-D…
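
The IDM step is the glue between dreams and policies: a small model that maps consecutive video frames to the action connecting them, so generated videos can be turned into (observation, action) training pairs. A toy version for illustration only, not the GR00T Dreams implementation:

```python
import torch
import torch.nn as nn

class InverseDynamicsModel(nn.Module):
    """Predicts the action linking two consecutive frames (toy sketch)."""

    def __init__(self, action_dim, feat_dim=256):
        super().__init__()
        self.encoder = nn.Sequential(            # tiny per-frame encoder
            nn.Conv2d(3, 32, 8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Sequential(
            nn.Linear(64 * 2, feat_dim), nn.ReLU(),
            nn.Linear(feat_dim, action_dim),
        )

    def forward(self, frame_t, frame_t1):
        z = torch.cat([self.encoder(frame_t), self.encoder(frame_t1)], dim=-1)
        return self.head(z)

def label_dream(idm, frames):
    """Pseudo-label a generated video; frames has shape (T, 3, H, W)."""
    return torch.stack([
        idm(frames[t:t + 1], frames[t + 1:t + 2]).squeeze(0)
        for t in range(len(frames) - 1)
    ])
```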

Joel Jang (@jang_yoel)'s Twitter Profile Photo

Check out Cosmos-Predict2, a new SOTA video world model trained specifically for Physical AI (powering GR00T Dreams & DreamGen)!

AgiBot World (@agibotworld)'s Twitter Profile Photo

Compete for a $560,000 Prize Pool at IROS 2025 AgiBot World Challenge! 💰
The AgiBot World Challenge – Manipulation Track is LIVE! Hosted by @AgiBot and @OpenDriveLab at #IROS2025.
🚀 Challenge: Tackle 10 complex Sim2Real manipulation tasks.
🛠️ Resources: Access a unique …
Jim Fan (@drjimfan)'s Twitter Profile Photo

I've been a bit quiet on X recently. The past year has been a transformational experience. Grok-4 and Kimi K2 are awesome, but the world of robotics is a wondrous wild west. It feels like NLP in 2018 when GPT-1 was published, along with BERT and a thousand other flowers that …

The Humanoid Hub (@thehumanoidhub)'s Twitter Profile Photo

A humanoid robot policy trained solely on synthetic data generated by a world model. Research Scientist Joel Jang presents NVIDIA's DreamGen pipeline:
⦿ Post-train the world model Cosmos-Predict2 with a small set of real teleoperation demos.
⦿ Prompt the world model to …

Jim Fan (@drjimfan)'s Twitter Profile Photo

World modeling for robotics is incredibly hard because (1) control of humanoid robots & 5-finger hands is wayyy harder than ⬆️⬅️⬇️➡️ in games (Genie 3); and (2) object interaction is much more diverse than FSD, which needs to *avoid* coming into contact. Our GR00T Dreams work was …