Tsung-Yi Lin (@tsungyilincv) Twitter Tweets • TwiCopy

Tsung-Yi Lin

8 months ago

Future frames light the path to smarter actions! 🚀🤖 CoT-VLA leverages visual chain-of-thought reasoning to unlock large-scale video data and guide goal-driven robotics. #CVPR2025 #AI #Robotics

Likes: 21 · Replies: 0 · Reposts: 2

Hermann

Excited to be presenting our new work, HMAR: Efficient Hierarchical Masked Auto-Regressive Image Generation, at #CVPR2025 this week. VAR (Visual Autoregressive Modeling) introduced a very nice way to formulate autoregressive image generation as a next-scale prediction task (from

Likes: 49 · Replies: 1 · Reposts: 21

Max Zhaoshuo Li 李赵硕

@mli0603

6 months ago

Cosmos-Reason1 has exciting updates 💡 Now it understands physical reality — judging videos as real or fake! Check out the resources👇 Paper: arxiv.org/abs/2503.15558 Huggingface: huggingface.co/nvidia/Cosmos-… Code: github.com/nvidia-cosmos/… Project page: research.nvidia.com/labs/dir/cosmo… (1/n)

Likes: 100 · Replies: 2 · Reposts: 32

Fangyin Wei

@fangyinwei

6 months ago

Join us at the 1st workshop on Vision Meets Physics: Synergizing Physical Simulation and Computer Vision at #CVPR2025 tomorrow! Thought-provoking talks and expert insights from leading researchers that YOU CANNOT MISS! 📍104A ⏰ 8:45am June 12th visionmeetphysics.github.io

Likes: 17 · Replies: 1 · Reposts: 5

Qinsheng Zhang

@qsh_zh

6 months ago

🚀 Introducing Cosmos-Predict2! Our most powerful open video foundation model for Physical AI. Cosmos-Predict2 significantly improves upon Predict1 in visual quality, prompt alignment, and motion dynamics—outperforming popular open-source video foundation models. It’s openly

Likes: 203 · Replies: 7 · Reposts: 61

Tsung-Yi Lin

@tsungyilincv

6 months ago

The physics meets vision workshop just started! Come join us!

Likes: 31 · Replies: 0 · Reposts: 4

Tsung-Yi Lin

@tsungyilincv

6 months ago

Generating 3D models with parts is a key step toward scalable, interactive simulation environments. Check out our work, PartPacker, and the concurrent project, PartCrafter! PartPacker: github.com/NVlabs/PartPac… PartCrafter: wgsxm.github.io/projects/partc…

Likes: 72 · Replies: 2 · Reposts: 14

Victor M

@victormustar

6 months ago

Nvidia cooked with PartPacker 3D Generation. A new method to create 3D objects from a single image, with each part separate and easy to edit 🔥 ⬇️ Demo available on Hugging Face

Likes: 601 · Replies: 5 · Reposts: 85

Satya Mallick

@learnopencv

6 months ago

NVIDIA’s Cosmos Reason1 is a family of Vision Language Models trained to understand the physical world and make decisions for embodied reasoning. Cosmos Reason1's promise as a contender for video understanding and embodied reasoning is mainly attributed to its dataset

Likes: 8 · Replies: 0 · Reposts: 5

Hanzi Mao

@hanna_mao

5 months ago

We build Cosmos-Predict2 as a world foundation model for Physical AI builders — fully open and adaptable. Post-train it for specialized tasks or different output types. Available in multiple sizes, resolutions, and frame rates. 📷 Watch the repo walkthrough

Likes: 281 · Replies: 8 · Reposts: 70

NVIDIA Robotics

@nvidiarobotics

4 months ago

Facing data bottlenecks in your robotics workflows? Explore how #NVIDIACosmos world foundation models from #NVIDIAResearch can be post-trained for specific #PhysicalAI applications: 🔮 Cosmos Predict to simulate future scenarios. 🎨 Cosmos Transfer to create diverse synthetic

Likes: 49 · Replies: 2 · Reposts: 10

Tsung-Yi Lin

@tsungyilincv

3 months ago

🚀Earlier this year we launched Cosmos-Reason1 — and it just climbed to #1 on the new Physical Reasoning Leaderboard, released alongside V-JEPA 2! 🤗Try it out: huggingface.co/nvidia/Cosmos-…

Likes: 14 · Replies: 1 · Reposts: 2

Tsung-Yi Lin

@tsungyilincv

2 months ago

Training Physical AI agents depends on rich environments. Simulating diverse worlds is key to speeding up progress—excited to see @moonlake pushing this forward!

Likes: 13 · Replies: 2 · Reposts: 1

NVIDIA AI Developer

@nvidiaaidev

a month ago

NVIDIA Cosmos open models made major progress.✨ ✅ Cosmos Predict 2.5 unifies text, image, and video world generation into one model that creates longer and more coherent simulations with improved grounding and efficiency. ✅ Cosmos Transfer 2.5 introduces precise, spatially

Likes: 185 · Replies: 11 · Reposts: 35

Marco Pavone

@drmapavone

a month ago

Excited to unveil NVIDIA's latest work on #Reasoning Vision–Language–Action (#VLA) models — Alpamayo-R1! Alpamayo-R1 is a new #reasoning VLA architecture featuring a diffusion-based action expert built on top of the #Cosmos-#Reason backbone. It represents one of the core

Likes: 235 · Replies: 10 · Reposts: 40

Max Zhaoshuo Li 李赵硕

@mli0603

10 days ago

This is a really smart setup for evaluating forward and inverse world modeling with VLMs💡— congrats on the paper! I also really appreciate the deep dive into Cosmos-Reason1. Lots of insightful details to learn from 📖

Likes: 8 · Replies: 0 · Reposts: 4
