Ming-Yu Liu (@liu_mingyu) Twitter Tweets • TwiCopy

AK

@_akhaliq

8 months ago

Nvidia just dropped Describe Anything on Hugging Face: Detailed Localized Image and Video Captioning

Yin Cui

We released Cosmos-Reason1 code, model, and part of the data! We also updated our paper to include a section about our RL infra: arxiv.org/abs/2503.15558 - Code: github.com/nvidia-cosmos/… - Model and Data: huggingface.co/collections/nv… - Blog: developer.nvidia.com/blog/curating-…

Kaiwen Zheng

@zkwthu

7 months ago

1/💡New paper from NVIDIA & Tsinghua, an ICML 2025 Spotlight! Direct Discriminative Optimization (DDO) enables GAN-style finetuning of diffusion/autoregressive models without extra networks. SOTA achieved on ImageNet-512! Website: research.nvidia.com/labs/dir/ddo/ Code: github.com/NVlabs/DDO

Ming-Yu Liu

@liu_mingyu

7 months ago

Check out our new work on Direct Discriminative Optimization improving GenAI models.

Max Zhaoshuo Li 李赵硕

@mli0603

6 months ago

Cosmos-Reason1 has exciting updates 💡 Now it understands physical reality — judging videos as real or fake! Check out the resources👇 Paper: arxiv.org/abs/2503.15558 Huggingface: huggingface.co/nvidia/Cosmos-… Code: github.com/nvidia-cosmos/… Project page: research.nvidia.com/labs/dir/cosmo… (1/n)

Ming-Yu Liu

@liu_mingyu

6 months ago

We post-trained a reasoning model to reason whether a video is real or generated. It might be very useful as a critic to improve video generators. Take a look. NVIDIA AI

Ming-Yu Liu

@liu_mingyu

6 months ago

For people looking for a diffusion-based video generator to finetune or post-train for their downstream physical AI applications, we just released our latest one. We have 2 models: 2B and 14B. 2B for fast prototyping and 14B for better quality. The license is fully open. Give it

kiui

@ashawkey3

6 months ago

Happy to share our work PartPacker: We enable one-shot image-to-3D generation with any number of parts! Project page: research.nvidia.com/labs/dir/partp… Demo: huggingface.co/spaces/nvidia/… Code: github.com/NVlabs/PartPac…

Tsung-Yi Lin

@tsungyilincv

6 months ago

Generating 3D models with parts is a key step toward scalable, interactive simulation environments. Check out our work, PartPacker, and the concurrent project, PartCrafter! PartPacker: github.com/NVlabs/PartPac… PartCrafter: wgsxm.github.io/projects/partc…

Ming-Yu Liu

@liu_mingyu

6 months ago

3D asset generation has advanced a lot in the past few years. Generating a holistic 3D asset is no longer a challenging problem. What's next for 3D generation? We believe that generating a 3D asset with individual parts defined is the next frontier. With the parts, we can start

Ming-Yu Liu

@liu_mingyu

6 months ago

Check out our latest HF demo on 3D generation with part annotation.

Ming-Yu Liu

@liu_mingyu

6 months ago

Big congrats to Eric Jang and the team on the 1X World Model release. Verification is an important part of producing a production AI model. Given the diverse nature of the work environment, it makes a lot of sense to leverage a world model to help with policy evaluation.

Hanzi Mao

@hanna_mao

5 months ago

We build Cosmos-Predict2 as a world foundation model for Physical AI builders — fully open and adaptable. Post-train it for specialized tasks or different output types. Available in multiple sizes, resolutions, and frame rates. 📷 Watch the repo walkthrough

Sean Kirmani

@seankirmani

5 months ago

🤖🌎 We are organizing a workshop on Robotics World Modeling at Conference on Robot Learning 2025! We have an excellent group of speakers and panelists, and are inviting you to submit your papers with a July 13 deadline. Website: robot-world-modeling.github.io

Daniel Ho

@itsdanielho

5 months ago

We at 1X with Jack Monas are excited to announce the ICCV phase of our 1X World Model Challenge: huggingface.co/spaces/1x-tech… Participate in the Compression and Sampling tracks for an $8k prize pool & train generative models for cool robot results like: 1x.tech/discover/redwo…

Ming-Yu Liu

@liu_mingyu

4 months ago

Together with Aaron Lefohn and Sanja Fidler, we will give a special address at SIGGRAPH. Specifically, I will give an update on our vision and our current work in enabling Physical AI. Please join us. nvidia.com/en-us/events/s…

NVIDIA Omniverse

@nvidiaomniverse

4 months ago

Kick off your #OpenUSD Day with a look into the future of robotics and autonomous vehicles. 🤖 Join Ming-Yu Liu as he shares how #NVIDIACosmos world foundation models unlock prediction and reasoning for the next wave of robotics and autonomous vehicles. 📅Wednesday, 8/13 at

Ming-Yu Liu

@liu_mingyu

4 months ago

In Cosmos, we are hiring Cosmos World Foundation Model builders. If you are interested in building large-scale video foundation models and multimodal LLMs for robots and cars, please send your CV to [email protected] If you have experience in large-scale diffusion models,

Ming-Yu Liu

@liu_mingyu

4 months ago

The submissions portal for the NVIDIA 2026-2027 Graduate Fellowships is now open: research.nvidia.com/graduate-fello… PhD students working on AI, please apply!

Jiahui Huang

@huangjh_hjh

4 months ago

[1/N] 🎥 We've made available a powerful spatial AI tool named ViPE: Video Pose Engine, to recover camera motion, intrinsics, and dense metric depth from casual videos! Running at 3–5 FPS, ViPE handles cinematic shots, dashcams, and even 360° panoramas. 🔗 research.nvidia.com/labs/toronto-a…
