Jeff Li (@jiefengli_jeff)'s Twitter Profile
Jeff Li

@jiefengli_jeff

Research Scientist at @NVIDIA | PhD from SJTU @sjtu1896 | Interested in 3D Computer Vision, Human Digitization | Views are my own

ID: 3035103901

Link: https://jeffli.site · Joined: 21-02-2015 18:29:23

88 Tweets

1.1K Followers

689 Following

Wenlong Huang (@wenlong_huang)'s Twitter Profile Photo

How to harness foundation models for *generalization in the wild* in robot manipulation? Introducing VoxPoser: use LLM+VLM to label affordances and constraints directly in 3D perceptual space for zero-shot robot manipulation in the real world! 🌐 voxposer.github.io 🧵👇
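
A minimal sketch of the value-map idea as I read the tweet: LLM/VLM outputs mark affordance and constraint regions in a voxel grid, and a planner simply follows the resulting field. The grid size, the scoring, and the greedy hill-climbing planner below are hypothetical stand-ins, not the VoxPoser implementation.

```python
import numpy as np

GRID = 32  # voxels per side of the toy workspace

def build_value_map(affordance_vox, constraint_vox, grid=GRID, sigma=3.0, penalty=10.0):
    """Reward rises toward affordance voxels; constraint voxels add a local penalty bump."""
    idx = np.stack(np.meshgrid(*[np.arange(grid)] * 3, indexing="ij"), axis=-1).astype(np.float32)
    dist_a = np.min(
        [np.linalg.norm(idx - np.asarray(a, np.float32), axis=-1) for a in affordance_vox], axis=0
    )
    value = -dist_a  # closer to an affordance voxel = higher value
    for c in constraint_vox:
        d2 = np.sum((idx - np.asarray(c, np.float32)) ** 2, axis=-1)
        value -= penalty * np.exp(-d2 / (2 * sigma**2))
    return value

def greedy_waypoints(value, start, steps=40):
    """Hill-climb one voxel at a time toward higher value (toy planner)."""
    pos, path = np.array(start), [tuple(start)]
    for _ in range(steps):
        best, best_val = pos, value[tuple(pos)]
        for d in np.ndindex(3, 3, 3):
            cand = np.clip(pos + np.array(d) - 1, 0, value.shape[0] - 1)
            if value[tuple(cand)] > best_val:
                best, best_val = cand, value[tuple(cand)]
        if np.all(best == pos):
            break
        pos = best
        path.append(tuple(int(v) for v in pos))
    return path

# Hypothetical query: the VLM localizes a mug handle (affordance), the LLM says to
# stay away from a laptop (constraint); the planner only follows the value map.
vmap = build_value_map(affordance_vox=[(20, 20, 5)], constraint_vox=[(10, 10, 5)])
print(greedy_waypoints(vmap, start=(0, 0, 5)))
```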

Michael Black (@michael_j_black)'s Twitter Profile Photo

French bread, red wine, cheese, the Eiffel Tower. And now #ICCV2023 is coming to Paris. All that's missing is you! Come make #ICCV2023 the most diverse ever. If you need financial support to attend, check out the DEI page and apply by July 20! iccv2023.thecvf.com/diversity.equi…

Leonard Bruns (@leonard_bruns)'s Twitter Profile Photo

Tracking any point in a video is a fundamental problem in computer vision. The recent @DeepMind paper TAPIR by Carl Doersch et al. significantly improved over prior state-of-the-art. I visualized the main components of their approach using Rerun.
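
A toy two-stage sketch loosely inspired by the matching-then-refinement structure the visualization covers: per-frame feature matching initializes a coarse track, then a refinement pass cleans it up. TAPIR's learned refinement is replaced here by a simple moving-average smoother, and the feature shapes and query interface are assumptions.

```python
import numpy as np

def init_track(query_feat, frame_feats):
    """Stage 1: dot-product cost map per frame + argmax gives a coarse (y, x) track."""
    T, C, H, W = frame_feats.shape
    track = np.zeros((T, 2), dtype=np.float32)
    for t in range(T):
        cost = np.einsum("c,chw->hw", query_feat, frame_feats[t])
        track[t] = np.unravel_index(np.argmax(cost), (H, W))
    return track

def refine_track(track, window=3):
    """Stage 2 (stand-in): temporally smooth the coarse track."""
    kernel = np.ones(window) / window
    return np.stack([np.convolve(track[:, i], kernel, mode="same") for i in range(2)], axis=1)

# Random stand-in features: 16 frames of 64-dim features on a 32x32 grid; the query is
# "track the point that sits at (10, 12) in frame 0".
rng = np.random.default_rng(0)
frame_feats = rng.normal(size=(16, 64, 32, 32)).astype(np.float32)
query_feat = frame_feats[0, :, 10, 12]
coarse = init_track(query_feat, frame_feats)
print(refine_track(coarse)[:4])
```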

Yuliang Xiu (@yuliangxiu)'s Twitter Profile Photo

Thanks AK for sharing our new work TeCH. Reconstruction is a form of Conditional Generation, especially in one-shot and few-shot settings. Reconstruct the visible like an architect, imagine the invisible like a painter. Project: huangyangyi.github.io/tech/

Jim Fan (@drjimfan)'s Twitter Profile Photo

A neural network can smell like humans do for the first time!👃🏽 Digital smell is a modality that the AI community has long ignored, but it may one day be useful for a robot chef 👩🏽‍🍳. Here's how to do smell2text: 1. Collect 5,000 molecules and ask humans to label "creamy, chocolate, …

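For readers who want the task shape behind "smell2text", here is a toy multi-label setup that maps a molecular featurization to odor descriptors. The character-count "fingerprint", the two SMILES strings, and the labels are all made up for illustration; the actual work trains a far more capable model on the human-labeled molecules mentioned above.

```python
import numpy as np

DESCRIPTORS = ["creamy", "chocolate", "fruity", "floral"]

def featurize(smiles, dim=32):
    """Toy featurization: counts of characters folded into a fixed-size vector."""
    v = np.zeros(dim)
    for ch in smiles:
        v[ord(ch) % dim] += 1.0
    return v

def train_multilabel(X, Y, lr=0.1, steps=500):
    """One-layer multi-label logistic regression (one sigmoid per descriptor)."""
    W = np.zeros((X.shape[1], Y.shape[1]))
    for _ in range(steps):
        P = 1.0 / (1.0 + np.exp(-X @ W))
        W -= lr * X.T @ (P - Y) / len(X)
    return W

# Stand-in "dataset": two molecules with hypothetical human odor labels.
smiles = ["CC(=O)OCC", "CCOC(=O)C1=CC=CC=C1"]
X = np.stack([featurize(s) for s in smiles])
Y = np.array([[1, 0, 1, 0], [0, 0, 1, 1]], dtype=float)
W = train_multilabel(X, Y)
probs = 1.0 / (1.0 + np.exp(-X @ W))
print({d: round(float(p), 2) for d, p in zip(DESCRIPTORS, probs[0])})
```
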
Hao-Shu Fang (@haoshu_fang)'s Twitter Profile Photo

🤖Joint-level control + portability = robot data in the wild! We present AirExo, low-cost hardware, and showcase how in-the-wild data enhances robot learning, even in contact-rich tasks. A promising tool for large-scale robot learning & TeleOP, now at airexo.github.io!

Jiawei Yang (@jiaweiyang118)'s Twitter Profile Photo

Have you ever noticed artifacts in ViT feature maps? Wondered why they appear and how to address them? Check out our latest work! See more demos at jiawei-yang.github.io/DenoisingViT/ Paper: arxiv.org/abs/2401.02957 Code: github.com/Jiawei-Yang/De…

Yue Wang (@yuewang314)'s Twitter Profile Photo

Ever wonder why well-trained Vision Transformers still exhibit noisy artifacts? We introduce Denoising Vision Transformers (DVT), led by the amazing Jiawei Yang, Katie Luo, and Jeff Li, with long-term collaborators Yonglong Tian and Kilian Weinberger. Website: jiawei-yang.github.io/DenoisingViT/ Code:

Jeff Li (@jiefengli_jeff)'s Twitter Profile Photo

Wonder why there are artifacts in Vision Transformers and how to address them? Check out our latest work! Website: jiawei-yang.github.io/DenoisingViT/ Code: github.com/Jiawei-Yang/De… Paper: arxiv.org/abs/2401.02957
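
A crude way to picture the kind of decomposition DVT targets: treat the artifact as an input-independent, position-dependent bias in the ViT feature map and subtract an estimate of it. Estimating that bias as a per-position mean over many images is my simplification for illustration, not the method in the paper, and the shapes and fake artifact below are made up.

```python
import numpy as np

def estimate_positional_bias(feature_maps):
    """feature_maps: (N, H, W, C) ViT patch features from N different images."""
    return feature_maps.mean(axis=0)  # (H, W, C) bias shared across all images

def denoise(feature_map, bias):
    """Subtract the position-dependent component from one image's features."""
    return feature_map - bias

# Stand-in data: 128 images, a 16x16 patch grid, 384-dim features, plus a fake
# fixed positional artifact so the effect is visible.
rng = np.random.default_rng(0)
artifact = rng.normal(size=(16, 16, 384))
feats = rng.normal(size=(128, 16, 16, 384)) + artifact
bias = estimate_positional_bias(feats)
clean = feats[0] - artifact
rel_err = np.linalg.norm(denoise(feats[0], bias) - clean) / np.linalg.norm(clean)
print(f"relative error after removing the estimated bias: {rel_err:.3f}")
```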

Davis Rempe (@davrempe)'s Twitter Profile Photo

Check out our recent work led by Mathis Petrovich that generates human motions from a timeline of text prompts, similar to a typical video editor. The method operates entirely at test time, so it works with off-the-shelf motion diffusion models! Project: mathis.petrovich.fr/stmc/
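
To make the "timeline of text prompts" concrete, here is a toy composition loop: each prompt owns a frame interval, segments are produced independently (by a dummy generator here) and cross-faded where intervals overlap. Per the tweet, the actual method works at test time with off-the-shelf motion diffusion models; only the timeline bookkeeping is sketched, with made-up shapes.

```python
import numpy as np

def dummy_motion(prompt, length, dim=6):
    """Stand-in for a text-to-motion model: (length, dim) pose features per prompt."""
    rng = np.random.default_rng(sum(map(ord, prompt)))
    return rng.normal(size=(length, dim)).cumsum(axis=0)

def compose_timeline(timeline, total_len, dim=6, fade=10):
    """timeline: list of (prompt, start_frame, end_frame); returns (total_len, dim)."""
    motion = np.zeros((total_len, dim))
    weight = np.zeros((total_len, 1))
    for prompt, start, end in timeline:
        n = end - start
        seg = dummy_motion(prompt, n, dim)
        ramp = np.minimum(np.arange(n), np.arange(n)[::-1]) + 1
        w = np.clip(ramp / fade, 0, 1)[:, None]   # fade in/out at interval boundaries
        motion[start:end] += w * seg
        weight[start:end] += w
    return motion / np.maximum(weight, 1e-6)

timeline = [("walk forward", 0, 60), ("wave right hand", 40, 90), ("sit down", 80, 120)]
print(compose_timeline(timeline, total_len=120).shape)  # (120, 6)
```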

Matthias Niessner (@mattniessner)'s Twitter Profile Photo

Visiting Shenzhen this week and honored to give a keynote at China3DV! Looking forward to an exciting technical program: csig3dv.net If you are around and want to catch up or chat about research, just reach out to me :)

Jiawei Yang (@jiaweiyang118)'s Twitter Profile Photo

Very excited to get this out: “DVT: Denoising Vision Transformers”. We've identified and combated those annoying positional patterns in many ViTs. Our approach denoises them, achieving SOTA results and stunning visualizations! Learn more on our website: jiawei-yang.github.io/DenoisingViT/

Tsung-Yi Lin (@tsungyilincv)'s Twitter Profile Photo

NVIDIA Graduate Fellowship Program (2025-2026) is now open for applications. Awards are up to $60,000, along with mentorship and technical support. The deadline is September 13th, so make sure to apply! research.nvidia.com/graduate-fello…

Pavlo Molchanov (@pavlomolchanov)'s Twitter Profile Photo

🚀 Our team is hiring! Join us to advance efficiency in deep learning at NVIDIA! 🚀 🔗 Apply here: bit.ly/nvdler-job Our team, Deep Learning Efficiency Research (nv-dler.github.io) at NVIDIA Research, is about a year old, and we are expanding. We're looking for

Dmytro Mishkin 🇺🇦 (@ducha_aiki)'s Twitter Profile Photo

BLADE: Single-view Body Mesh Learning through Accurate Depth Estimation. Shengze Wang (@mct1224), Jiefeng Li, Tianye Li (@_TianyeLi), Ye Yuan, Henry Fuchs, Koki Nagano (@luminohope), Shalini De Mello (@shalinidemello), Michael Stengel (@virtualitaet). tl;dr: camera intrinsics matter for human mesh estimation and can be optimized via rendering. arxiv.org/abs/2412.08640

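A quick worked example of why the tl;dr holds: under a pinhole camera the image-plane height of a person is h_img = f * H / Z, so the same observed height is explained by many (focal length, depth) pairs, and a wrong assumed focal length directly biases the recovered depth and translation. The numbers below are arbitrary.

```python
# Toy numbers showing the focal-length / depth coupling in perspective projection.
H = 1.7          # person height in meters
f_true = 1200.0  # true focal length in pixels
Z_true = 4.0     # true distance in meters
h_img = f_true * H / Z_true          # observed image height in pixels (510 px)

f_assumed = 800.0                    # a commonly assumed default focal length
Z_estimated = f_assumed * H / h_img  # depth consistent with the wrong intrinsics
print(h_img, Z_estimated)            # 510.0 px, ~2.67 m instead of 4.0 m
```
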
Songyou Peng (@songyoupeng)'s Twitter Profile Photo

Dreaming of very accurate metric depth in stunning 4K resolution at speed? Check out our Prompt Depth Anything! We "prompt" Depth Anything with sparse lidar cues, enabling a wide range of applications! 🔗 Project page with code and cool visualizations: promptda.github.io
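
Per the tweet, the method feeds sparse lidar cues into the depth model as a "prompt". The sketch below shows only the simplest flavor of using sparse metric cues: a least-squares scale-and-shift alignment of a relative depth map to a handful of lidar samples. This is a stand-in baseline, not the actual architecture, and all data below is synthetic.

```python
import numpy as np

def align_to_lidar(rel_depth, lidar_uv, lidar_z):
    """Fit metric_depth ≈ s * rel_depth + t from sparse metric lidar samples."""
    r = rel_depth[lidar_uv[:, 1], lidar_uv[:, 0]]     # relative depth at lidar pixels
    A = np.stack([r, np.ones_like(r)], axis=1)        # [r, 1] design matrix
    (s, t), *_ = np.linalg.lstsq(A, lidar_z, rcond=None)
    return s * rel_depth + t

# Stand-in data: a synthetic metric depth map, an affine-warped "relative" version,
# and 50 noisy metric lidar samples.
rng = np.random.default_rng(0)
metric = 2.0 + 3.0 * rng.random((480, 640))
rel = (metric - 1.5) / 2.0
uv = np.stack([rng.integers(0, 640, 50), rng.integers(0, 480, 50)], axis=1)
z = metric[uv[:, 1], uv[:, 0]] + 0.01 * rng.normal(size=50)
pred = align_to_lidar(rel, uv, z)
print(np.abs(pred - metric).mean())  # small mean absolute error
```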

Dimitris Tzionas (@dimtzionas)'s Twitter Profile Photo

📢 I am #hiring 2x #PhD candidates to work on Human-centric #3D #ComputerVision at the University of #Amsterdam! 📢 The positions are funded by an #ERC #StartingGrant. For details and to submit your application, please see: werkenbij.uva.nl/en/vacancies/p… 🆘 Deadline: Feb 16 🆘

Jeff Li (@jiefengli_jeff)'s Twitter Profile Photo

📣📣📣 Excited to share GENMO: A Generalist Model for Human Motion. Words can't perfectly describe human motion, so we built GENMO. It's everything-to-motion: 🔥 Video, Text, Music, Audio, Keyframes, Spatial Control… 🔥 GENMO handles it all within a single model. 📹 Two
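
Since the tweet is cut off, here is only an interface-level guess at what "many conditioning signals, one model" can look like: each available modality is embedded into a shared token space and missing modalities are simply dropped, so a single conditional motion generator sees one unified token sequence. The encoders, shapes, and fusion below are placeholders, not GENMO's design.

```python
import numpy as np

EMB = 64  # shared embedding width

def embed(name, signal):
    """Placeholder per-modality encoder: project whatever we got to (n_tokens, EMB)."""
    flat = np.atleast_2d(np.asarray(signal, dtype=np.float32))
    rng = np.random.default_rng(sum(map(ord, name)))   # fixed random projection per modality
    proj = rng.normal(size=(flat.shape[1], EMB)) / np.sqrt(flat.shape[1])
    return flat @ proj

def fuse_conditions(conditions):
    """conditions: dict of modality name -> raw signal (any subset may be present)."""
    tokens = [embed(name, sig) for name, sig in conditions.items() if sig is not None]
    return np.concatenate(tokens, axis=0) if tokens else np.zeros((0, EMB), dtype=np.float32)

# Example: condition on a text embedding and two keyframes, with no music/audio/video.
cond = fuse_conditions({
    "text": np.ones((1, 512)),        # e.g. a sentence embedding
    "keyframes": np.zeros((2, 69)),   # e.g. two pose keyframes
    "music": None,
    "video": None,
})
print(cond.shape)  # (3, 64) -> fed to a conditional motion generator
```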