
Jeff Li
@jiefengli_jeff
Research Scientist at @NVIDIA | PhD from SJTU @sjtu1896 | Interested in 3D Computer Vision, Human Digitization | Views are my own
ID: 3035103901
https://jeffli.site 21-02-2015 18:29:23
88 Tweets
1.1K Followers
689 Following

Ever wonder why well-trained Vision Transformers still exhibit noise? We introduce Denoising Vision Transformers (DVT), led by the amazing Jiawei Yang, Katie Luo, and Jeff Li, with long-term collaborators Yonglong Tian and Kilian Weinberger. Website: jiawei-yang.github.io/DenoisingViT/ Code:


Check out our recent work led by Mathis Petrovich that generates human motions from a timeline of text prompts, similar to a typical video editor. The method operates entirely at test time, so it works with off-the-shelf motion diffusion models! Project: mathis.petrovich.fr/stmc/




Very excited to get this out: "DVT: Denoising Vision Transformers". We've identified and combated those annoying positional patterns in many ViTs. Our approach denoises them, achieving SOTA results and stunning visualizations! Learn more on our website: jiawei-yang.github.io/DenoisingViT/



BLADE: Single-view Body Mesh Learning through Accurate Depth Estimation. Shengze Wang, Jiefeng Li, Tianye Li, Ye Yuan, Henry Fuchs, Koki Nagano, Shalini De Mello, Michael Stengel. tl;dr: camera intrinsics matter for human mesh estimation; can optimize via rendering. arxiv.org/abs/2412.08640



📢 I am #hiring 2x #PhD candidates to work on Human-centric #3D #ComputerVision at the University of #Amsterdam! 📢 The positions are funded by an #ERC #StartingGrant. For details and for submitting your application please see: werkenbij.uva.nl/en/vacancies/p… Deadline: Feb 16

📣📣📣 Excited to share GENMO: A Generalist Model for Human Motion. Words can't perfectly describe human motion, so we built GENMO. It's everything-to-motion. Video, Text, Music, Audio, Keyframes, Spatial Control… GENMO handles it all within a single model. Two