Daniel Geng (@dangengdg)'s Twitter Profile
Daniel Geng

@dangengdg

PhD student at @UmichCSE. Interested in computer vision and generative models. Previously @GoogleDeepMind, @MetaAI, @berkeley_ai

ID: 770504727305949184

Link: http://dangeng.github.io | Joined: 30-08-2016 06:13:37

127 Tweets

1.1K Followers

818 Following

Alejandro Pardo (@pardoalejo):

Ever noticed how one scene seamlessly transitions into another in films? That's a match-cut: a subtle yet powerful cinematic trick. Our MatchDiffusion generates two videos from text prompts, designed to form a seamless match-cut, effortlessly and training-free. 🎥✨ 1/n
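As a rough intuition for how a training-free match cut might be produced (an illustration only, not necessarily MatchDiffusion's exact procedure): let two prompts share the initial noise and the early denoising steps, so both clips commit to the same coarse layout and motion, then finish denoising independently. The `model.denoise_step` API below is a hypothetical placeholder.

```python
import torch

def match_cut_pair(model, prompt_a, prompt_b, steps=50, shared_steps=20):
    """Generate two clips whose coarse structure matches (a match-cut pair).

    `model.denoise_step(x, prompt, t)` is a hypothetical one-step denoiser for
    a text-to-video diffusion model; real pipelines differ in API and schedule.
    """
    x = torch.randn(1, 16, 4, 64, 64)  # shared initial noise (frames x latent)
    # Phase 1: joint denoising under a combined condition, so both videos
    # inherit the same coarse layout and motion.
    for t in range(steps, steps - shared_steps, -1):
        x = model.denoise_step(x, f"{prompt_a} | {prompt_b}", t)
    # Phase 2: denoise each copy separately, letting content diverge per prompt.
    xa, xb = x.clone(), x.clone()
    for t in range(steps - shared_steps, 0, -1):
        xa = model.denoise_step(xa, prompt_a, t)
        xb = model.denoise_step(xb, prompt_b, t)
    return xa, xb  # decode with the model's VAE to obtain the two videos
```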

Trenton Chang (@chang_trenton):

(1/) I'll be going to #NeurIPS2024 next week, where I'll be presenting exciting new work on detecting gaming using causal inference! Our work is motivated by some problems in Medicare (U.S. public health insurance): turns out it's incredibly easy to game that system.

Xun Huang (@xunhuang1995):

🚀 Introducing CausVid: Instant video generation that plays the moment you hit "Generate", while maintaining state-of-the-art quality! Project Page: causvid.github.io. More details in the long thread.
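The "plays the moment you hit Generate" property suggests causal, frame-by-frame generation: each frame depends only on the past, so frames can stream to the player as they are produced. A toy sketch of that streaming pattern (the `generate_next_frame` model is a hypothetical placeholder, not CausVid's API):

```python
from typing import Iterator
import numpy as np

def stream_video(generate_next_frame, n_frames=120) -> Iterator[np.ndarray]:
    """Yield frames one at a time so playback can begin immediately.

    `generate_next_frame(history)` stands in for a causal video model that
    conditions only on already-generated frames (hypothetical placeholder).
    """
    history = []
    for _ in range(n_frames):
        frame = generate_next_frame(history)  # depends only on past frames
        history.append(frame)
        yield frame  # display this frame while the rest are still generating

# Usage: for frame in stream_video(model_fn): display(frame)
```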

Nicolas DUFOUR (@nico_dufour):

🌍 Guessing where an image was taken is a hard and often ambiguous problem. Introducing diffusion-based geolocation: we predict global locations by refining random guesses into trajectories across the Earth's surface! 🗺️ Paper, code, and demo: nicolas-dufour.github.io/plonk
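As a rough illustration of "refining random guesses into trajectories": start from a random coordinate and repeatedly apply a learned refinement step conditioned on the image. The `refine_step` network below is a hypothetical stand-in; the actual method is a diffusion model with a proper noise schedule, so treat this purely as intuition.

```python
import numpy as np

def predict_location(image_feat, refine_step, n_steps=64, seed=0):
    """Refine a random guess into a (lat, lon) prediction.

    `refine_step(coord, image_feat, t)` stands in for a trained network that
    returns a correction vector toward the true location (hypothetical).
    """
    rng = np.random.default_rng(seed)
    coord = np.array([rng.uniform(-90, 90), rng.uniform(-180, 180)])  # random start
    trajectory = [coord.copy()]
    for t in range(n_steps):
        coord = coord + refine_step(coord, image_feat, t)  # step toward the answer
        coord[0] = np.clip(coord[0], -90.0, 90.0)          # keep latitude valid
        coord[1] = (coord[1] + 180.0) % 360.0 - 180.0      # wrap longitude
        trajectory.append(coord.copy())
    return coord, trajectory  # final prediction plus its path across the globe
```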

Daniel Geng (@dangengdg):

I'll be presenting "Images that Sound" today at #NeurIPS2024! East Exhibit Hall A-C #2710. Come say hi to me and Andrew Owens :) (Ziyang Chen sadly could not make it, but will be there in spirit :') )

Daniel Geng (@dangengdg):

I had a lot of fun helping put this problem set together -- if you're teaching diffusion models + computer vision, consider using this homework for your course! (links at end of Ryan Tabrizi's thread!)

Daniel Geng (@dangengdg):

Hey all, I'll be answering questions about our "Motion Prompting" paper on alphaXiv (it's like arXiv, but adds a discussion section, and I think it's quite well built!): alphaxiv.org/abs/2412.02700…

David McAllister (@davidrmcall):

Decentralized Diffusion Models power stronger models trained on more accessible infrastructure. DDMs mitigate the networking bottleneck that locks training into expensive and power-hungry centralized clusters. They scale gracefully to billions of parameters and generate…

Zhaoying Pan (@zhaoyingpan):

Our workshop at #ICLR2025 is now open to submissions until 02/03 🥳 check our website if you are interested: sites.google.com/view/icbinb-20…

Oliver Wang (@oliver_wang2):

A sister team to ours at Google DeepMind is looking for student researchers this summer. Please reach out if you are a PhD student working on media generation (diffusion models), or if you are a professor with students to recommend! 😀

Chris Rockwell (@_crockwell):

Ever wish YouTube had 3D labels? 🚀Introducing🎥DynPose-100K🎥, an Internet-scale collection of diverse videos annotated with camera pose! Applications include camera-controlled video generation🤩and learned dynamic pose estimation😯 Download: huggingface.co/datasets/nvidi…

Jeongsoo Park (@jespark0):

Can AI image detectors keep up with new fakes? Mostly, no. Existing detectors are trained using a handful of models. But there are thousands in the wild! Our work, Community Forensics, uses 4800+ generators to train detectors that generalize to new fakes. #CVPR2025 🧵 (1/5)
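The core recipe behind such detectors is ordinary supervised training of a real-vs-fake classifier; what changes is the breadth of generators supplying the fake images. A minimal PyTorch sketch with placeholder paths, backbone, and hyperparameters (not the paper's actual code):

```python
import torch
import torch.nn as nn
from torchvision import datasets, transforms, models

tfm = transforms.Compose([transforms.Resize(256), transforms.CenterCrop(224),
                          transforms.ToTensor()])
# Assumed layout: data/real/... and data/fake/<generator_name>/... , where the
# fake/ tree pools images from many different generators. ImageFolder walks
# each class folder recursively and assigns labels alphabetically: fake=0, real=1.
data = datasets.ImageFolder("data", transform=tfm)
loader = torch.utils.data.DataLoader(data, batch_size=64, shuffle=True)

model = models.resnet50(weights="IMAGENET1K_V2")
model.fc = nn.Linear(model.fc.in_features, 1)  # single real-vs-fake logit
opt = torch.optim.AdamW(model.parameters(), lr=1e-4)
loss_fn = nn.BCEWithLogitsLoss()

for images, labels in loader:  # one pass shown; train for multiple epochs
    logits = model(images).squeeze(1)
    loss = loss_fn(logits, labels.float())
    opt.zero_grad()
    loss.backward()
    opt.step()
```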

Yiming Dou (@_yimingdou):

Ever wondered how a scene sounds👂 when you interact👋 with it? Introducing our #CVPR2025 work "Hearing Hands: Generating Sounds from Physical Interactions in 3D Scenes" -- we make 3D scene reconstructions audibly interactive! yimingdou.com/hearing_hands/

Ayush Shrivastava (@ayshrv):

Excited to share our CVPR 2025 paper on cross-modal space-time correspondence! We present a method to match pixels across different modalities (RGB-Depth, RGB-Thermal, Photo-Sketch, and cross-style images), trained entirely using unpaired data and self-supervision.
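Once each modality has an encoder producing aligned per-pixel features, correspondence itself reduces to nearest-neighbor search in feature space. The sketch below shows only that matching step; the encoders and their self-supervised training on unpaired data are assumed and not shown.

```python
import torch
import torch.nn.functional as F

def match_pixels(feat_a, feat_b):
    """Match each pixel in image A to its most similar pixel in image B.

    feat_a: (C, H, W) features from modality A's encoder; feat_b likewise,
    assumed here to share the same spatial size for simplicity.
    """
    C, H, W = feat_a.shape
    fa = F.normalize(feat_a.reshape(C, -1), dim=0)  # (C, H*W), unit-norm columns
    fb = F.normalize(feat_b.reshape(C, -1), dim=0)
    sim = fa.t() @ fb                # (H*W, H*W) cosine similarities
    idx = sim.argmax(dim=1)          # best match in B for each pixel of A
    ys, xs = idx // W, idx % W       # unravel flat indices to (row, col)
    return torch.stack([ys, xs], dim=1).reshape(H, W, 2)  # (H, W, 2) coords
```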