Deva Ramanan (@ramanandeva)'s Twitter Profile
Deva Ramanan

@ramanandeva

Professor at Carnegie Mellon University

ID: 1640797737238294528

Link: https://www.cs.cmu.edu/~deva/
Joined: 28-03-2023 19:27:39

9 Tweets

290 Followers

3 Following

Zhiqiu Lin (@zhiqiulin)

In text-to-image generation, evaluating how well the generated image matches the prompt is a major challenge. We address this with VQAScore: a SOTA metric that significantly surpasses CLIPScore, PickScore, ImageReward, TIFA, and more!

VQAScore works especially well on complex
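For readers curious how the metric works: VQAScore reduces image-text alignment to a single visual question answered by a generative VQA model. Below is a minimal sketch; the `vqa_model` interface is a hypothetical stand-in (not the authors' released code), and the prompt template is an approximation of the paper's.

```python
import math

def vqa_score(image, text, vqa_model) -> float:
    """VQAScore: the probability that a VQA model answers "Yes" when
    asked whether the image shows the given text.

    `vqa_model` is a hypothetical stand-in for any generative VQA model
    exposing the log-likelihood of an answer given (image, question);
    the prompt template approximates the one in the paper.
    """
    question = f'Does this figure show "{text}"? Please answer yes or no.'
    # Score the single token/word "Yes" under the model's decoder.
    log_p_yes = vqa_model.answer_log_likelihood(image, question, answer="Yes")
    return math.exp(log_p_yes)
```

Candidate images (or candidate prompts) can then be ranked by this probability, which is what lets one VQA model double as an alignment metric.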
Tarasha Khurana (@tarashakhurana)

Our new work explores generating future observations given the past, by leveraging large-scale pretraining of image diffusion models for video prediction, and conditioning on timestamps & invariant data modalities. w/ Deva Ramanan page: cs.cmu.edu/~tkhurana/dept…

Anirudh Chakravarthy (@anirudhchak)

Lidar Panoptic Segmentation (LPS) is crucial for the safe deployment of autonomous vehicles, but existing formulations fail to consider realistic open-world environments. Our IJCV paper introduces LPS in the Open World (LiPSOW) to discover novel classes in the open world! (1/5)

Kangle Deng (@kangle_deng)

[1/2] 📢 I'll present "FlashTex: Fast Relightable Mesh Texturing with LightControlNet" at #ECCV2024’s Oral Session tomorrow! 🎉 Join me to explore how our generated textures can be properly relit in various lighting environments. ⚡ 📅 Oral: Tue, Oct 1st, 2 PM 📍 Poster: #159

Zhiqiu Lin (@zhiqiulin)

Sharing exciting news from Milan 🇮🇹: VQAScore (ECCV’24) was highlighted as the strongest text-to-image metric in DeepMind’s Imagen3 tech report! Imagen3 also used our GenAI-Bench (CVPR’24 SynData Best Short Paper) to evaluate compositional text-to-image generation. Catch our
Zhiqiu Lin (@zhiqiulin)

🚀 Make Vision Matter in Visual-Question-Answering (VQA)!

Introducing NaturalBench, a vision-centric VQA benchmark (NeurIPS'24) that challenges vision-language models with pairs of simple questions about natural imagery. 🌍📸

Here’s what we found after testing 53 models
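The paired design is what makes the benchmark hard to shortcut: each group ties two images to two questions whose answers flip between the images, so a blind language prior cannot score well. A minimal scoring sketch under that assumed 2x2 structure; `is_correct` is a hypothetical wrapper around your model plus answer matching, and the metric names follow the paper's question/image/group accuracies.

```python
from typing import Callable

def naturalbench_scores(samples, is_correct: Callable) -> dict:
    """Sketch of NaturalBench-style paired scoring (assumptions above).

    Each sample is a 2x2 group: two images and two questions, where each
    question has a different ground-truth answer on the two images.
    """
    q_acc = i_acc = g_acc = 0
    for img0, img1, q0, q1 in samples:
        # Correctness on all four (image, question) combinations.
        c = [[is_correct(img0, q0), is_correct(img0, q1)],
             [is_correct(img1, q0), is_correct(img1, q1)]]
        q_acc += (c[0][0] and c[1][0]) + (c[0][1] and c[1][1])  # per question
        i_acc += (c[0][0] and c[0][1]) + (c[1][0] and c[1][1])  # per image
        g_acc += all(c[0]) and all(c[1])                        # whole group
    n = len(samples)
    return {"question_acc": q_acc / (2 * n),
            "image_acc": i_acc / (2 * n),
            "group_acc": g_acc / n}
```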
Chancharik Mitra (@chancharikm)

🎯 Introducing Sparse Attention Vectors (SAVs): A breakthrough method for extracting powerful multimodal features from Large Multimodal Models (LMMs). SAVs enable SOTA performance on discriminative vision-language tasks (classification, safety alignment, etc.)! Links in replies!
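From the announcement, the general recipe appears to be: extract per-head attention activations from an LMM on a few labeled examples, keep a sparse set of discriminative heads, and classify by similarity to per-class centroids. A rough numpy sketch of that reading follows; it is our interpretation, not the authors' exact procedure, and the extractor producing `feats` is assumed.

```python
import numpy as np

def fit_savs(feats, labels, k=20):
    """Sketch of a Sparse-Attention-Vectors-style classifier.

    feats: (n_examples, n_heads, dim) per-head activations pulled from an
    LMM by a hypothetical hook; labels: (n_examples,). Keeps the k heads
    whose class centroids are most spread out.
    """
    classes = np.unique(labels)
    # Per-class mean activation for every head: (n_classes, n_heads, dim).
    centroids = np.stack([feats[labels == c].mean(0) for c in classes])
    # Score each head by how far its class centroids sit from their mean.
    spread = np.linalg.norm(centroids - centroids.mean(0), axis=-1).sum(0)
    heads = np.argsort(spread)[-k:]
    return heads, centroids[:, heads], classes

def predict_savs(feat, heads, centroids, classes):
    """Nearest-centroid (cosine) vote across the selected heads."""
    v = feat[heads]                                    # (k, dim)
    v = v / np.linalg.norm(v, axis=-1, keepdims=True)
    c = centroids / np.linalg.norm(centroids, axis=-1, keepdims=True)
    sims = (c * v).sum(-1)                             # (n_classes, k)
    votes = sims.argmax(0)                             # best class per head
    return classes[np.bincount(votes, minlength=len(classes)).argmax()]
```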

Khiem Vuong (@kvuongdev)

[1/6] Recent models like DUSt3R generalize well across viewpoints, but performance drops on aerial-ground pairs. At #CVPR2025, we propose AerialMegaDepth (aerial-megadepth.github.io), a hybrid dataset combining mesh renderings with real ground images (MegaDepth) to bridge this gap.

Zhiqiu Lin (@zhiqiulin)

Fresh GPT‑o3 results on our vision‑centric #NaturalBench (NeurIPS’24) benchmark! 🎯 Its new visual chain‑of‑thought—by “zooming in” on details—cracks questions that still stump GPT‑4o.

Yet vision reasoning isn’t solved: o3 can still hallucinate even after a full minute of
Zhiqiu Lin (@zhiqiulin)

📷 Can AI understand camera motion like a cinematographer? Meet CameraBench: a large-scale, expert-annotated dataset for understanding camera motion geometry (e.g., trajectories) and semantics (e.g., scene contexts) in any video – films, games, drone shots, vlogs, etc. Links

Nikhil Keetha (@nik__v__)

Meet MapAnything – a transformer that directly regresses factored metric 3D scene geometry (from images, calibration, poses, or depth) in an end-to-end way. No pipelines, no extra stages. Just 3D geometry & cameras, straight from any type of input, delivering new state-of-the-art
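"Factored" here means the network predicts separate camera and geometry components that compose into a metric point map. The composition itself is standard pinhole geometry; a minimal numpy sketch of that step (plain math illustrating the factorization, not the model's code):

```python
import numpy as np

def compose_pointmap(depth, K, R, t, scale=1.0):
    """Compose a factored scene representation -- per-view depth, camera
    intrinsics K (3x3), world-from-camera rotation R and translation t,
    and a global metric scale -- into a world-space point map (H, W, 3).
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))       # pixel grid
    rays = np.linalg.inv(K) @ np.stack(                  # unit-depth rays
        [u.ravel(), v.ravel(), np.ones(h * w)])
    pts_cam = rays * (scale * depth).ravel()             # scale rays by depth
    pts_world = (R @ pts_cam).T + t                      # camera -> world
    return pts_world.reshape(h, w, 3)
```

Keeping depth, intrinsics, pose, and scale as separate outputs is what lets any subset of them be supplied as input (calibration, poses, or depth) while the network fills in the rest.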

Zhiqiu Lin (@zhiqiulin)

🎉CameraBench has been accepted as a Spotlight (3%) @ NeurIPS 2025. Huge congrats to all collaborators at CMU, MIT-IBM, UMass, Harvard, and Adobe. CameraBench is a large-scale effort that pushes video-language models to reason about the language of camera motion just like

Tarasha Khurana (@tarashakhurana)

CogNVS was accepted to NeurIPS 2025! 🎉 We are releasing the code today for you all to try:
🆕 Code: github.com/Kaihua-Chen/co…
Paper: arxiv.org/pdf/2507.12646
With CogNVS, we reformulate dynamic novel-view synthesis as a structured inpainting task: (1) we reconstruct input

Kangle Deng (@kangle_deng)

🏆 Excited to share that BrickGPT (avalovelace1.github.io/BrickGPT/) received the ICCV Best Paper Award! Our first author, Ava Pun, will present it from 1:30 to 1:45 p.m. today in Exhibit Hall III. Huge thanks to all the co-authors: Ruixuan Liu, Deva Ramanan, Changliu Liu, Jun-Yan Zhu

Kangle Deng (@kangle_deng)

Ava Pun, Ruixuan Liu, Deva Ramanan, Changliu Liu, Jun-Yan Zhu: If you miss the talk or want to dive deeper, please also check out our poster and our interview!

Poster Session Details:
- Location: Exhibit Hall I #306
- Time: Wed, Oct 22, 2:45 p.m. to 4:45 p.m. HST

Read the interview in ICCV Daily:
rsipvision.com/ICCV2025-Wedne…