Jacob Yeung (@jacobyeung) Twitter Tweets • TwiCopy

Jacob Yeung

@jacobyeung

ID: 1450257288

Joined: 23-05-2013 01:32:22

0 Tweet

2 Followers

17 Following

Gabriel Sarch

@gabrielsarch

7 months ago

How can we get VLMs to move their eyes, and reason step-by-step in visually grounded ways? 👀 We introduce ViGoRL, an RL method that anchors reasoning to image regions. 🎯 It outperforms vanilla GRPO and SFT across grounding, spatial tasks, and visual search (86.4% on V*). 👇🧵

420 Likes

11 Replies

57 Retweets

Elliott / Shangzhe Wu

@elliottszwu

6 months ago

This was a really fun and exciting workshop #CVPR2025! Huge thanks to all the speakers, organizers and reviewers #CVPR2026! We hope to be able to release the video recordings soon!

48 Likes

0 Replies

6 Retweets

Jennifer Hsia

@jen_hsia

5 months ago

1/6 Retrieval is supposed to improve generation in RAG systems. But in practice, adding more documents can hurt performance, even when relevant ones are retrieved. We introduce RAGGED, a framework to measure and diagnose when retrieval helps and when it hurts.

107 Likes

1 Reply

21 Retweets

Tarasha Khurana

@tarashakhurana

5 months ago

Excited to share recent work with Kaihua Chen and Deva Ramanan where we learn to do novel view synthesis for dynamic scenes in a self-supervised manner, only from 2D videos! webpage: cog-nvs.github.io arxiv: arxiv.org/abs/2507.12646 code (soon): github.com/Kaihua-Chen/co…

92 Likes

2 Replies

27 Retweets

Nikhil Keetha

@nik__v__

3 months ago

Meet MapAnything – a transformer that directly regresses factored metric 3D scene geometry (from images, calibration, poses, or depth) in an end-to-end way. No pipelines, no extra stages. Just 3D geometry & cameras, straight from any type of input, delivering new state-of-the-art

682 Likes

28 Replies

122 Retweets