David Wan (@meetdavidwan)'s Twitter Profile
David Wan

@meetdavidwan

PhD student at UNC-Chapel Hill (@uncnlp), advised by @mohitban47. @Google PhD Fellow. @AmazonScience, @MetaAI, and @SFResearch intern.

ID: 1101184093524553729

Website: https://meetdavidwan.github.io/ · Joined: 28-02-2019 18:15:21

178 Tweets

452 Followers

470 Following

Elias Stengel-Eskin (on the faculty job market) (@eliaseskin):

Excited to announce CLaMR, our new retriever for multimodal documents! Strong performance improvements (+25 nDCG@10) over both multimodal and unimodal retrieval baselines. 🤝 CLaMR jointly encodes multiple modalities and selects the most relevant ones for each query. 🏋️‍♂️
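For readers unfamiliar with the metric cited above: nDCG@10 measures how well a ranker places relevant documents within the top 10 positions. A minimal sketch, using the linear-gain DCG variant; the example relevance list is made up for illustration.

import math

def dcg_at_k(rels, k=10):
    # Gain discounted by log2 of rank (rank i is position i+1, hence i+2).
    return sum(r / math.log2(i + 2) for i, r in enumerate(rels[:k]))

def ndcg_at_k(rels, k=10):
    # Normalize by the DCG of the ideal (best possible) ordering.
    ideal = dcg_at_k(sorted(rels, reverse=True), k)
    return dcg_at_k(rels, k) / ideal if ideal > 0 else 0.0

print(ndcg_at_k([1, 0, 1, 1, 0, 0, 0, 1, 0, 0]))  # toy ranking of graded relevance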

Han Wang (@hanwang98):

How can a multimodal retriever accurately retrieve docs from massive online video content that spans multiple modalities? We introduce CLaMR, a contextualized late-interaction retriever that jointly encodes all modalities and dynamically selects those containing the relevant information.
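As background, "late interaction" here refers to ColBERT-style MaxSim scoring, where each query token is matched against its best document token. A minimal sketch of how such a score can extend over several modality token streams; the modality names, dimensions, and random embeddings below are illustrative assumptions, not CLaMR's actual implementation.

import torch

def maxsim_score(query_tokens: torch.Tensor, doc_tokens: torch.Tensor) -> torch.Tensor:
    """query_tokens: (Q, d), doc_tokens: (D, d); both L2-normalized."""
    sim = query_tokens @ doc_tokens.T       # (Q, D) token-level similarities
    return sim.max(dim=1).values.sum()      # best doc token per query token, summed

def score_multimodal_doc(query_tokens, modality_tokens: dict) -> torch.Tensor:
    # Pool token embeddings from all modalities so each query token can pick
    # its best match from whichever modality happens to be most relevant.
    all_tokens = torch.cat(list(modality_tokens.values()), dim=0)
    return maxsim_score(query_tokens, all_tokens)

# Toy usage with random, normalized embeddings (d = 128).
d = 128
q = torch.nn.functional.normalize(torch.randn(8, d), dim=-1)
doc = {m: torch.nn.functional.normalize(torch.randn(32, d), dim=-1)
       for m in ["frames", "speech", "ocr", "metadata"]}
print(score_multimodal_doc(q, doc))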

Jaemin Cho (on faculty job market) (@jmin__cho):

Introducing CLaMR -- a late-interaction retriever for complex multimodal video content! 📽️📚
➡️ Jointly encodes frames, speech, on-screen text, and metadata to answer diverse queries grounded across modalities
➡️ Trained with a new dataset we introduce, MultiVENT 2.0++ …

Arie Cattan (@ariecattan):

🚨 RAG is a popular approach but what happens when the retrieved sources provide conflicting information?🤔 We're excited to introduce our paper: “DRAGged into CONFLICTS: Detecting and Addressing Conflicting Sources in Search-Augmented LLMs”🚀 A thread 🧵👇

Ziyang Wang (@ziyangw00):

Excited to present VideoTree 🌲 at #CVPR2025, Friday at 10:30 AM! VideoTree improves long-video QA via smart sampling:
- Query-adaptive: finds the parts of the video relevant to the query
- Coarse-to-fine structure: structured hierarchically to sample granularly from relevant segments
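To make the sampling idea concrete, here is a conceptual sketch of query-adaptive coarse-to-fine frame selection; the relevance function, segment counts, and sampling rates are illustrative assumptions, not the paper's exact method.

import numpy as np

def relevance(frame_feat: np.ndarray, query_feat: np.ndarray) -> float:
    # Dot product; equals cosine similarity if features are L2-normalized.
    return float(frame_feat @ query_feat)

def coarse_to_fine_sample(frames: np.ndarray, query: np.ndarray,
                          n_coarse: int = 8, per_segment: int = 4,
                          top_segments: int = 2) -> list:
    # Coarse pass: split the video into equal segments and score one
    # representative (middle) frame per segment against the query.
    segments = np.array_split(np.arange(len(frames)), n_coarse)
    scores = [relevance(frames[seg[len(seg) // 2]], query) for seg in segments]
    # Fine pass: sample more densely only inside the most relevant segments.
    best = np.argsort(scores)[-top_segments:]
    picked = []
    for s in best:
        seg = segments[s]
        idx = np.linspace(0, len(seg) - 1, per_segment).astype(int)
        picked.extend(seg[idx].tolist())
    return sorted(picked)

# Toy usage: 256 frames with 64-d features and a random query.
rng = np.random.default_rng(0)
frames = rng.normal(size=(256, 64)); query = rng.normal(size=64)
print(coarse_to_fine_sample(frames, query))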

David Wan (@meetdavidwan):

Thanks for discovering + sharing our work on contextualized late-interaction-based multimodal content retrieval, Omar! (and ColBERT is awesome, of course) 😀

Elias Stengel-Eskin (on the faculty job market) (@eliaseskin):

🚨 Excited to announce GenerationPrograms (GP), which generates inherently attributed text by asking LLMs to produce a program that executes to text. Following the program trace gives us a causal understanding of how the text was generated, with major benefits:
➡️ Attribution …

Eran Hirsch (@hirscheran):

In RAG applications, self-citation methods are prone to attribution mistakes because LLMs have no inductive bias to track which source supports each statement. We propose GenerationPrograms: first generate a clear plan, then use that plan to guide generation. …
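To illustrate the "program that executes to text" idea, here is a toy sketch in which each operation records which source it drew from, so the output carries a provenance trace. The operations and trace format are invented for illustration, not GenerationPrograms' actual program language.

from dataclasses import dataclass, field

@dataclass
class AttributedText:
    text: str
    sources: list = field(default_factory=list)  # provenance trace

def extract(docs: dict, sid: str) -> AttributedText:
    # Pull a statement from a named source, recording the citation.
    return AttributedText(docs[sid], [sid])

def fuse(a: AttributedText, b: AttributedText) -> AttributedText:
    # Combine two attributed spans; provenance is the union of their sources.
    return AttributedText(f"{a.text} Moreover, {b.text}", a.sources + b.sources)

# A tiny "program" over two toy sources; executing it yields text plus a trace.
docs = {"S1": "CLaMR encodes all modalities jointly.",
        "S2": "it selects the most relevant modality per query."}
out = fuse(extract(docs, "S1"), extract(docs, "S2"))
print(out.text)     # generated text
print(out.sources)  # ['S1', 'S2'] -- which sources support it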

David Wan (@meetdavidwan):

🎉 Our paper, GenerationPrograms, which proposes a modular framework for attributable text generation, has been accepted to the Conference on Language Modeling (COLM)! GenerationPrograms produces a program that executes to text, providing an auditable trace of how the text was generated and major gains on …

Han Lin (@hanlin_hl):

🤔 Can we bridge MLLMs and diffusion models more natively and efficiently by having MLLMs produce patch-level CLIP latents already aligned with their visual encoders, while fully preserving the MLLM's visual reasoning capabilities?
Introducing Bifrost-1: 🌈
> High-Fidelity …
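As a rough, shape-level illustration of the bridging idea (not Bifrost-1's actual architecture), one could imagine a head that projects MLLM hidden states into the CLIP patch-embedding space for a diffusion decoder to condition on; all module names and sizes below are assumptions.

import torch
import torch.nn as nn

class PatchLatentHead(nn.Module):
    def __init__(self, mllm_dim: int = 4096, clip_dim: int = 1024):
        super().__init__()
        # Projects MLLM hidden states into CLIP's patch-embedding space so the
        # predicted latents stay aligned with the visual encoder.
        self.proj = nn.Linear(mllm_dim, clip_dim)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # hidden_states: (batch, n_patches, mllm_dim) from the MLLM backbone.
        return self.proj(hidden_states)  # (batch, n_patches, clip_dim)

# Toy forward pass: 256 patch positions from a hypothetical MLLM.
head = PatchLatentHead()
latents = head(torch.randn(1, 256, 4096))
print(latents.shape)  # torch.Size([1, 256, 1024]) -> conditioning for a diffusion decoder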

Jaemin Cho (on faculty job market) (@jmin__cho):

📢 Introducing RotBench, which tests whether SoTA MLLMs (e.g., GPT-5, GPT-4o, o3, Gemini-2.5-pro) can identify the rotation of input images (0°, 90°, 180°, and 270°). Even frontier MLLMs struggle at this spatial reasoning task, which humans solve with >98% accuracy.
➡️ Models struggle …
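The evaluation setup is easy to picture: rotate an image by each candidate angle and ask the model which rotation it sees. A minimal sketch; query_mllm is a hypothetical stand-in for whatever model API is used, assumed here to return an integer angle.

from PIL import Image

ANGLES = [0, 90, 180, 270]

def build_examples(path: str) -> dict:
    img = Image.open(path)
    # PIL's rotate is counter-clockwise; expand=True keeps the full frame.
    return {a: img.rotate(a, expand=True) for a in ANGLES}

def evaluate(path: str, query_mllm) -> float:
    correct = 0
    for angle, rotated in build_examples(path).items():
        prompt = "By how many degrees (0, 90, 180, 270) is this image rotated?"
        pred = query_mllm(rotated, prompt)  # hypothetical MLLM call
        correct += int(pred == angle)
    return correct / len(ANGLES)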

Ziyang Wang (@ziyangw00):

🎉 Our Video-RTS paper has been accepted at #EMNLP2025 Main!! We propose a novel video reasoning approach that combines data-efficient reinforcement learning (GRPO) with video-adaptive test-time scaling, improving reasoning performance while maintaining efficiency on multiple benchmarks.
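For context on the GRPO component: GRPO (Group Relative Policy Optimization) replaces a learned value baseline with group-relative advantages, sampling several responses per prompt and normalizing each reward against its group's mean and standard deviation. A minimal sketch of that advantage computation only, not Video-RTS's full pipeline.

import numpy as np

def grpo_advantages(group_rewards: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    # group_rewards: rewards for G sampled responses to the same prompt.
    mean, std = group_rewards.mean(), group_rewards.std()
    return (group_rewards - mean) / (std + eps)

print(grpo_advantages(np.array([1.0, 0.0, 0.0, 1.0])))  # -> [ 1. -1. -1.  1.]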

Justin Chih-Yao Chen (@cyjustinchen):

Excited to share that MAgICoRe has been accepted to #EMNLP2025 main! 🎉 Our work identifies 3 key challenges in LLM refinement for reasoning:
1) Over-correction on easy problems
2) Failure to localize and fix its own errors
3) Too few refinement iterations for harder problems