Andrew Liao (@andrewliao11) 's Twitter Profile
Andrew Liao

@andrewliao11

Seeking research roles in AI/CV for 2025
Final-year Ph.D. in CS @UofT @VectorInst 🇨🇦. I make dataset creation less painful.
Prev. @nvidia @amazon intern

ID: 4714259977

Link: https://andrewliao11.github.io · Joined: 05-01-2016 15:36:30

461 Tweets

252 Followers

647 Following

Xindi Wu (@cindy_x_wu) 's Twitter Profile Photo

Introducing COMPACT: COMPositional Atomic-to-complex Visual Capability Tuning, a data-efficient approach to improve multimodal models on complex visual tasks without scaling data volume. 📦

arxiv.org/abs/2504.21850

1/10
Andrew Liao (@andrewliao11) 's Twitter Profile Photo

When developing this project, it kept reminding me of DAgger (a classic imitation learning algorithm). We first let the model freely generate the imperfect data. Once we detect something wrong, we hand it over to an expert model to fix the errors.
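
For intuition, a minimal sketch of that loop, with all function names (generate, find_errors, expert_fix) as hypothetical stand-ins rather than the project's actual API:

```python
from typing import Callable, List, Tuple

def collect_corrected_data(
    generate: Callable[[str], str],           # learner's free (imperfect) generation
    find_errors: Callable[[str], list],       # detector that flags suspicious spans
    expert_fix: Callable[[str, list], str],   # expert model that repairs flagged spans
    prompts: List[str],
) -> List[Tuple[str, str]]:
    """One round of the DAgger-style loop: generate freely, detect, hand to an expert."""
    dataset = []
    for prompt in prompts:
        sample = generate(prompt)             # let the model generate imperfect data
        errors = find_errors(sample)
        if errors:
            # as in DAgger, the expert provides corrections on the states
            # (here, samples) the learner actually visits
            sample = expert_fix(sample, errors)
        dataset.append((prompt, sample))
    return dataset
```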

Yung-Sung Chuang (@yungsungchuang) 's Twitter Profile Photo

🚨Do passage rerankers really need explicit reasoning?🤔—Maybe Not!

Our findings:
⚖️Standard rerankers outperform those w/ step-by-step reasoning!
🚫Disabling reasoning in a reasoning reranker actually improves reranking accuracy!🤯
👇But, why?

📰arxiv.org/abs/2505.16886

(1/6)
Donglai Xiang (@donglaixiang) 's Twitter Profile Photo

🚨Excited to announce the 1st Workshop on Vision Meets Physics at @CVPR2025!

Join us on June 12 for a full-day event exploring the synergy between physical simulation & computer vision to bridge the gap between the virtual and physical worlds.

URL: tinyurl.com/vis-phys
Anne Ouyang (@anneouyang) 's Twitter Profile Photo

✨ New blog post 👀: We have some very fast AI-generated kernels produced with a simple test-time-only search. They perform close to, and in some cases even beat, the standard expert-optimized production kernels shipped in PyTorch. (1/6)

[🔗 link in final post]
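
For intuition, a minimal sketch of a test-time-only search, assuming a hypothetical propose_kernel generator (e.g., compiled LLM-sampled candidates), fixed benchmark inputs, and a CUDA device; this illustrates the general recipe, not the authors' implementation:

```python
import time
import torch

def benchmark(fn, *args, iters=100):
    """Median wall-clock time of a candidate on fixed inputs (assumes CUDA)."""
    fn(*args)  # warm-up
    torch.cuda.synchronize()
    times = []
    for _ in range(iters):
        start = time.perf_counter()
        fn(*args)
        torch.cuda.synchronize()
        times.append(time.perf_counter() - start)
    return sorted(times)[len(times) // 2]

def test_time_search(propose_kernel, reference_fn, inputs, budget=32):
    """Sample `budget` candidates; keep the fastest one that matches the reference."""
    expected = reference_fn(*inputs)
    best_fn, best_t = reference_fn, benchmark(reference_fn, *inputs)
    for _ in range(budget):
        candidate = propose_kernel()          # hypothetical: returns a callable kernel
        try:
            out = candidate(*inputs)
            if not torch.allclose(out, expected, atol=1e-3):
                continue                      # reject incorrect kernels
            t = benchmark(candidate, *inputs)
            if t < best_t:
                best_fn, best_t = candidate, t
        except Exception:
            continue                          # reject kernels that fail to compile/run
    return best_fn, best_t
```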
Shenzhi Wang🌟 (@shenzhiwang_thu) 's Twitter Profile Photo

🧐Two papers, opposite opinions.

Ours: High-entropy tokens drive all performance gains in LLM RL.

Another: Don’t let low-prob (often high-entropy) tokens over-dominate.

Both are valid. Why?
💡Model size matters. Larger LLMs support our view; smaller LLMs support theirs.

🧵⬇️
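
As a rough illustration of the high-entropy-token idea, a minimal sketch that scores each position by its policy entropy and keeps only a top fraction for gradient updates; the 20% fraction is an assumption for illustration, not necessarily either paper's setting:

```python
import torch
import torch.nn.functional as F

def high_entropy_mask(logits: torch.Tensor, top_frac: float = 0.2) -> torch.Tensor:
    """Boolean mask over the top `top_frac` highest-entropy token positions.

    logits: (seq_len, vocab_size) per-position logits from the policy.
    """
    log_probs = F.log_softmax(logits, dim=-1)
    entropy = -(log_probs.exp() * log_probs).sum(dim=-1)  # H = -sum(p * log p)
    k = max(1, int(top_frac * entropy.numel()))
    threshold = entropy.topk(k).values.min()
    return entropy >= threshold

# Usage sketch: multiply the per-token policy-gradient loss by this mask so
# that only high-entropy ("forking") tokens receive RL updates.
```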
Ludwig Schmidt (@lschmidt3) 's Twitter Profile Photo

Very excited to finally release our paper for OpenThoughts!

After DataComp and DCLM, this is the third large open dataset my group has been building in collaboration with the DataComp community. This time, the focus is on post-training, specifically reasoning data.
Andrew Liao (@andrewliao11) 's Twitter Profile Photo

Excited to share our CVPR paper next week in Nashville 🎶! Looking forward to connecting with old/new friends. Also, I'm on the job market NOW. Let's discuss system-2 thinking in VLMs! 🤔🤓💡 #CVPR2025 #Nashville #VLMs #reasoning

Jun Gao (@jungao33210520) 's Twitter Profile Photo

This year, we have 3 papers in CVPR, discussing the connection between 3D and video models:

GEN3C [Highlight] 3D grounding for video models

DiffusionRenderer [Oral] Taming video models for rendering and inverse rendering

Difix3D+ [Oral] Enhancing NeRF/3DGS w/ diffusion models
Federico Baldassarre (@baldassarrefe) 's Twitter Profile Photo

DINOv2 meets text at #CVPR 2025! Why choose between high-quality DINO features and CLIP-style vision-language alignment? Pick both with dino.txt 🦖📖

We align frozen DINOv2 features with text captions, obtaining both image-level and patch-level alignment at a minimal cost. [1/N]
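
A minimal sketch of the general recipe (frozen vision features, small trainable projection heads, symmetric contrastive loss), assuming generic vision_backbone / text_encoder modules and pre-tokenized captions; the paper's exact training setup may differ:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FrozenAlignment(nn.Module):
    """CLIP-style alignment on top of a frozen vision backbone (sketch)."""

    def __init__(self, vision_backbone, text_encoder, dim_v, dim_t, dim=512):
        super().__init__()
        self.vision = vision_backbone.eval()
        for p in self.vision.parameters():
            p.requires_grad_(False)              # keep DINO-style features frozen
        self.text = text_encoder
        self.proj_v = nn.Linear(dim_v, dim)      # only the heads (and text side) train
        self.proj_t = nn.Linear(dim_t, dim)
        self.logit_scale = nn.Parameter(torch.tensor(2.659))  # log(1/0.07), CLIP-style init

    def forward(self, images, captions):
        with torch.no_grad():
            v = self.vision(images)              # (batch, dim_v) frozen features
        z_v = F.normalize(self.proj_v(v), dim=-1)
        z_t = F.normalize(self.proj_t(self.text(captions)), dim=-1)
        logits = self.logit_scale.exp() * z_v @ z_t.t()
        labels = torch.arange(len(logits), device=logits.device)
        # symmetric InfoNCE: match each image to its caption and vice versa
        return (F.cross_entropy(logits, labels) +
                F.cross_entropy(logits.t(), labels)) / 2
```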
Mingyuan Wu (@mingyuanwu4) 's Twitter Profile Photo

Research with amazing collaborators Jize Jiang, Meitang Li, and Jingcheng Yang, guided by great advisors and supported by the generous help of talented researchers Bowen Jin, Xingyu Fu ✈️ ICML25, and many open-source contributors (easyr1, verl, vllm, etc.).

Yung-Sung Chuang (@yungsungchuang) 's Twitter Profile Photo

Scaling CLIP on English-only data is outdated now…

🌍We built a CLIP data-curation pipeline for 300+ languages
🇬🇧We train MetaCLIP 2 without compromising English-task performance (it actually improves!)
🥳It's time to drop the language filter!

📝arxiv.org/abs/2507.22062

[1/5]

🧵
Huck Yang 🇸🇬 ICLR 2025 (@huckiyang) 's Twitter Profile Photo

NeKo (ネコ, Japanese for "cat") aims to be your pet model that works with ASR/AST/OCR. NVIDIA AI Developer Yenting Lin et al. Back in 2020, the conformer-transducer was dominant; people were not very interested in working on ASR-LM (i.e., the internal LM of ASR / contextual biasing were popular). Appreciate Andreas

Andrew Liao (@andrewliao11) 's Twitter Profile Photo

I was studying various GRPO variants here and there (DAPO, Dr. GRPO, GSPO, etc.), and this paper provides a holistic view of the PPO-GRPO family.
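
As a reference point for that family, a minimal sketch of GRPO's group-relative advantage, the piece most variants modify (e.g., Dr. GRPO drops the std normalization); shapes and the eps term are illustrative assumptions:

```python
import torch

def grpo_advantages(rewards: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Group-relative advantages as in GRPO.

    rewards: (num_prompts, group_size) scalar rewards for G completions
    sampled per prompt. Each reward is normalized within its own group:
        A_i = (r_i - mean(group)) / std(group)
    """
    mean = rewards.mean(dim=1, keepdim=True)
    std = rewards.std(dim=1, keepdim=True)
    return (rewards - mean) / (std + eps)
```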