Andrew Liao (@andrewliao11) 's Twitter Profile
Andrew Liao

@andrewliao11

Seeking research roles in AI/CV for 2025
Final-year Ph.D. in CS @UofT @VectorInst 🇨🇦. I make dataset creation less painful.
Prev. @nvidia @amazon intern

ID: 4714259977

Link: https://andrewliao11.github.io · Joined: 05-01-2016 15:36:30

461 Tweets

252 Followers

647 Following

Xindi Wu (@cindy_x_wu) 's Twitter Profile Photo

Introducing COMPACT: COMPositional Atomic-to-complex Visual Capability Tuning, a data-efficient approach to improve multimodal models on complex visual tasks without scaling data volume. 📦

arxiv.org/abs/2504.21850

1/10
Andrew Liao (@andrewliao11) 's Twitter Profile Photo

When developing this project, it kept reminding me of DAgger (a classic imitation learning algorithm). We first let the model freely generate the imperfect data. Once we detect something wrong, we hand it over to an expert model to fix the errors.
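
For intuition, a minimal sketch of that loop, with all function names (generate, find_errors, expert_fix) as hypothetical stand-ins rather than the project's actual API:

```python
from typing import Callable, List, Tuple

def collect_corrected_data(
    generate: Callable[[str], str],           # learner's free (imperfect) generation
    find_errors: Callable[[str], list],       # detector that flags suspicious spans
    expert_fix: Callable[[str, list], str],   # expert model that repairs flagged spans
    prompts: List[str],
) -> List[Tuple[str, str]]:
    """One round of the DAgger-style loop: generate freely, detect, hand to an expert."""
    dataset = []
    for prompt in prompts:
        sample = generate(prompt)             # let the model generate imperfect data
        errors = find_errors(sample)
        if errors:
            # as in DAgger, the expert provides corrections on the states
            # (here, samples) the learner actually visits
            sample = expert_fix(sample, errors)
        dataset.append((prompt, sample))
    return dataset
```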

Yung-Sung Chuang (@yungsungchuang) 's Twitter Profile Photo

🚨Do passage rerankers really need explicit reasoning?🤔—Maybe Not!

Our findings:
⚖️Standard rerankers outperform those w/ step-by-step reasoning!
🚫Disabling reasoning in a reasoning reranker actually improves reranking accuracy!🤯
👇But, why?

📰arxiv.org/abs/2505.16886

(1/6)
Donglai Xiang (@donglaixiang) 's Twitter Profile Photo

🚨Excited to announce the 1st Workshop on Vision Meets Physics at @CVPR2025!

Join us on June 12 for a full-day event exploring the synergy between physical simulation & computer vision to bridge the gap between the virtual and physical worlds.

URL: tinyurl.com/vis-phys
Anne Ouyang (@anneouyang) 's Twitter Profile Photo

✨ New blog post 👀: We have some very fast AI-generated kernels produced with a simple test-time-only search. They perform close to, and in some cases even beat, the standard expert-optimized production kernels shipped in PyTorch. (1/6)

[🔗 link in final post]
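
For intuition, a minimal sketch of a test-time-only search, assuming a hypothetical propose_kernel generator (e.g., compiled LLM-sampled candidates), fixed benchmark inputs, and a CUDA device; this illustrates the general recipe, not the authors' implementation:

```python
import time
import torch

def benchmark(fn, *args, iters=100):
    """Median wall-clock time of a candidate on fixed inputs (assumes CUDA)."""
    fn(*args)  # warm-up
    torch.cuda.synchronize()
    times = []
    for _ in range(iters):
        start = time.perf_counter()
        fn(*args)
        torch.cuda.synchronize()
        times.append(time.perf_counter() - start)
    return sorted(times)[len(times) // 2]

def test_time_search(propose_kernel, reference_fn, inputs, budget=32):
    """Sample `budget` candidates; keep the fastest one that matches the reference."""
    expected = reference_fn(*inputs)
    best_fn, best_t = reference_fn, benchmark(reference_fn, *inputs)
    for _ in range(budget):
        candidate = propose_kernel()          # hypothetical: returns a callable kernel
        try:
            out = candidate(*inputs)
            if not torch.allclose(out, expected, atol=1e-3):
                continue                      # reject incorrect kernels
            t = benchmark(candidate, *inputs)
            if t < best_t:
                best_fn, best_t = candidate, t
        except Exception:
            continue                          # reject kernels that fail to compile/run
    return best_fn, best_t
```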
Shenzhi Wang🌟 (@shenzhiwang_thu) 's Twitter Profile Photo

🧐Two papers, opposite opinions.

Ours: High-entropy tokens drive all performance gains in LLM RL.

Another: Don’t let low-prob (often high-entropy) tokens over-dominate.

Both are valid. Why?
💡Model size matters. Larger LLMs support our view; smaller LLMs support theirs.

🧵⬇️
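
As a rough illustration of the high-entropy-token idea, a minimal sketch that scores each position by its policy entropy and keeps only a top fraction for gradient updates; the 20% fraction is an assumption for illustration, not necessarily either paper's setting:

```python
import torch
import torch.nn.functional as F

def high_entropy_mask(logits: torch.Tensor, top_frac: float = 0.2) -> torch.Tensor:
    """Boolean mask over the top `top_frac` highest-entropy token positions.

    logits: (seq_len, vocab_size) per-position logits from the policy.
    """
    log_probs = F.log_softmax(logits, dim=-1)
    entropy = -(log_probs.exp() * log_probs).sum(dim=-1)  # H = -sum(p * log p)
    k = max(1, int(top_frac * entropy.numel()))
    threshold = entropy.topk(k).values.min()
    return entropy >= threshold

# Usage sketch: multiply the per-token policy-gradient loss by this mask so
# that only high-entropy ("forking") tokens receive RL updates.
```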
Ludwig Schmidt (@lschmidt3) 's Twitter Profile Photo

Very excited to finally release our paper for OpenThoughts!

After DataComp and DCLM, this is the third large open dataset my group has been building in collaboration with the DataComp community. This time, the focus is on post-training, specifically reasoning data.
Andrew Liao (@andrewliao11) 's Twitter Profile Photo

Excited to share our CVPR paper next week in Nashville 🎶! Looking forward to connecting with old/new friends. Also, I'm on the job market NOW. Let's discuss system-2 thinking in VLMs! 🤔🤓💡 #CVPR2025 #Nashville #VLMs #reasoning

Jun Gao (@jungao33210520) 's Twitter Profile Photo

This year, we have 3 papers in CVPR, discussing the connection between 3D and video models:

GEN3C [Highlight] 3D grounding for video models

DiffusionRenderer [Oral] Taming video models for rendering and inverse rendering

Difix3D+ [Oral] Enhancing NeRF/3DGS w/ diffusion models
Federico Baldassarre (@baldassarrefe) 's Twitter Profile Photo

DINOv2 meets text at #CVPR 2025! Why choose between high-quality DINO features and CLIP-style vision-language alignment? Pick both with dino.txt 🦖📖

We align frozen DINOv2 features with text captions, obtaining both image-level and patch-level alignment at a minimal cost. [1/N]
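
A minimal sketch of the general recipe (frozen vision features, small trainable projection heads, symmetric contrastive loss), assuming generic vision_backbone / text_encoder modules and pre-tokenized captions; the paper's exact training setup may differ:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FrozenAlignment(nn.Module):
    """CLIP-style alignment on top of a frozen vision backbone (sketch)."""

    def __init__(self, vision_backbone, text_encoder, dim_v, dim_t, dim=512):
        super().__init__()
        self.vision = vision_backbone.eval()
        for p in self.vision.parameters():
            p.requires_grad_(False)              # keep DINO-style features frozen
        self.text = text_encoder
        self.proj_v = nn.Linear(dim_v, dim)      # only the heads (and text side) train
        self.proj_t = nn.Linear(dim_t, dim)
        self.logit_scale = nn.Parameter(torch.tensor(2.659))  # log(1/0.07), CLIP-style init

    def forward(self, images, captions):
        with torch.no_grad():
            v = self.vision(images)              # (batch, dim_v) frozen features
        z_v = F.normalize(self.proj_v(v), dim=-1)
        z_t = F.normalize(self.proj_t(self.text(captions)), dim=-1)
        logits = self.logit_scale.exp() * z_v @ z_t.t()
        labels = torch.arange(len(logits), device=logits.device)
        # symmetric InfoNCE: match each image to its caption and vice versa
        return (F.cross_entropy(logits, labels) +
                F.cross_entropy(logits.t(), labels)) / 2
```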
Mingyuan Wu (@mingyuanwu4) 's Twitter Profile Photo

Research with amazing collaborators Jize Jiang, Meitang Li, and Jingcheng Yang, guided by great advisors and supported by the generous help of talented researchers Bowen Jin, Xingyu Fu ✈️ ICML25, and many open-source contributors (easyr1, verl, vllm, etc.).

Yung-Sung Chuang (@yungsungchuang) 's Twitter Profile Photo

Scaling CLIP on English-only data is outdated now…

🌍We built a CLIP data-curation pipeline for 300+ languages
🇬🇧We train MetaCLIP 2 without compromising English-task performance (it actually improves!)
🥳It's time to drop the language filter!

📝arxiv.org/abs/2507.22062

[1/5]

🧵
Huck Yang 🇸🇬 ICLR 2025 (@huckiyang) 's Twitter Profile Photo

NeKo (ネコ, Japanese for "cat") aims to be your pet model that works with ASR/AST/OCR. NVIDIA AI Developer Yenting Lin et al. Back in 2020, the conformer-transducer was dominant; people were not very interested in working on ASR-LM (i.e., the internal LM of ASR / contextual biasing were popular). Appreciate Andreas

Andrew Liao (@andrewliao11) 's Twitter Profile Photo

I was studying various GRPO variants here and there (DAPO, Dr. GRPO, GSPO, etc.), and this paper provides a holistic view of the PPO-GRPO family.
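
As a reference point for that family, a minimal sketch of GRPO's group-relative advantage, the piece most variants modify (e.g., Dr. GRPO drops the std normalization); shapes and the eps term are illustrative assumptions:

```python
import torch

def grpo_advantages(rewards: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Group-relative advantages as in GRPO.

    rewards: (num_prompts, group_size) scalar rewards for G completions
    sampled per prompt. Each reward is normalized within its own group:
        A_i = (r_i - mean(group)) / std(group)
    """
    mean = rewards.mean(dim=1, keepdim=True)
    std = rewards.std(dim=1, keepdim=True)
    return (rewards - mean) / (std + eps)
```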