Kunhao Zheng @ ICLR 2025 (@kunhaoz)'s Twitter Profile
Kunhao Zheng @ ICLR 2025

@kunhaoz

École Polytechnique X18, SJTU. Now in the amazing FAIR CodeGen @AIatMeta. Alumni: @Huggingface, Sea AI Lab, intern @openai

Joined: 22-01-2019 07:07:21

131 Tweets

538 Followers

529 Following

Delong Chen (陈德龙) (@delong0_0)'s Twitter Profile Photo

This is my first paper done at FAIR. We show that adaptive visual token segmentation, especially at the subobject level (i.e., subwords in images), enables VLMs to learn image understanding better and faster!
arxiv.org/pdf/2402.14327
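For intuition, here is a minimal sketch of what subobject-level visual tokens could look like (our illustration under assumed shapes, not the paper's implementation): per-pixel features from any image encoder are mean-pooled within each segment produced by any off-the-shelf segmenter, giving one token per subobject.

```python
# Hypothetical sketch: one visual token per subobject segment.
# `features` and `segment_ids` are assumed inputs, not the paper's API.
import torch

def subobject_tokens(features: torch.Tensor, segment_ids: torch.Tensor) -> torch.Tensor:
    """features: (H*W, D) per-pixel features; segment_ids: (H*W,) int segment labels.
    Returns (num_segments, D): the mean feature of each subobject."""
    num_segments = int(segment_ids.max().item()) + 1
    sums = torch.zeros(num_segments, features.shape[1]).index_add_(0, segment_ids, features)
    counts = torch.bincount(segment_ids, minlength=num_segments).clamp(min=1)
    return sums / counts.unsqueeze(1)

# Toy example: 16 "pixels" with 8-dim features grouped into 3 subobjects.
tokens = subobject_tokens(torch.randn(16, 8), torch.randint(0, 3, (16,)))
print(tokens.shape)  # torch.Size([3, 8])
```

The resulting variable-length token sequence would then stand in for fixed-grid patch tokens at the VLM input.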
Krunoslav Lehman Pavasovic (@krunolehman)'s Twitter Profile Photo

1/ Happy to share my first accepted paper as a PhD student at Meta and École normale supérieure | PSL, which I will present at ICLR 2025: 📚 Our work proposes difFOCI, a novel rank-based objective for ✨better feature learning✨. In collab with David Lopez-Paz, Giulio Biroli, and Levent Sagun!
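For context (our addition, hedged): FOCI-style objectives build on Chatterjee's rank correlation; a minimal NumPy version of that plain, non-differentiable statistic is below. difFOCI's differentiable relaxation of it is not shown here.

```python
# Chatterjee's rank correlation xi(x, y): the rank-based dependence
# measure underlying FOCI. difFOCI (per the tweet) turns such a
# rank-based quantity into a trainable objective; this is just the statistic.
import numpy as np

def chatterjee_xi(x: np.ndarray, y: np.ndarray) -> float:
    n = len(x)
    ranks = np.argsort(np.argsort(y[np.argsort(x)]))  # ranks of y, ordered by x
    return 1.0 - 3.0 * np.abs(np.diff(ranks)).sum() / (n * n - 1)

rng = np.random.default_rng(0)
x = rng.normal(size=2000)
print(chatterjee_xi(x, x ** 2))                 # functional dependence -> near 1
print(chatterjee_xi(x, rng.normal(size=2000)))  # independence -> near 0
```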

AI at Meta (@aiatmeta)'s Twitter Profile Photo

📷 Hello Singapore! Meta is at #ICLR2025 EXPO 📷
Meta will be in Singapore this week for #ICLR25! Stop by our booth to chat with our team or learn more about our latest research.

Things to know:
📷 Find us @ Booth #L03 (Rows 3-4, Columns L-M) in Hall 2.
📷 We're sharing 50+
Kunhao Zheng @ ICLR 2025 (@kunhaoz)'s Twitter Profile Photo

#ICLR2025 
Come say hi at our Fri 25 Apr poster session:

"What Makes Large Language Models Reason in (Multi-Turn) Code Generation?" 📝✨

📍 Hall 3 + Hall 2B #263
🕙 Fri 25 Apr, 10 a.m.–12:30 p.m. +08
link: iclr.cc/virtual/2025/p…
paper: arxiv.org/abs/2410.08105
Kunhao Zheng @ ICLR 2025 (@kunhaoz)'s Twitter Profile Photo

#ICLR2025 Come say hi at our Sat 26 Apr poster session:

"The KoLMogorov Test: Compression by Code Generation" 📝✨

📍 Hall 3 + Hall 2B #557
🕙 Sat 26 Apr, 10 a.m.–12:30 p.m. +08

repo: github.com/facebookresear…
paper: arxiv.org/abs/2503.13992…
Yunzhen Feng (@feeelix_feng)'s Twitter Profile Photo

Check out our poster tomorrow at 10 a.m. at the ICLR Bidirectional Human-AI Alignment workshop! We cover how on-policy preference sampling can be biased, and present our optimal response sampling for human labeling.
NYU Center for Data Science · AI at Meta · Julia Kempe · Yaqi Duan
x.com/feeelix_feng/s…
Kunhao Zheng @ ICLR 2025 (@kunhaoz)'s Twitter Profile Photo

🚨 Your RL only improves 𝗽𝗮𝘀𝘀@𝟭, not 𝗽𝗮𝘀𝘀@𝗸? 🚨

That’s not a bug — it’s a 𝗳𝗲𝗮𝘁𝘂𝗿𝗲 𝗼𝗳 𝘁𝗵𝗲 𝗼𝗯𝗷𝗲𝗰𝘁𝗶𝘃𝗲 you’re optimizing.

You get what you optimize for. If you want better pass@k, you need to optimize for pass@k at training time.

🧵 How?
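To make the thread's point concrete, here is a hedged sketch (function names ours): the standard unbiased pass@k estimator from the HumanEval paper, which credits a group of k samples if any of them passes, i.e., the quantity one would have to optimize at training time.

```python
# Unbiased pass@k estimator (Chen et al., 2021): n samples drawn, c correct.
# pass@1 is just the mean success rate; pass@k for k > 1 rewards a *set* of
# samples if any passes, so optimizing it rewards diverse coverage.
import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    if n - c < k:
        return 1.0  # every size-k subset must contain a correct sample
    return 1.0 - np.prod(1.0 - k / np.arange(n - c + 1, n + 1))

print(pass_at_k(n=10, c=3, k=1))  # 0.3: identical to mean accuracy
print(pass_at_k(n=10, c=3, k=5))  # ~0.92: coverage is what counts
```

An RL objective that averages per-sample rewards is exactly the k=1 case, which is how we read the thread's point: to move pass@k, a pass@k-style group reward has to enter the training objective itself.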
Kunhao Zheng @ ICLR 2025 (@kunhaoz)'s Twitter Profile Photo

❄️Andrew Zhao❄️ Yeah, we are doing it, and it's called Soft Policy Optimization: arxiv.org/abs/2503.05453. It can learn from arbitrary on-/off-policy samples. TL;DR: it reparametrizes the Q function via your LLM. An elegant property: the Bellman equation is satisfied by construction, so no separate TD loss is needed.
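A minimal numerical sketch of the "satisfied by construction" property as we read the tweet (not the paper's code): if you define Q(s, a) = V(s) + β·log π(a|s) using the LLM's log-probs, the soft-value consistency V(s) = β·logsumexp_a Q(s, a)/β holds identically, for any V and any π.

```python
# Hedged sketch: soft-Q reparametrization via the policy's log-probs.
# Because Q = V + beta * log_pi, the soft value recovered from Q equals
# V exactly, so no TD loss is needed to enforce this consistency.
import torch

beta = 0.7
logits = torch.randn(32)                     # LLM logits over 32 next tokens
log_pi = torch.log_softmax(logits, dim=-1)   # log pi(a | s)
V = torch.tensor(1.23)                       # any scalar value estimate

Q = V + beta * log_pi                        # reparametrized Q(s, .)
V_implied = beta * torch.logsumexp(Q / beta, dim=-1)
print(torch.allclose(V, V_implied))          # True, by construction
```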

Kunhao Zheng @ ICLR 2025 (@kunhaoz)'s Twitter Profile Photo

Let's be clear: if you use the Schulman k3 estimator, you're optimizing for something else entirely: the reverse KL. Funnily enough, people think the Schulman k2 estimator is biased, but it gives the right gradient.
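For readers without the reference (our addition): these are the k1/k2/k3 estimators from John Schulman's "Approximating KL Divergence" note, evaluated at samples x ~ q with r = p(x)/q(x). A toy check:

```python
# Schulman's KL estimators at samples x ~ q, with log r = log p(x) - log q(x):
#   k1 = -log r            unbiased for KL(q||p), high variance
#   k2 = 0.5 * (log r)^2   biased as a value, small bias when p ~ q
#   k3 = (r - 1) - log r   unbiased, low variance
# The tweet's point is about what each does as a *differentiable loss*,
# which is a separate question from its bias as a value estimate.
import torch

q = torch.distributions.Normal(0.0, 1.0)
p = torch.distributions.Normal(0.1, 1.0)   # true KL(q||p) = 0.005
x = q.sample((200_000,))
log_r = p.log_prob(x) - q.log_prob(x)

print((-log_r).mean())                     # k1 ~ 0.005
print((0.5 * log_r ** 2).mean())           # k2 ~ 0.005 (p and q are close)
print((log_r.exp() - 1.0 - log_r).mean())  # k3 ~ 0.005
```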

Mathurin Videau (@mathuvu_)'s Twitter Profile Photo

We present an Autoregressive U-Net that incorporates tokenization inside the model, pooling raw bytes into words then word-groups. AU-Net focuses most of its compute on building latent vectors that correspond to larger units of meaning.
Joint work with Badr Youbi Idrissi 1/8
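To illustrate the pooling idea (our sketch, not the AU-Net code): byte embeddings can be pooled into word-level vectors at whitespace boundaries; a real AU-Net would pool again into word-groups and upsample back, U-Net style.

```python
# Hypothetical sketch: pool raw-byte states into one vector per word.
import torch
import torch.nn as nn

text = b"bytes become words"
byte_ids = torch.tensor(list(text))            # (T,) byte values 0..255
h = nn.Embedding(256, 32)(byte_ids)            # (T, 32) byte-level states

# A new word starts at position 0 and after each space byte.
word_id = torch.cumsum(byte_ids == ord(" "), dim=0)   # (T,) word index per byte
num_words = int(word_id.max()) + 1

pooled = torch.zeros(num_words, 32).index_add_(0, word_id, h)
counts = torch.bincount(word_id, minlength=num_words).clamp(min=1)
words = pooled / counts.unsqueeze(1)           # (num_words, 32)
print(words.shape)                             # torch.Size([3, 32])
```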