Xingyu Fu (@xingyufu2)'s Twitter Profile
Xingyu Fu

@xingyufu2

PhD student @Penn @cogcomp. | Focused on Vision+Language | Previous: @MSFTResearch @AmazonScience. B.S. @UofIllinois | ⛳️😺

ID: 1305996908264075270

Link: https://zeyofu.github.io/
Joined: 15-09-2020 22:28:30

116 Tweets

879 Followers

514 Following

Fei Wang (@fwang_nlp)'s Twitter Profile Photo

𝗠𝘂𝗶𝗿𝗕𝗲𝗻𝗰𝗵 is officially accepted at #ICLR2025! 🎉 Recent VLMs/MLLMs such as LLaVA-OneVision, MM1.5, and MAmmoTH-VL have demonstrated significant progress on MuirBench.🚀 Excited to see how MuirBench continues to drive the innovation of VLMs! #AI #MachineLearning #VLM

Sheng Zhang (@sheng_zh)'s Twitter Profile Photo

MuirBench has been accepted to #ICLR2025! 🚀 Companies like Apple, TikTok, and Salesforce are already evaluating their LMMs on its multi-image setup—a robust testbed for multimodal reasoning. GenAI needs more benchmarks like this.🤯 Kudos to Fei Wang, Xingyu Fu ✈️ ICML25, and team! 👏
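For readers who want to run the same kind of multi-image evaluation, here is a minimal sketch of scoring a model on MuirBench-style multiple-choice items. The Hugging Face dataset id and field names below are assumptions, not taken from this thread.

```python
# Minimal sketch: accuracy of an arbitrary VLM on MuirBench-style items.
# The dataset id "MUIRBENCH/MUIRBENCH" and the field names are assumptions.
from datasets import load_dataset

def evaluate(answer_fn) -> float:
    """answer_fn(images, question, options) -> predicted option letter, e.g. 'A'."""
    ds = load_dataset("MUIRBENCH/MUIRBENCH", split="test")
    correct = 0
    for ex in ds:
        pred = answer_fn(ex["image_list"], ex["question"], ex["options"])
        correct += int(pred == ex["answer"])
    return correct / len(ds)
```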

Xiaodong Yu (@xiaodong_yu_126)'s Twitter Profile Photo

Check out our new paper on long-context understanding! We use AgenticLU to significantly improve the base model's long-context performance (+14.7% avg across several datasets) without scaling real inference time!

Yushi Hu (@huyushi98)'s Twitter Profile Photo

Excited to see the image reasoning in o3 and o4-mini!!🤩 We introduced this idea a year ago in Visual Sketchpad (visualsketchpad.github.io). Glad to see OpenAI baking this into their model through agentic RL. Great work! And yes, reasoning should be multimodal! Huge shoutout

Weijia Shi (@weijiashi2)'s Twitter Profile Photo

Our previous work showed that 𝐜𝐫𝐞𝐚𝐭𝐢𝐧𝐠 𝐯𝐢𝐬𝐮𝐚𝐥 𝐜𝐡𝐚𝐢𝐧‑𝐨𝐟‑𝐭𝐡𝐨𝐮𝐠𝐡𝐭𝐬 𝐯𝐢𝐚 𝐭𝐨𝐨𝐥 𝐮𝐬𝐞 significantly boosts GPT‑4o’s visual reasoning performance. Excited to see this idea incorporated into OpenAI’s o3 and o4‑mini models (openai.com/index/thinking…).
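As a rough sketch of what "visual chain-of-thought via tool use" means in practice: the model alternates between text reasoning and calls to drawing or cropping tools, and each rendered image is fed back as an observation. Everything below (the model interface, tool registry, and message format) is hypothetical scaffolding, not the Sketchpad API.

```python
# Hedged sketch of a visual chain-of-thought loop: the model interleaves text
# with tool calls (draw, crop, annotate, ...); each tool returns an image that
# is appended to the context. `model` and `tools` are hypothetical interfaces.
def visual_cot(model, tools, question, image, max_steps=8):
    history = [{"role": "user", "text": question, "image": image}]
    for _ in range(max_steps):
        step = model.generate(history)            # text plus an optional tool call
        history.append({"role": "assistant", "text": step.text})
        if step.tool_call is None:                # model is done sketching
            return step.text                      # final answer
        sketch = tools[step.tool_call.name](**step.tool_call.args)
        history.append({"role": "tool", "image": sketch})  # visual thought
    return history[-1]["text"]
```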

Yu Feng (@anniefeng6)'s Twitter Profile Photo

#ICLR2025 Oral

LLMs often struggle with reliable and consistent decisions under uncertainty 😵‍💫 — largely because they can't reliably estimate the probability of each choice.

We propose BIRD 🐦, a framework that significantly enhances LLM decision making under uncertainty.
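For context on the failure mode the tweet points at: the usual baseline is to read each option's token log-probability off the model and softmax-normalize, which is exactly the estimate BIRD argues is unreliable. A minimal sketch of that baseline only (the input dict is illustrative):

```python
# Naive choice-probability baseline: softmax over per-option log-probabilities.
# BIRD's point is that these raw estimates are often miscalibrated; this sketch
# shows the baseline it improves on, not BIRD itself.
import math

def choice_probabilities(option_logprobs: dict[str, float]) -> dict[str, float]:
    m = max(option_logprobs.values())                      # for numerical stability
    exp = {k: math.exp(v - m) for k, v in option_logprobs.items()}
    z = sum(exp.values())
    return {k: v / z for k, v in exp.items()}

print(choice_probabilities({"A": -1.2, "B": -0.4, "C": -3.0}))
```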
Sayak Paul (@risingsayak)'s Twitter Profile Photo

Embedding a scientific basis in pre-trained T2I models can enhance the realism and consistency of the results.

Cool work in "Science-T2I: Addressing Scientific Illusions in Image Synthesis"

jialuo-li.github.io/Science-T2I-We…
Jialuo Li (@jialuoli1007)'s Twitter Profile Photo

🚀 Introducing Science-T2I - Towards bridging the gap between AI imagination and scientific reality in image generation! [CVPR 2025]

📜 Paper: arxiv.org/abs/2504.13129
🌐 Project: jialuo-li.github.io/Science-T2I-Web
💻 Code: github.com/Jialuo-Li/Scie…
🤗 Dataset: huggingface.co/collections/Ji…
Lucas Beyer (bl16) (@giffmana)'s Twitter Profile Photo

This paper is interestingly thought-provoking for me. There is a chance that it's easier to "align t2i model with real physics" in post-training, and let it learn to generate whatever (physically implausible) combinations in pretraining. As opposed to trying hard to come up with

Fei Wang (@fwang_nlp)'s Twitter Profile Photo

🎉 Excited to share that our paper, "MuirBench: A Comprehensive Benchmark for Robust Multi-image Understanding", will be presented at #ICLR2025!
📅 Date: April 24
🕒 Time: 3:00 PM
📍 Location: Hall 3 + Hall 2B #11
MuirBench challenges multimodal LLMs with diverse multi-image
Xingyu Fu (@xingyufu2)'s Twitter Profile Photo

ReFocus 🔍 Visual reasoning for Tables and Charts with Edits

Happy to share ReFocus accepted at #ICML2025. We've open-sourced code and training data: zeyofu.github.io/ReFocus/

ReFocus enables multimodal LMs to better reason on Tables and Charts with visual edits. It also provides
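To make "visual edits" concrete, here is an illustrative sketch of the kind of edit such a pipeline can apply before re-querying the model: outlining the chart region or table column under discussion. The file name, coordinates, and colors are placeholders, and this is not the released ReFocus code.

```python
# Illustrative visual edit: outline the region of a chart/table the model
# should focus on, then feed the edited image back in. Placeholders throughout.
from PIL import Image, ImageDraw

def highlight_region(image_path: str, box: tuple[int, int, int, int]) -> Image.Image:
    """Return a copy of the image with `box` = (left, top, right, bottom) outlined."""
    img = Image.open(image_path).convert("RGB")
    ImageDraw.Draw(img).rectangle(box, outline="red", width=4)
    return img

highlight_region("chart.png", (120, 40, 360, 220)).save("chart_focused.png")
```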

Xingyu Fu (@xingyufu2)'s Twitter Profile Photo

😌 Been wanting to post since March but waited for the graduation photo… Thrilled to finally share that I'll be joining Princeton University as a postdoc at Princeton PLI this August!

Endless thanks to my incredible advisors and mentors from Penn, UW, Cornell, NYU, UCSB, USC,
Mingyuan Wu (@mingyuanwu4)'s Twitter Profile Photo

Research with amazing collaborators Jize Jiang, Meitang Li, and Jingcheng Yang, guided by great advisors and supported by the generous help of talented researchers Bowen Jin, Xingyu Fu ✈️ ICML25, and many open-source contributors (EasyR1, verl, vLLM, etc.).

Xiang Yue@ICLR2025🇸🇬 (@xiangyue96)'s Twitter Profile Photo

People are racing to push math reasoning performance in #LLMs—but have we really asked why? The common assumption is that improving math reasoning should transfer to broader capabilities in other domains. But is that actually true?

In our study (arxiv.org/pdf/2507.00432), we
Weijia Shi (@weijiashi2)'s Twitter Profile Photo

Can data owners & LM developers collaborate to build a strong shared model while each retains control of their data? Introducing FlexOlmo💪, a mixture-of-experts LM enabling:
• Flexible training on your local data without sharing it
• Flexible inference to opt your data in or out
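A rough sketch of what opt-in/opt-out inference can look like mechanically, assuming one expert per data owner in a mixture-of-experts layer: opted-out experts are masked from the router before top-k selection. Shapes, names, and the top-k choice are assumptions, not FlexOlmo's actual implementation.

```python
# Hedged sketch: masking opted-out owner experts from MoE routing at inference.
import torch

def route(router_logits: torch.Tensor, opted_in: torch.Tensor, k: int = 2):
    """router_logits: [tokens, n_experts]; opted_in: [n_experts] boolean mask."""
    masked = router_logits.masked_fill(~opted_in, float("-inf"))
    weights, experts = masked.topk(k, dim=-1)        # only opted-in experts survive
    return torch.softmax(weights, dim=-1), experts   # mixture weights, expert ids

logits = torch.randn(4, 8)                           # 4 tokens, 8 owner experts
mask = torch.ones(8, dtype=torch.bool); mask[3] = False   # owner 3 opts out
w, idx = route(logits, mask)                         # expert 3 is never selected
```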

Xingyu Fu (@xingyufu2)'s Twitter Profile Photo

I will be at #ICML2025 next week and present #ReFocus on Tuesday afternoon.
📍 West Exhibition Hall B2-B3 #W-202
⏱️ Tue 15 Jul, 4:30 p.m. - 7 p.m. PDT
Happy to chat and connect! Feel free to DM 😁
ReFocus link: huggingface.co/datasets/ReFoc…

Yong Lin (@yong18850571)'s Twitter Profile Photo

(1/4)🚨 Introducing Goedel-Prover V2 🚨
🔥🔥🔥 The strongest open-source theorem prover to date.
🥇 #1 on PutnamBench: Solves 64 problems—with far less compute.
🧠 New SOTA on MiniF2F:
* 32B model hits 90.4% at Pass@32, beating DeepSeek-Prover-V2-671B's 82.4%.
* 8B > 671B: Our 8B
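For readers unfamiliar with the metric: "Pass@32" is commonly computed with the unbiased estimator from Chen et al. (2021), pass@k = 1 - C(n-c, k)/C(n, k), for n samples per problem of which c succeed. A minimal sketch (the numbers in the example are illustrative, not Goedel-Prover's):

```python
# Unbiased pass@k estimator (Chen et al., 2021): probability that at least one
# of k samples drawn from n attempts (c of them correct) solves the problem.
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    if n - c < k:              # every size-k draw must contain a correct sample
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

print(pass_at_k(n=64, c=10, k=32))  # illustrative numbers, ≈0.9996
```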