Kaiwen Zhou's (@kaiwenzhou9) Twitter Profile
Kaiwen Zhou

@kaiwenzhou9

A CSE PhD student at @ucsc, working on multimodal AI agents and responsible AI. Previously: @Samsung_RA, @hri_usa. Looking for a summer 2025 internship.

ID: 1506422274307416069

Joined: 23-03-2022 00:07:29

80 Tweets

184 Followers

197 Following

Qianqi "Jackie" Yan (@qianqi_yan) 's Twitter Profile Photo

New Paper Alert: Multimodal Inconsistency Reasoning (MMIR)! ✨

Ever visited a webpage where the text says “IKEA desk” yet images and descriptions elsewhere show a totally different brand? Or read a slide that shows “50% growth” in the text but the accompanying chart looks flat?
Xin Eric Wang @ ICLR 2025 (@xwang_lk):

𝐁𝐞𝐚𝐭𝐢𝐧𝐠 𝐎𝐩𝐞𝐧𝐀𝐈 𝐢𝐬 𝐧𝐨𝐭 𝐚𝐬 𝐡𝐚𝐫𝐝 𝐚𝐬 𝐲𝐨𝐮 𝐭𝐡𝐢𝐧𝐤. If you don't believe you can compete, you've already lost. Winning starts with mindset.

🚀Introducing 𝑨𝒈𝒆𝒏𝒕 𝑺2, 𝐭𝐡𝐞 𝐰𝐨𝐫𝐥𝐝'𝐬 𝐛𝐞𝐬𝐭 𝐜𝐨𝐦𝐩𝐮𝐭𝐞𝐫-𝐮𝐬𝐞 𝐚𝐠𝐞𝐧𝐭, and the second
Kaiwen Zhou (@kaiwenzhou9):

Could not attend #ICLR2025 🥲. But Chengzhi Liu will present our Multimodal Situational Safety paper on April 25, 3:00-5:30 pm in Hall 3 + Hall 2B #538. Come check it out!

Yue Fan (@yfan_ucsc):

Before o3 impressed everyone with 🔥visual reasoning🔥, we already had faith in and were exploring models that can think with images. 🚀

Here’s our shot, GRIT: Grounded Reasoning with Images & Texts that trains MLLMs to think while performing visual grounding. It is done via RL
Chengzhi Liu (@liuchen02938149):

🧠 More Thinking, Less Seeing? 👀 Exploring the Balance Between Reasoning and Hallucination in Multimodal Reasoning Models!

Currently, many multimodal reasoning models, while striving for enhanced reasoning capabilities, often neglect the issue of visual hallucinations. While
Jing Gu (@jinggu4ai):

🚨 PhyWorldBench, New Paper Alert! 🚨

Video‑generation models are jaw‑dropping—they conjure gorgeous scenes in seconds. But can they truly simulate the real world, respecting (or intentionally bending) the laws of physics?

Introducing PhyWorldBench, the large‑scale benchmark I