Kai-Wei Chang (@kaiwei_chang) 's Twitter Profile
Kai-Wei Chang

@kaiwei_chang

Associate Professor @UCLAengineering/@UCLA. Area: #NLProc/#ML/#AI kwchang.net

ID: 716744871403528192

Link: http://kwchang.net Joined: 03-04-2016 21:51:08

849 Tweets

7.7K Followers

712 Following

Da Yin (@wade_yin9712) 's Twitter Profile Photo

🧐Deepseek-R1 is promising, but its long CoT lacks the grounding to interactive agent environments and generation efficiency. 🥳Our QLASS provides a more grounded PRM-based method that uses learned Q-value to quickly adapt to the new environments, and supports more efficient

Yonatan Bitton (@yonatanbitton) 's Twitter Profile Photo

Check out VideoPhy-2—the next edition in our series testing physical commonsense in generated videos! 🌎🤖 Can your models accurately simulate real-world physics? Challenge them here: videophy2.github.io Excited to co-lead this with Hritik Bansal & Clark Peng 🚀

Yihe Deng (@yihe__deng) 's Twitter Profile Photo

🚀Excited to share our latest work: OpenVLThinker, an exploration into enhancing vision-language models with R1 reasoning capabilities. By iterative integration of SFT and RL, we enabled LVLMs to exhibit robust R1 reasoning behavior. As a result, OpenVLThinker achieves a 70.2%

Hritik Bansal (@hbxnov) 's Twitter Profile Photo

📢Scaling test-time compute via generative verification (GenRM) is an emerging paradigm and shown to be more efficient than self-consistency (SC) for reasoning. But, such claims are misleading☠️ Our compute-matched analysis shows that SC outperforms GenRM across most budgets! 🧵

uclanlp (@uclanlp) 's Twitter Profile Photo

🚨 New NLP seminar series alert! 🚨 Check out UCLA NLP Seminar series featuring cutting-edge talks from top researchers in NLP and related areas. Great lineup, timely topics, and open to all (zoom)! 🧠💬 📅 Schedule + details: uclanlp.github.io/nlp-seminar/

Salman (@salman1422571) 's Twitter Profile Photo

🚨 Excited to share our new paper on 𝕏-Teaming! 🤖 Multiagent system for multiturn jailbreaking 🔍 96.2% attack success against Claude 3.7 (immune to single-turn attacks!) 💥 Up to 98.1% attack success on leading model 🛡️ Released 30K safety dataset 🧵below #AI #LLMSafety

Di Wu (@diwu0162) 's Twitter Profile Photo

Attending NAACL to present BRIEF (Friday 11am, hall 3) and Self-Routing RAG (KnowledgeNLP Workshop). Looking forward to meeting new and old friends!

Yu (Bryan) Zhou (@yu_bryan_zhou) 's Twitter Profile Photo

#GPT4o image generation brings synthetic visual data quality to the next level. 🖼️ 🤔Is synthetic visual data finally ready to be used for improving VLMs? 🚀 We show success with CoDA, using contrastive visual data augmentation to help teach VLMs novel and confusing concepts.

trustnlp (@trustnlp) 's Twitter Profile Photo

Welcome to the TrustNLP workshop at NAACL25. We are in Rm 215 - San Miguel. Schedule can be found here: trustnlpworkshop.github.io Virtual site access: underline.io/events/492/ses…

uclanlp (@uclanlp) 's Twitter Profile Photo

For this week’s NLP Seminar, we are thrilled to host Emma Pierson (@2plus2make5) to give a talk titled Using New Data to Answer Old Questions! When: 5/16 Fri 2pm PT Registration: forms.gle/9sNYv2isfcqYQC…

Haoyi Qiu (@haoyiqiu) 's Twitter Profile Photo

🌏How culturally safe are large vision-language models? 👉LVLMs often miss the mark. We introduce CROSS, a benchmark of 1,284 image-query pairs across 16 countries & 14 languages, revealing how LVLMs violate cultural norms in context. ⚖️ Evaluation via CROSS-EVAL 🧨 Safety

Di Wu (@diwu0162) 's Twitter Profile Photo

Visualized Text-to-Image Retrieval Text-to-image retrieval by imagining the text query in the image space. Website: xiaowu0162.github.io/visret/ (1/N)

Wenbo Hu@ICLR🇸🇬 (@gordonhu608) 's Twitter Profile Photo

🤔How to maintain a long-term memory for a 3D embodied AI agent across dynamic spatial-temporal environment changes in complex tasks? 🚀Introducing 3DLLM-Mem, a memory-enhanced 3D embodied agent that incrementally builds and maintains a task-relevant long-term memory while it

Yonatan Bitton (@yonatanbitton) 's Twitter Profile Photo

Heading to CVPR 2025 in Nashville next week 🎤 and the Google offices in Mountain View the week after 🌉Always happy to connect over multimodal research 🤝 Also, come hear Yining Hong present our work 3D-LLM at the FMEA workshop (Wed 9:55am) 🔗 arxiv.org/abs/2505.22657

Rahul Gupta (@rahul1987iit) 's Twitter Profile Photo

We're hiring interns in the Nova RAI team to work on cutting-edge responsible AI research! 🤖🧠 Looking for PhD students with a strong publication record. Position is in Boston. DM me if interested! Tagging for reach: Kai-Wei Chang, Aram Galstyan #responsibleAI, #amazonNova

Tanmay Parekh (@tparekh97) 's Twitter Profile Photo

🚨 New work: LLMs still struggle at Event Detection due to poor long-context reasoning and inability to follow task constraints, causing precision and recall errors. We introduce DiCoRe — a lightweight 3-stage Divergent-Convergent reasoning framework to fix this.🧵📷 (1/N)
