UIUC NLP (@uiuc_nlp)'s Twitter Profile
UIUC NLP

@uiuc_nlp

Natural Language Processing research group at The University of Illinois Urbana-Champaign @IllinoisCS @UofIllinois

ID: 1149765468930150402

Joined: 12-07-2019 19:40:24

515 Tweets

1.1K Followers

137 Following

Heng Ji (@hengjinlp)'s Twitter Profile Photo

Accepted as NeurIPS2025 Spotlight! Existing large multimodal models (LMMs) have very poor visual understanding and reasoning over part-level attributes and affordances (only 5.9% gIoU). We developed novel part-centric LMMs to address these challenges arxiv.org/pdf/2505.20759

Cheng Qian (@qiancheng1231)'s Twitter Profile Photo

🚀 Introducing UserRL: a new framework to train agents that truly assist users through proactive interaction, not just chase static benchmark scores.

 📄 Paper: arxiv.org/pdf/2509.19736
 💻 Code: github.com/SalesforceAIRe…
Alexi Gladstone (@alexiglad)'s Twitter Profile Photo

*Human-Like* Creativity is perhaps the most out-of-reach task for modern LLMs. I'm super excited to share our new work evaluating LLMs with a creativity framework! We develop a synthetic creativity task to measure LLMs' capabilities in generating novel, creative combinations,

EMNLP 2025 (@emnlpmeeting)'s Twitter Profile Photo

Please note that #EMNLP2025 volunteer notifications have been sent. If you haven’t received yours, please check your spam folder or contact the chairs at [email protected], as some email addresses were entered incorrectly in the form.

Bowen Jin (@bowenjin13)'s Twitter Profile Photo

Very excited to see Tinker released by Thinking Machines! Even more thrilled that Search-R1 is featured as the tool-use application in Tinker’s recipe 👇 🔗 github.com/thinking-machi… When we first built Search-R1, we opened up everything—data, recipes, models, code, logs—and kept

Kung-Hsiang Steeve Huang (@steeve__huang)'s Twitter Profile Photo

Ever felt like your GUI agents are dragging their feet? 🧐The culprit? Crunching through endless streams of screenshots, especially in those marathon long-horizon tasks. 

Thrilled to unveil ⭐️ GUI-KV ⭐️— our plug-and-play powerhouse that taps into the spatial saliency within
Se-woong (Sam) Lee (@sewoong_sam_lee)'s Twitter Profile Photo

🚨 New paper alert at COLM 2025! 🚨
An interesting open problem for those into Sparse Autoencoders (SAEs):
"Top-K activation constrains L0 (the number of non-zeros), but how do we obtain E[L0]?"

This was even the very first limitation noted in Leo Gao’s recent paper. (1/8)
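The Top-K question above can be made concrete with a minimal NumPy sketch (my own illustration, not code from the paper; the encoder shapes and the ReLU-then-Top-K ordering are assumptions): the Top-K step guarantees at most k non-zero latents, so L0 ≤ k by construction, and when a ReLU is in the mix the realized L0 can fall below k, which is what makes E[L0] non-trivial to pin down.

```python
import numpy as np

def topk_sae_encode(x, W_enc, b_enc, k):
    """Encode x with a sparse autoencoder using a Top-K activation (assumed variant)."""
    pre = np.maximum(x @ W_enc + b_enc, 0.0)  # ReLU pre-activations
    z = np.zeros_like(pre)
    idx = np.argsort(pre)[-k:]  # indices of the k largest values
    z[idx] = pre[idx]           # keep only those; every other latent stays zero
    return z

rng = np.random.default_rng(0)
x = rng.normal(size=16)
W_enc = rng.normal(size=(16, 64))
b_enc = np.zeros(64)
z = topk_sae_encode(x, W_enc, b_enc, k=8)
print(np.count_nonzero(z))  # L0: at most 8 by construction
```

If any of the k kept pre-activations were zeroed by the ReLU, the count drops below 8, so the constraint is an upper bound rather than an exact L0.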
Prashant Jayannavar (@p_jayannavar)'s Twitter Profile Photo

🚨 New preprint out! 🚨 In "BAP v2: An Enhanced Task Framework for Instruction Following in Minecraft Dialogues", we work towards a core AI challenge: how can agents follow complex, conversational instructions in a dynamic 3D world? To this end, we introduce an enhanced task

Yiling Lou (@yiling__lou)'s Twitter Profile Photo

Thrilled to announce that I'll be joining UIUC CS Siebel School of Computing and Data Science as an Assistant Professor in Spring 2026! 📢 I’m looking for Fall '26 PhD students who are interested in the intersection of Software Engineering and AI, especially in LLM4Code and Code Agents. Please drop me an

Shizhe Diao (@shizhediao)'s Twitter Profile Photo

🚀 Introducing BroRL: Scaling Reinforcement Learning via Broadened Exploration

When step-scaling hits a plateau, scale rollouts, not steps.
BroRL takes reinforcement learning beyond saturation—reviving stalled models by expanding exploration with large-N rollouts.
👇 (1/n)
Zhenhailong Wang (@zhenhailongw)'s Twitter Profile Photo

Multimodal conversational agents struggle to follow complex policies, which also impose a fixed computational cost.
We ask:
👉 How can we achieve stronger policy-following behavior without having to include policies in-context?
🌐: mikewangwzhl.github.io/TriMPI/ 🧵1/3
Manling Li (@manlingli_)'s Twitter Profile Photo

World Model Reasoning for VLM Agents (NeurIPS 2025, Score 5544)

We release VAGEN to teach VLMs to build internal world models via visual state reasoning:
- StateEstimation: what is the current state?
- TransitionModeling: what is next?

MDP → POMDP shift to handle the partial
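The two reasoning steps named above can be illustrated with a toy sketch (entirely my own illustration, not VAGEN's code; every function name and the 1-D world are hypothetical): the agent first estimates the current state from a partial observation, then predicts the next state for each candidate action before committing to one.

```python
# Toy POMDP-style decision loop: StateEstimation then TransitionModeling.

def estimate_state(observation):
    # StateEstimation: the agent sees only a partial observation; here the
    # "estimate" is a trivial read of the last observed 1-D position.
    return observation["last_seen_pos"]

def model_transition(state, action):
    # TransitionModeling: predict the next state before acting.
    return state + (1 if action == "right" else -1)

def choose_action(observation, goal):
    state = estimate_state(observation)
    for action in ("left", "right"):
        if model_transition(state, action) == goal:
            return action
    return "right"  # default: move toward larger positions

obs = {"last_seen_pos": 2}
print(choose_action(obs, goal=3))  # "right"
print(choose_action(obs, goal=1))  # "left"
```

The point of the split is that acting is grounded in an explicit belief about state rather than raw pixels, which is the MDP → POMDP shift the tweet describes.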

Diyi Yang (@diyi_yang)'s Twitter Profile Photo

Zora Wang's research compares AI agents vs humans across real work tasks (data analysis, engineering, design, writing). Key findings:

👉Agents are 88% faster & 90-96% cheaper
👉BUT produce lower quality work, often fabricate data to mask limitations
👉Agents code everything,
Haoyi Qiu (@haoyiqiu)'s Twitter Profile Photo

🤖💬AI agents can be easily persuaded (like Anthropic’s Claudius often giving discounts).

🤔Previous studies on persuasion have focused exclusively on the text-only modality. We wonder: are AI agents more susceptible when presented with multimodal content?

Introducing MMPersuade, a
Heng Ji (@hengjinlp)'s Twitter Profile Photo

Many of my students cannot attend EMNLP in person due to visa problems, but the super rising star Cheng Qian will be there presenting multiple papers. Please drop by our posters and talk to him!

Alexi Gladstone (@alexiglad)'s Twitter Profile Photo

What if your policy could reason and think dynamically, especially about uncertainty, enabling better real-world behavior?

⚡️Introducing EBT-Policy, an instantiation of Energy-Based Transformers for Policies!
TLDR:
- EBT-Policy broadly outperforms Diffusion Policy in both
Hyeonjeong Ha (@hyeonjeong_ai)'s Twitter Profile Photo

🏠🤖 Are our household robotic agents actually safe? BEAT introduces the first visual backdoor attack on MLLM-based embodied agents, where a single object (e.g., 🔪 or 🏺) can silently flip a home robot from normal behavior into harmful multi-step actions. 🚨Check out our work!

Se-woong (Sam) Lee (@sewoong_sam_lee)'s Twitter Profile Photo

Had a great time at the UIUC NLP large group on “Sparse Autoencoders: Discoveries, Limitations, and the Bridge between Experiment and Theory”

Thanks to Prof. Heng Ji and Jeonghwan Kim for kindly hosting, and to everyone for the engaging discussion!

🔗arxiv.org/pdf/2503.24277