UIUC NLP (@uiuc_nlp)'s Twitter Profile
UIUC NLP

@uiuc_nlp

Natural Language Processing research group at The University of Illinois Urbana-Champaign @IllinoisCS @UofIllinois

ID: 1149765468930150402

Joined: 12-07-2019 19:40:24

515 Tweets

1.1K Followers

137 Following

Heng Ji (@hengjinlp)'s Twitter Profile Photo

Accepted as NeurIPS2025 Spotlight! Existing large multimodal models (LMMs) have very poor visual understanding and reasoning over part-level attributes and affordances (only 5.9% gIoU). We developed novel part-centric LMMs to address these challenges arxiv.org/pdf/2505.20759

Cheng Qian (@qiancheng1231)'s Twitter Profile Photo

🚀 Introducing UserRL: a new framework to train agents that truly assist users through proactive interaction, not just chase static benchmark scores.

 📄 Paper: arxiv.org/pdf/2509.19736
 💻 Code: github.com/SalesforceAIRe…
Alexi Gladstone (@alexiglad)'s Twitter Profile Photo

*Human-Like* Creativity is perhaps the most out-of-reach task for modern LLMs. I'm super excited to share our new work evaluating LLMs with a creativity framework! We develop a synthetic creativity task to measure LLMs' capabilities in generating novel, creative combinations,

EMNLP 2025 (@emnlpmeeting)'s Twitter Profile Photo

Please note that #EMNLP2025 volunteer notifications have been sent. If you haven’t received yours, please check your spam folder or contact the chairs at [email protected], as some email addresses were entered incorrectly in the form.

Bowen Jin (@bowenjin13)'s Twitter Profile Photo

Very excited to see Tinker released by Thinking Machines! Even more thrilled that Search-R1 is featured as the tool-use application in Tinker’s recipe 👇 🔗 github.com/thinking-machi… When we first built Search-R1, we opened up everything—data, recipes, models, code, logs—and kept

Kung-Hsiang Steeve Huang (@steeve__huang)'s Twitter Profile Photo

Ever felt like your GUI agents are dragging their feet? 🧐The culprit? Crunching through endless streams of screenshots, especially in those marathon long-horizon tasks. 

Thrilled to unveil ⭐️ GUI-KV ⭐️— our plug-and-play powerhouse that taps into the spatial saliency within
Se-woong (Sam) Lee (@sewoong_sam_lee)'s Twitter Profile Photo

🚨 New paper alert at COLM 2025! 🚨
An interesting open problem for those into Sparse Autoencoders (SAEs):
"Top-K activation constrains L0 (the number of non-zeros), but how do we obtain E[L0]?"

This was even the very first limitation noted in Leo Gao’s recent paper. (1/8)
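The Top-K question above can be made concrete with a minimal NumPy sketch (my own illustration, not code from the paper; the encoder shapes and the ReLU-then-Top-K ordering are assumptions): the Top-K step guarantees at most k non-zero latents, so L0 ≤ k by construction, and when a ReLU is in the mix the realized L0 can fall below k, which is what makes E[L0] non-trivial to pin down.

```python
import numpy as np

def topk_sae_encode(x, W_enc, b_enc, k):
    """Encode x with a sparse autoencoder using a Top-K activation (assumed variant)."""
    pre = np.maximum(x @ W_enc + b_enc, 0.0)  # ReLU pre-activations
    z = np.zeros_like(pre)
    idx = np.argsort(pre)[-k:]  # indices of the k largest values
    z[idx] = pre[idx]           # keep only those; every other latent stays zero
    return z

rng = np.random.default_rng(0)
x = rng.normal(size=16)
W_enc = rng.normal(size=(16, 64))
b_enc = np.zeros(64)
z = topk_sae_encode(x, W_enc, b_enc, k=8)
print(np.count_nonzero(z))  # L0: at most 8 by construction
```

If any of the k kept pre-activations were zeroed by the ReLU, the count drops below 8, so the constraint is an upper bound rather than an exact L0.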
Prashant Jayannavar (@p_jayannavar)'s Twitter Profile Photo

🚨 New preprint out! 🚨 In "BAP v2: An Enhanced Task Framework for Instruction Following in Minecraft Dialogues", we work towards a core AI challenge: how can agents follow complex, conversational instructions in a dynamic 3D world? To this end, we introduce an enhanced task

Yiling Lou (@yiling__lou)'s Twitter Profile Photo

Thrilled to announce that I'll be joining UIUC CS Siebel School of Computing and Data Science as an Assistant Professor in Spring 2026! 📢 I’m looking for Fall '26 PhD students who are interested in the intersection of Software Engineering and AI, especially in LLM4Code and Code Agents. Please drop me an

Shizhe Diao (@shizhediao)'s Twitter Profile Photo

🚀 Introducing BroRL: Scaling Reinforcement Learning via Broadened Exploration

When step-scaling hits a plateau, scale rollouts, not steps.
BroRL takes reinforcement learning beyond saturation—reviving stalled models by expanding exploration with large-N rollouts.
👇 (1/n)
Zhenhailong Wang (@zhenhailongw)'s Twitter Profile Photo

Multimodal conversational agents struggle to follow complex policies, which also impose a fixed computational cost.
We ask:
👉 How can we achieve stronger policy-following behavior without having to include policies in-context?
🌐: mikewangwzhl.github.io/TriMPI/ 🧵1/3
Manling Li (@manlingli_)'s Twitter Profile Photo

World Model Reasoning for VLM Agents (NeurIPS 2025, Score 5544)

We release VAGEN to teach VLMs to build internal world models via visual state reasoning:
- StateEstimation: what is the current state?
- TransitionModeling: what is next?

MDP → POMDP shift to handle the partial
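The two reasoning steps named above can be illustrated with a toy sketch (entirely my own illustration, not VAGEN's code; every function name and the 1-D world are hypothetical): the agent first estimates the current state from a partial observation, then predicts the next state for each candidate action before committing to one.

```python
# Toy POMDP-style decision loop: StateEstimation then TransitionModeling.

def estimate_state(observation):
    # StateEstimation: the agent sees only a partial observation; here the
    # "estimate" is a trivial read of the last observed 1-D position.
    return observation["last_seen_pos"]

def model_transition(state, action):
    # TransitionModeling: predict the next state before acting.
    return state + (1 if action == "right" else -1)

def choose_action(observation, goal):
    state = estimate_state(observation)
    for action in ("left", "right"):
        if model_transition(state, action) == goal:
            return action
    return "right"  # default: move toward larger positions

obs = {"last_seen_pos": 2}
print(choose_action(obs, goal=3))  # "right"
print(choose_action(obs, goal=1))  # "left"
```

The point of the split is that acting is grounded in an explicit belief about state rather than raw pixels, which is the MDP → POMDP shift the tweet describes.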

Diyi Yang (@diyi_yang)'s Twitter Profile Photo

Zora Wang's research compares AI agents vs humans across real work tasks (data analysis, engineering, design, writing). Key findings:

👉Agents are 88% faster & 90-96% cheaper
👉BUT produce lower quality work, often fabricate data to mask limitations
👉Agents code everything,
Haoyi Qiu (@haoyiqiu)'s Twitter Profile Photo

🤖💬AI agents can be easily persuaded (like Anthropic’s Claudius often giving discounts).

🤔Previous studies on persuasion have focused exclusively on the text-only modality. We wonder: are AI agents more susceptible when presented with multimodal content?

Introducing MMPersuade, a
Heng Ji (@hengjinlp)'s Twitter Profile Photo

Many of my students cannot attend EMNLP in person due to visa problems, but the super rising star Cheng Qian will be there presenting multiple papers. Please drop by our posters and talk to him!

Alexi Gladstone (@alexiglad)'s Twitter Profile Photo

What if your policy could reason and think dynamically, especially about uncertainty, enabling better real-world behavior?

⚡️Introducing EBT-Policy, an instantiation of Energy-Based Transformers for Policies!
TLDR:
- EBT-Policy broadly outperforms Diffusion Policy in both
Hyeonjeong Ha (@hyeonjeong_ai)'s Twitter Profile Photo

🏠🤖 Are our household robotic agents actually safe? BEAT introduces the first visual backdoor attack on MLLM-based embodied agents, where a single object (e.g., 🔪 or 🏺) can silently flip a home robot from normal behavior into harmful multi-step actions. 🚨Check out our work!

Se-woong (Sam) Lee (@sewoong_sam_lee)'s Twitter Profile Photo

Had a great time at the UIUC NLP large group on “Sparse Autoencoders: Discoveries, Limitations, and the Bridge between Experiment and Theory”

Thanks to Prof. Heng Ji and Jeonghwan Kim for kindly hosting, and to everyone for the engaging discussion!

🔗arxiv.org/pdf/2503.24277