Hyeonjeong Ha (@hyeonjeong_ai)'s Twitter Profile
Hyeonjeong Ha

@hyeonjeong_ai

Ph.D. student @IllinoisCS @UIUC_NLP | Previously @KAIST @kaist_ai

ID: 1637364968004923392

Link: https://hyeonjeongha.github.io/
Joined: 19-03-2023 08:06:53

97 Tweets

300 Followers

336 Following

Chi Han (@glaciohound):

Welcome to my #AAAI2025 Tutorial, "The Quest for A Science of LMs," today!
Time: Feb 26, 2pm-3:45pm
Location: Room 113A, Pennsylvania Convention Center
Website: glaciohound.github.io/Science-of-LLM…
Underline: underline.io/events/487/sch…

Jeonghwan Kim (@masterjeongk):

SearchDet was accepted to #CVPR2025 🎉 We retrieve images from the Web and generate heatmaps through simple feature subtraction to improve long-tail object detection 👁
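The tweet doesn't show SearchDet's actual API, but the "heatmaps through simple feature subtraction" idea can be sketched: subtract the mean embedding of background patches from the mean embedding of Web-retrieved exemplars, then correlate that prototype with the image's dense feature map. A minimal sketch; all function names and tensor shapes below are my assumptions, not the paper's code.

```python
import torch
import torch.nn.functional as F

def subtraction_heatmap(exemplar_feats: torch.Tensor,    # (K, D) Web-retrieved positives
                        background_feats: torch.Tensor,  # (M, D) background patch features
                        image_feat_map: torch.Tensor     # (D, H, W) dense image features
                        ) -> torch.Tensor:
    """Return an (H, W) heatmap: similarity of each spatial location to the
    background-subtracted class prototype (hypothetical formulation)."""
    prototype = exemplar_feats.mean(0) - background_feats.mean(0)  # feature subtraction
    prototype = F.normalize(prototype, dim=0)
    fmap = F.normalize(image_feat_map, dim=0)       # unit-normalize each location along D
    heatmap = torch.einsum("d,dhw->hw", prototype, fmap)
    return heatmap.clamp(min=0)                     # keep only positive evidence
```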

Wenhu Chen (@wenhuchen):

We have made huge progress in language model reasoning, but our progress in multimodal reasoning (like MMMU) is very limited.
Why? It's due to the lack of diverse, difficult, and high-quality multimodal reasoning datasets!

🚀 New Paper Alert! 📢
We introduce VisualWebInstruct,
Zhenhailong Wang (@zhenhailongw):

Why allocate the same number of visual tokens to a blank image and a complex landscape? Introducing DyMU: a training-free algorithm that makes any ViT visual encoder dynamic-length and plug-and-play with downstream VLMs. 🚀
🔗 Project Page: mikewangwzhl.github.io/dymu/
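DyMU's actual algorithm isn't given in the tweet; below is only a rough sketch of the general idea of content-dependent token counts, using a ToMe-style bipartite merge with a similarity threshold so that redundant patches (a blank image) collapse into few tokens while detailed content keeps many. Everything here is an assumption, not DyMU's code.

```python
import torch
import torch.nn.functional as F

def merge_similar_tokens(tokens: torch.Tensor, threshold: float = 0.9) -> torch.Tensor:
    """Greedily merge near-duplicate ViT tokens so output length depends on
    image content. tokens: (N, D) patch embeddings from one encoder layer."""
    if tokens.shape[0] < 2:
        return tokens
    a, b = tokens[0::2], tokens[1::2]                        # bipartite split, ToMe-style
    sim = F.normalize(a, dim=-1) @ F.normalize(b, dim=-1).T  # (|A|, |B|) cosine similarities
    best_sim, best_idx = sim.max(dim=-1)                     # best partner in B per A token
    keep_mask = best_sim < threshold                         # A tokens too distinct to merge
    merged_b = b.clone()
    for i in torch.nonzero(~keep_mask).flatten():            # fold redundant A tokens into B
        j = best_idx[i]
        merged_b[j] = (merged_b[j] + a[i]) / 2               # running average, fine for a sketch
    return torch.cat([a[keep_mask], merged_b], dim=0)        # variable-length token sequence
```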
Vercept (@vercept_ai):

Today we're excited to introduce Vy, our AI that sees and acts on your computer. At Vercept, our mission is to reinvent how humans use computers–enabling you to accomplish orders of magnitude more than what you can do today. Vy is a first glimpse at AI that sees and uses your

Hyeonjeong Ha (@hyeonjeong_ai):

🚀 Computational persuasion of LLMs can be a game-changer—dive into our new survey to explore the taxonomy, spot the risks, and investigate further challenges in persuasive LLMs!

Salesforce AI Research (@sfresearch):

We're thrilled to announce BLIP3-o, a breakthrough in unified multimodal models that excels at both image understanding and generation in a single autoregressive architecture! 💫

📊 Paper: bit.ly/3Saybpo
🤗 Models: bit.ly/4jhFaYM
🧠 Code:
Yangyi Chen (on job market) (@yangyichen6666):

🐂🍺Introducing our recent preprint: Prioritizing Image-Related Tokens Enhances Vision-Language Pre-Training! 

We present PRIOR, a simple vision-language pre-training algorithm that addresses the challenge of irrelevant textual content in image-caption pairs. PRIOR enhances
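The tweet cuts off before the details, so the snippet below is only my guess at the shape of such an objective: reweight the per-token caption loss toward tokens that a text-only reference LM finds surprising, on the intuition that those are the image-related ones. The weighting scheme and names are assumptions, not PRIOR's actual recipe.

```python
import torch
import torch.nn.functional as F

def prior_style_loss(vlm_logits: torch.Tensor,       # (T, V) vision-language model logits
                     ref_text_logits: torch.Tensor,  # (T, V) text-only reference LM logits
                     targets: torch.Tensor,          # (T,) caption token ids
                     alpha: float = 1.0) -> torch.Tensor:
    """Next-token loss reweighted toward tokens a text-only LM predicts poorly,
    i.e. the tokens the image presumably has to explain (hypothetical scheme)."""
    per_tok = F.cross_entropy(vlm_logits, targets, reduction="none")        # (T,)
    with torch.no_grad():
        ref_nll = F.cross_entropy(ref_text_logits, targets, reduction="none")
        weights = (alpha * ref_nll).softmax(dim=0) * targets.numel()        # mean weight ~1
    return (weights * per_tok).mean()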
Heng Ji (@hengjinlp):

We are extremely excited to announce mCLM, a Modular Chemical Language Model that is friendly to automatable block-based chemistry and mimics bilingual speakers by “code-switching” between functional molecular modules and natural language descriptions of the functions. 1/2

Yi Xu (@_yixu):

🚀Let’s Think Only with Images.

No language and No verbal thought.🤔 

Let’s think through a sequence of images💭, like how humans picture steps in their minds🎨. 

We propose Visual Planning, a novel reasoning paradigm that enables models to reason purely through images.
Hyeonjeong Ha (@hyeonjeong_ai):

Thrilled to share that our paper has been accepted to #ACL2025 Main 🇦🇹

Huge thanks to my amazing collaborators and my advisor Heng Ji 🙃
📄 arxiv.org/abs/2502.17793

Happy to chat about our work as well as MLLM research projects 🙌
Stella Li (@stellalisy):

🤯 We cracked RLVR with... Random Rewards?!
Training Qwen2.5-Math-7B with our Spurious Rewards improved MATH-500 by:
- Random rewards: +21%
- Incorrect rewards: +25%
- (FYI) Ground-truth rewards: +28.8%
How could this even work⁉️ Here's why: 🧵
Blogpost: tinyurl.com/spurious-rewar…
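For concreteness, this is what a "spurious" reward means operationally in an RLVR loop: a signal that ignores correctness entirely. A toy sketch; the paper's exact training setup is not shown in the thread.

```python
import random

def random_reward(completion: str) -> float:
    """Spurious reward: a coin flip, independent of the model's answer."""
    return float(random.random() < 0.5)

def incorrect_reward(completion: str, wrong_label: str) -> float:
    """Spurious reward: fires only when the completion matches a known-wrong label."""
    return float(wrong_label in completion)

# Either function can replace a ground-truth verifier in an RLVR/GRPO-style
# trainer; per the thread, both still improve Qwen2.5-Math-7B on MATH-500.
```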
Cheng Qian (@qiancheng1231):

📢 New Paper Drop: From Solving to Modeling!
LLMs can solve math problems — but can they model the real world? 🌍

📄 arXiv: arxiv.org/pdf/2505.15068
💻 Code: github.com/qiancheng0/Mod…

Introducing ModelingAgent, a breakthrough system for real-world mathematical modeling with LLMs.
May Fung (@may_f1_):

🧠 How can AI evolve from statically 𝘵𝘩𝘪𝘯𝘬𝘪𝘯𝘨 𝘢𝘣𝘰𝘶𝘵 𝘪𝘮𝘢𝘨𝘦𝘴 → dynamically 𝘵𝘩𝘪𝘯𝘬𝘪𝘯𝘨 𝘸𝘪𝘵𝘩 𝘪𝘮𝘢𝘨𝘦𝘴 as cognitive workspaces, similar to the human mental sketchpad?
🔍 What’s the 𝗿𝗲𝘀𝗲𝗮𝗿𝗰𝗵 𝗿𝗼𝗮𝗱𝗺𝗮𝗽 from tool-use → programmatic
Hyeonjeong Ha (@hyeonjeong_ai):

Excited to share our work on Energy-Based Transformers, led by my amazing labmate Alexi Gladstone—a new frontier in unlocking generalized reasoning across modalities without rewards. Grateful to be part of this journey! ⚡️
🧠 Think longer. Verify better. Generalize further.
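A minimal sketch of the energy-based inference idea (thinking longer = more refinement steps; the energy itself acts as a verifier): start from a candidate prediction and descend the learned energy E(x, y) by gradient steps. `energy_fn`, the step count, and the learning rate are illustrative assumptions, not the paper's configuration.

```python
import torch

def ebt_refine(energy_fn, x, y_init, steps: int = 8, lr: float = 0.1):
    """Refine a candidate prediction y by gradient descent on the learned
    energy E(x, y). More steps ~ 'thinking longer'; the final energy value
    doubles as a verification score (lower = more compatible with x)."""
    y = y_init.clone().requires_grad_(True)
    for _ in range(steps):
        energy = energy_fn(x, y).sum()
        (grad,) = torch.autograd.grad(energy, y)
        y = (y - lr * grad).detach().requires_grad_(True)  # one refinement step
    with torch.no_grad():
        final_energy = energy_fn(x, y)                     # verification signal
    return y.detach(), final_energy
```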

Hyeonjeong Ha (@hyeonjeong_ai):

🚀 Excited to share our work led by my amazing labmate Zhenhailong Wang: PAPO (Perception-Aware Policy Optimization), an extension of GRPO for multimodal reasoning!
No extra labels. No reward models. Just internal supervision. 🔥
Learning to perceive while learning to reason.
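The tweet doesn't spell out the mechanism; one plausible reading of "internal supervision" is a perception term added to the GRPO objective that encourages the policy's token distribution to change when the image is masked, so the model must actually look at the input. A hypothetical sketch, not PAPO's published loss.

```python
import torch.nn.functional as F

def perception_term(logits_with_image, logits_with_masked_image):
    """Hypothetical internal-supervision signal: KL divergence between the
    policy's next-token distributions given the real vs. a masked image.
    Larger = the model is actually using the visual input; it could be
    added as a bonus to the GRPO objective. No labels, no reward model:
    both quantities come from the policy itself."""
    p = F.log_softmax(logits_with_image, dim=-1)         # log-probs, real image
    q = F.log_softmax(logits_with_masked_image, dim=-1)  # log-probs, image masked
    return F.kl_div(q, p, log_target=True, reduction="batchmean")  # KL(p || q)
```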

Hyeonjeong Ha (@hyeonjeong_ai):

Excited to be presenting on Monday, 7/28 from 11:00am–12:30pm in Hall 4/5 at ACL! If you’re interested in MLLM research, I’d love to chat—come say hi! 🇦🇹👋

Hyeonjeong Ha (@hyeonjeong_ai):

Thrilled to share our NeurIPS 2025 Spotlight 🎉 Check out our PARTONOMY paper! Led by my amazing labmates Jeonghwan Kim and Ansel, we introduce the PARTONOMY benchmark and the PLUM model for part-level visual understanding and grounding 🔥

Cheng Qian (@qiancheng1231):

🚀 Introducing UserRL: a new framework to train agents that truly assist users through proactive interaction, not just chase static benchmarking scores.

📄 Paper: arxiv.org/pdf/2509.19736
💻 Code: github.com/SalesforceAIRe…