Hyeonjeong Ha (@hyeonjeong_ai)'s Twitter Profile
Hyeonjeong Ha

@hyeonjeong_ai

Ph.D. student @IllinoisCS @UIUC_NLP | Previously @KAIST @kaist_ai

ID: 1637364968004923392

Link: https://hyeonjeongha.github.io/ | Joined: 19-03-2023 08:06:53

97 Tweets

300 Followers

336 Following

Chi Han (@glaciohound)'s Twitter Profile Photo

Welcome to my #AAAI2025 Tutorial, "The Quest for A Science of LMs," today!
Time: Feb 26, 2pm-3:45pm
Location: Room 113A, Pennsylvania Convention Center
Website: glaciohound.github.io/Science-of-LLM…
Underline: underline.io/events/487/sch…

Jeonghwan Kim (@masterjeongk)'s Twitter Profile Photo

SearchDet was accepted to #CVPR2025 🎉 We retrieve images from the Web and generate heatmaps through simple feature subtraction to improve long-tail object detection 👍
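
As a rough illustration of the idea (a minimal sketch, not SearchDet's actual code: the ViT features, exemplar embeddings, and the exact subtraction rule below are assumed stand-ins), a heatmap from feature subtraction could look like:

```python
import torch
import torch.nn.functional as F

# Toy stand-ins: ViT patch features for the query image, plus pooled
# embeddings for a web-retrieved object exemplar and a background exemplar.
D, H, W = 256, 16, 16
patch_feats = F.normalize(torch.randn(H * W, D), dim=-1)  # query image patches
obj_feat = F.normalize(torch.randn(D), dim=0)             # retrieved object exemplar
bg_feat = F.normalize(torch.randn(D), dim=0)              # background exemplar

# "Feature subtraction": remove the background direction from the object
# embedding, then score each patch against the residual to get a heatmap.
query = F.normalize(obj_feat - bg_feat, dim=0)
heatmap = (patch_feats @ query).reshape(H, W)
heatmap = (heatmap - heatmap.min()) / (heatmap.max() - heatmap.min() + 1e-8)
print(heatmap.shape)  # (16, 16) relevance map over patches
```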

Wenhu Chen (@wenhuchen)'s Twitter Profile Photo

We have made huge progress in language model reasoning. But our progress in multimodal reasoning (like MMMU) is very limited.
Why? It's due to the lack of diverse, difficult, and high-quality multimodal reasoning datasets!

🚀 New Paper Alert! 📢
We introduce VisualWebInstruct,
Zhenhailong Wang (@zhenhailongw)'s Twitter Profile Photo

Why allocate the same number of visual tokens to a blank image and a complex landscape? Introducing DyMU: a training-free algorithm that makes any ViT visual encoder dynamic-length and plug-and-play with downstream VLMs. 🚀
🔗 Project Page: mikewangwzhl.github.io/dymu/
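
For intuition on how a training-free, dynamic-length encoder can work, here is a minimal sketch assuming a ToMe-style similarity merge (illustrative only; DyMU's actual merging/unmerging procedure differs):

```python
import torch
import torch.nn.functional as F

def merge_similar_tokens(tokens: torch.Tensor, threshold: float = 0.9) -> torch.Tensor:
    """Greedily average runs of adjacent tokens whose cosine similarity
    exceeds `threshold`, so low-detail images yield fewer tokens."""
    out = [tokens[0]]
    for tok in tokens[1:]:
        if F.cosine_similarity(out[-1], tok, dim=0) > threshold:
            out[-1] = (out[-1] + tok) / 2  # fold into the previous token
        else:
            out.append(tok)
    return torch.stack(out)

vit_tokens = torch.randn(196, 768)           # fixed-length ViT output
vit_tokens[50:150] = vit_tokens[50]          # simulate a flat, blank region
merged = merge_similar_tokens(vit_tokens)    # variable length, content-dependent
print(vit_tokens.shape, "->", merged.shape)  # the flat region collapses to one token
```

Under a rule like this, a blank image collapses to a handful of tokens while a complex landscape keeps most of its original 196.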
Vercept (@vercept_ai)'s Twitter Profile Photo

Today we're excited to introduce Vy, our AI that sees and acts on your computer. At Vercept, our mission is to reinvent how humans use computers, enabling you to accomplish orders of magnitude more than what you can do today. Vy is a first glimpse at AI that sees and uses your

Hyeonjeong Ha (@hyeonjeong_ai)'s Twitter Profile Photo

🚀 Computational persuasion of LLMs can be a game-changer: dive into our new survey to explore the taxonomy, spot the risks, and investigate further challenges in persuasive LLMs!

Salesforce AI Research (@sfresearch)'s Twitter Profile Photo

We're thrilled to announce BLIP3-o, a breakthrough in unified multimodal models that excels at both image understanding and generation in a single autoregressive architecture! 💫

📊 Paper: bit.ly/3Saybpo
🤗 Models: bit.ly/4jhFaYM
🧠 Code:
Yangyi Chen (on job market) (@yangyichen6666)'s Twitter Profile Photo

๐Ÿ‚๐ŸบIntroducing our recent preprint: Prioritizing Image-Related Tokens Enhances Vision-Language Pre-Training! We present PRIOR, a simple vision-language pre-training algorithm that addresses the challenge of irrelevant textual content in image-caption pairs. PRIOR enhances

๐Ÿ‚๐ŸบIntroducing our recent preprint: Prioritizing Image-Related Tokens Enhances Vision-Language Pre-Training! 

We present PRIOR, a simple vision-language pre-training algorithm that addresses the challenge of irrelevant textual content in image-caption pairs. PRIOR enhances
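
The tweet cuts off, but the stated idea, focusing the loss on image-related caption tokens, can be sketched as a toy reweighted objective. Everything below (the weighting rule, the use of a text-only reference LM, the numbers) is an assumed illustration, not PRIOR's actual formulation:

```python
import torch

# Per-token NLL from the VLM being trained, and next-token confidence from a
# text-only reference LM on the same caption (toy numbers).
vlm_nll = torch.tensor([2.1, 0.4, 3.0, 0.2, 1.5])
p_text_only = torch.tensor([0.05, 0.90, 0.02, 0.95, 0.30])

# Tokens a text-only LM already predicts well carry little visual signal,
# so downweight them and spend capacity on image-related tokens.
weights = 1.0 - p_text_only
weights = weights / weights.sum() * weights.numel()  # keep mean weight at 1
loss = (weights * vlm_nll).mean()
print(loss.item())
```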
Heng Ji (@hengjinlp)'s Twitter Profile Photo

We are extremely excited to announce mCLM, a Modular Chemical Language Model that is friendly to automatable block-based chemistry and mimics bilingual speakers by "code-switching" between functional molecular modules and natural language descriptions of the functions. 1/2

Yi Xu (@_yixu)'s Twitter Profile Photo

🚀 Let's Think Only with Images.

No language and no verbal thought. 🤔

Let's think through a sequence of images 💭, like how humans picture steps in their minds 🎨.

We propose Visual Planning, a novel reasoning paradigm that enables models to reason purely through images.
Hyeonjeong Ha (@hyeonjeong_ai)'s Twitter Profile Photo

Thrilled to share that our paper has been accepted to #ACL2025 Main 🇦🇹

Huge thanks to my amazing collaborators and my advisor Heng Ji 🙃
📄 arxiv.org/abs/2502.17793

Happy to chat about our work as well as MLLM research projects 🙌
Stella Li (@stellalisy)'s Twitter Profile Photo

🤯 We cracked RLVR with... Random Rewards?!
Training Qwen2.5-Math-7B with our Spurious Rewards improved MATH-500 by:
- Random rewards: +21%
- Incorrect rewards: +25%
- (FYI) Ground-truth rewards: +28.8%
How could this even work ⁉️ Here's why: 🧵
Blogpost: tinyurl.com/spurious-rewar…
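
Concretely, the three settings above differ only in the reward function handed to the RLVR trainer; a toy sketch (function names and signatures are illustrative, not the authors' code):

```python
import random

def reward_random(prompt: str, completion: str, answer: str) -> float:
    return float(random.random() < 0.5)      # coin flip: carries no correctness signal

def reward_incorrect(prompt: str, completion: str, answer: str) -> float:
    return float(answer not in completion)   # deliberately rewards wrong answers

def reward_ground_truth(prompt: str, completion: str, answer: str) -> float:
    return float(answer in completion)       # standard verifiable reward

print(reward_random("2+2=?", "The answer is 4.", "4"))
```

That even the first two settings improve MATH-500 is the thread's surprise; the blogpost digs into why.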
Cheng Qian (@qiancheng1231)'s Twitter Profile Photo

📢 New Paper Drop: From Solving to Modeling!
LLMs can solve math problems, but can they model the real world? 🌍

📄 arXiv: arxiv.org/pdf/2505.15068
💻 Code: github.com/qiancheng0/Mod…

Introducing ModelingAgent, a breakthrough system for real-world mathematical modeling with LLMs.
May Fung (@may_f1_)'s Twitter Profile Photo

🧠 How can AI evolve from statically "thinking about images" → dynamically "thinking with images" as cognitive workspaces, similar to the human mental sketchpad?
🔍 What's the research roadmap from tool-use → programmatic
Hyeonjeong Ha (@hyeonjeong_ai)'s Twitter Profile Photo

Excited to share our work on Energy-Based Transformers, led by my amazing labmate Alexi Gladstone: a new frontier in unlocking generalized reasoning across modalities without rewards. Grateful to be part of this journey! ⚡️
🧠 Think longer. Verify better. Generalize further.
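
A minimal sketch of the energy-based idea as described ("think longer" = more refinement steps at inference, with the energy itself acting as a verifier score). The architecture and update rule here are assumed toys, not the paper's model:

```python
import torch

# Learned energy function: low energy = more plausible prediction.
energy_net = torch.nn.Sequential(
    torch.nn.Linear(16, 64), torch.nn.SiLU(), torch.nn.Linear(64, 1)
)
y = torch.randn(16, requires_grad=True)  # candidate answer (a latent to refine)
opt = torch.optim.SGD([y], lr=0.1)

for _ in range(20):                      # more steps = longer "thinking"
    energy = energy_net(y).sum()
    opt.zero_grad()
    energy.backward()
    opt.step()

print("final energy (verifier score):", energy_net(y).item())
```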

Hyeonjeong Ha (@hyeonjeong_ai)'s Twitter Profile Photo

🚀 Excited to share our work led by my amazing labmate Zhenhailong Wang: PAPO (Perception-Aware Policy Optimization), an extension of GRPO for multimodal reasoning!
No extra labels. No reward models. Just internal supervision.
🔥 Learning to perceive while learning to reason.
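
The tweet doesn't spell out the objective, but "internal supervision" for perception can be sketched as an auxiliary term comparing the policy's predictions with the image intact vs. corrupted; the following is an assumed toy version, not PAPO's actual loss:

```python
import torch
import torch.nn.functional as F

# Toy logits: p(next token | prompt, image) vs. p(next token | prompt, masked image).
logits_image = torch.randn(4, 32000)
logits_masked = torch.randn(4, 32000)

# If masking the image barely changes the distribution, the model isn't
# really looking at it; encourage divergence so perception matters.
kl = F.kl_div(F.log_softmax(logits_masked, dim=-1),
              F.softmax(logits_image, dim=-1), reduction="batchmean")

grpo_loss = torch.tensor(0.0)      # placeholder for the usual GRPO objective
total_loss = grpo_loss - 0.1 * kl  # no extra labels or reward model needed
print(total_loss.item())
```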

Hyeonjeong Ha (@hyeonjeong_ai)'s Twitter Profile Photo

Excited to be presenting on Monday, 7/28, from 11:00am-12:30pm in Hall 4/5 at ACL! If you're interested in MLLM research, I'd love to chat. Come say hi! 🇦🇹👋

Hyeonjeong Ha (@hyeonjeong_ai)'s Twitter Profile Photo

Thrilled to share our NeurIPS 2025 Spotlight 🎉 Check out our PARTONOMY paper! Led by my amazing labmates Jeonghwan Kim and Ansel, we introduce the PARTONOMY benchmark and the PLUM model for part-level visual understanding and grounding 🔥

Cheng Qian (@qiancheng1231)'s Twitter Profile Photo

🚀 Introducing UserRL: a new framework to train agents that truly assist users through proactive interaction, not just chase static benchmarking scores.

📄 Paper: arxiv.org/pdf/2509.19736
💻 Code: github.com/SalesforceAIRe…