Hongru Wang (@wangcarrey) 's Twitter Profile
Hongru Wang

@wangcarrey

Visiting @uiuc_nlp Ph.D. Candidate @CUHKofficial, Prev @EdinburghNLP

ID: 1443501119083220997

Link: http://rulegreen.github.io

Joined: 30-09-2021 09:01:08

161 Tweets

222 Followers

736 Following

Shizhe Diao (@shizhediao) 's Twitter Profile Photo

🚀 How far can RL scaling take LLMs?
Drop ProRLv2! 🔥With ProRLv2, we keep expanding LLM’s reasoning boundaries through 3,000+ RL steps over 5 domains and set a new state-of-the-art 🌟 among 1.5B reasoning models.

🔗 Full blog: research.nvidia.com/labs/lpr/prorl…
🤗Open model:
Hongru Wang (@wangcarrey) 's Twitter Profile Photo

Actually, we implemented this kind of capability in our AdaCtrl paper three months ago by injecting difficulty-aware tags (i.e., easy, hard, adaptive) to trigger different reasoning behaviors of LLMs. Paper: arxiv.org/pdf/2505.18822

NIK (@ns123abc) 's Twitter Profile Photo

Google DeepMind just dropped a Nature paper: “A personal health large language model for sleep and fitness coaching.”

> Gemini is better than doctors and trainers at sleep and fitness
> paper shows huge benefit of AI personalization for health and long-form coaching 

Honestly
Heng Ji (@hengjinlp) 's Twitter Profile Photo

Thanks so much again to the IJCAI25 organizers for the opportunity to share our work on AI+Science! I’m grateful to work with our amazing collaborators and students at Molecule Maker Lab Institute

Cheng Qian (@qiancheng1231) 's Twitter Profile Photo

📣 Our paper is accepted to Findings of EMNLP 2025! Many thanks to all the co-authors! 🌍 Math modeling is the perfect lens for agents to approach real-world challenges. Come and check how we do it: arxiv.org/pdf/2505.15068

Hongru Wang (@wangcarrey) 's Twitter Profile Photo

Congratulations to everyone! This was the *only paper* I worked on overnight during my PhD, but fortunately, I had a group of friends by my side. It is truly a memory to remember.

Heng Ji (@hengjinlp) 's Twitter Profile Photo

Thanks again to the organizers! What inspired me most was Yoshua Bengio’s incredible kindness and humility. He welcomed different opinions with genuine openness. I could also see the same spirit of kindness and deep care for humanity reflected in his mentees, like Kyunghyun Cho

Jyo Pari (@jyo_pari) 's Twitter Profile Photo

For agents to improve over time, they can’t afford to forget what they’ve already mastered.

We found that supervised fine-tuning forgets more than RL when training on a new task! 

Want to find out why? 👇
Heng Ji (@hengjinlp) 's Twitter Profile Photo

I'm hiring 1-2 new postdocs to work on AI for Scientific Discovery (especially on drug discovery and material discovery) and Science-Inspired AI (especially on scientific foundation models). Please drop me an email if you are interested or know someone who might be a great fit!

Emre Can Acikgoz (@emrecanacikgoz) 's Twitter Profile Photo

Excited to share that ToolRL is accepted to NeurIPS Conference 🎉

I was watching Denny Zhou’s Simon Institute talk live, where in one slide, he defined RL Fine-Tuning as “directly optimize what you want”.

This very simple reframe completely shifted my perspective on training and the
ACLRollingReview (@reviewacl) 's Twitter Profile Photo

📢 Early submission for ACL 2026 via ARR Oct cycle is available to support authors facing potential visa delays. 📝 Early invitation letters possible for submit-ready work (not acceptance guarantees). ⚠️ Preliminary work risks rejection & Jan-cycle ineligibility. #NLProc #ARR

Yu Su @#ICLR2025 (@ysu_nlp) 's Twitter Profile Photo

> working on semantic parsing in PhD
> didn't even have its own track at ACL
> it's a dead area, people say
> had ~100 citations when graduating
> but natural language programming is always the dream
> 'Let machines understand human thinking. Don’t let humans think like machines'
J.K. Rowling (@jk_rowling) 's Twitter Profile Photo

I'm seeing quite a bit of comment about this, so I want to make a couple of points. I'm not owed eternal agreement from any actor who once played a character I created. The idea is as ludicrous as me checking with the boss I had when I was twenty-one for what opinions I should

Zhenhailong Wang (@zhenhailongw) 's Twitter Profile Photo

Multimodal conversational agents struggle to follow complex policies, which also impose a fixed computational cost.
We ask:
👉 How can we achieve stronger policy-following behavior without having to include policies in-context?
🌐: mikewangwzhl.github.io/TriMPI/ 🧵1/3
Mengdi Wang (@mengdiwang10) 's Twitter Profile Photo

🚀 Introducing LabOS: The AI-XR Co-Scientist
A system that sees, understands, and works with humans in real-world labs.
👁️ Egocentric vision & extended reality
🧠 LLM reasoning & hypothesis generation
🤖 Real-time guidance & multi-modal human-AI collaboration

From observation →