Jiaxin Huang (@jiaxinhuang0229)'s Twitter Profile
Jiaxin Huang

@jiaxinhuang0229

Assistant professor @WUSTL CSE.
LLM, NLP, ML, Data Mining. PhD from @IllinoisCS. Microsoft Research PhD Fellow.

ID: 1532511461133541376

Website: https://teapot123.github.io/ · Joined: 02-06-2022 23:56:32

23 Tweets

438 Followers

76 Following

Jiaxin Huang (@jiaxinhuang0229)

🎓 Just passed my PhD Thesis Defense! 🎉 Super lucky to have Prof. Jiawei Han as my advisor, and huge thanks to my Thesis Committee members Prof. Chengxiang Zhai, Prof. Tarek Abdelzaher, and Dr. Jianfeng Gao. Shoutout to my awesome co-authors for their unwavering support!

Jiaxin Huang (@jiaxinhuang0229)

Curious about efficient many-shot ICL with LLMs? Our new paper, led by ChengSong Huang, introduces LARA, which divides & reweights in-context examples to ensure
✅ Better Performance
✅ Improved Scalability
✅ No Need to Access Model Parameters
✅ Less Memory Usage
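
The tweet only names the high-level idea (dividing in-context examples into groups and reweighting them), so the snippet below is a hedged illustration of that general pattern, not LARA's actual algorithm: combine the next-token logits obtained from each group with a weighted sum. The function name, the default uniform weights, and the toy data are all assumptions.

    import numpy as np

    def combine_group_logits(group_logits, weights=None):
        """Weighted combination of next-token logits, one array per group of
        in-context examples (each array of vocabulary size V)."""
        logits = np.stack(group_logits)              # shape (num_groups, V)
        if weights is None:
            weights = np.ones(len(group_logits))     # default: uniform weights
        weights = np.asarray(weights, dtype=float)
        weights = weights / weights.sum()            # normalize to sum to 1
        return weights @ logits                      # combined logits, shape (V,)

    # Toy usage: three demonstration groups over a vocabulary of size 5.
    rng = np.random.default_rng(0)
    groups = [rng.normal(size=5) for _ in range(3)]
    combined = combine_group_logits(groups, weights=[0.5, 0.3, 0.2])
    print("next-token id:", int(np.argmax(combined)))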

Banghua Zhu (@banghuaz)

🔍 Which reward model characteristics best predict RLHF performance? We evaluated RMs & LLM-judges on:
- Human preference agreement on Chatbot Arena
- Accuracy in selecting correct code/math answers
- Correlation with Chatbot Arena rankings
Interesting finding: Lower-bound…

Jiaxin Huang (@jiaxinhuang0229)

🤨 Ever wonder why RLHF-trained LLMs are overconfident? 🚀 Check out our new work led by Jixuan Leng, revealing that reward models themselves are biased towards high-confidence responses! 😯 🥳 We introduce two practical solutions (PPO-M & PPO-C) to improve language model calibration.

Jiaxin Huang (@jiaxinhuang0229)

🚀 Exciting opportunity for LLM multi-agent researchers at the Agent Society Challenge at WWW 2025! Monetary prizes are $12,000 in total and top teams will be recommended to publish their results in the WWW Companion proceedings🥳 More details can be found here:

Jiaxin Huang (@jiaxinhuang0229)

Thrilled to share our recent work "Efficient Test-Time Scaling via Self-Calibration"! We introduce a smart way to boost LLM efficiency in test-time scaling without sacrificing accuracy 🧠! By using self-calibrated confidence scores, we enable early stopping in Best-of-N and…
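
The tweet names the ingredients (self-calibrated confidence scores and early stopping in Best-of-N) without implementation details, so here is a minimal sketch of confidence-gated Best-of-N with early stopping. The `sample` and `confidence` callables, the 0.9 threshold, and the toy stubs are placeholders, not the paper's actual components.

    import random

    def best_of_n_early_stop(sample, confidence, n=16, threshold=0.9):
        """Draw up to n candidate responses, stopping as soon as one is
        confident enough; otherwise return the most confident candidate."""
        best_resp, best_conf = None, float("-inf")
        for _ in range(n):
            resp = sample()              # one model response (stubbed below)
            conf = confidence(resp)      # self-assessed confidence in [0, 1]
            if conf > best_conf:
                best_resp, best_conf = resp, conf
            if conf >= threshold:        # confident enough: stop early
                break
        return best_resp

    # Toy usage with stubbed sampling and confidence functions.
    answers = ["42", "41", "42", "43"]
    pick = lambda: random.choice(answers)
    conf = lambda r: 0.95 if r == "42" else 0.40
    print(best_of_n_early_stop(pick, conf, n=8))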

Bill Doerrfeld (@doerrfeldbill)

LLMs can now trace their outputs to their training data. 🤯 I cover the implications of Ai2's new OLMoTrace feature on The New Stack today. thenewstack.io/llms-can-now-t…

Jiaxin Huang (@jiaxinhuang0229)

Can LVLMs solve crossword puzzles? Our evaluation of over 20 LLMs and LVLMs finds that LVLMs largely lag behind LLMs due to poor vertical word extraction. Reasoning LLMs (like o3-mini) outperform non-reasoning models, benefiting from cross-letter constraints!

Jiaxin Huang (@jiaxinhuang0229)

🚀I’ll be at #ICLR2025! Our group is presenting:

Apr 25: Reward Calibration in RLHF
Apr 26: Generative Joint Graph Language Modeling
Apr 27/28: Logit Arithmetic Approach for In-Context Learning (SLLM, Reasoning & Planning Workshop)

😆 Let’s chat about LLM research, PhD…
Bowen Jin (@bowenjin13)

Sorry to miss ICLR this year — but if you're interested in the 𝐥𝐨𝐧𝐠-𝐜𝐨𝐧𝐭𝐞𝐱𝐭 𝐋𝐋𝐌 𝐯𝐬. 𝐑𝐀𝐆 𝐝𝐞𝐛𝐚𝐭𝐞, don’t miss our poster! My amazing collaborator from Google will be there to chat and share insights. 📍 Hall 3 + Hall 2B #302 🕒 Thu, Apr 24 | 3:00–5:30

Ming Zhong (@mingzhong_)

I will be presenting our poster for the “Law of the Weakest Link” paper at ICLR today! If you're interested in this topic, feel free to stop by and chat!

📍 Location: Hall 3 + Hall 2B #257 
⏰ Time: Apr 25 | 10:00 AM – 12:30 PM SGT
Bill Yuchen Lin (@billyuchenlin)

Our paper was accepted at ICML 2025! If you're working on RL for reasoning, consider adding more logical puzzle data to your training and eval. Share your ideas for logical reasoning tasks for ZebraLogic v2 and interesting RL studies you want to see! Many thanks to my…

Yu Meng @ ICLR'25 (@yumeng0818)

Thrilled to be named to the Forbes 30 Under 30 Asia 2025 list! 🤩 Excited to keep pushing the boundaries of LLMs to tackle real-world challenges 🙌

Siru Ouyang (@siru_ouyang)

🚀 Introducing RAST: Reasoning Activation via Small Model Transfer!
✨ RAST adjusts key "reasoning tokens" at decoding time using insights from smaller RL-tuned models — no full RL tuning for large models!
⚡ Efficient & Performant, 🧠 Scalable & Easy, 📉 Up to 50% less GPU memory!
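
The tweet describes adjusting logits at decoding time with signals from a smaller RL-tuned model instead of RL-tuning the large model. One common way to realize that kind of transfer, shown below purely as a hedged illustration and not necessarily RAST's exact formulation, is to add the logit difference between a small RL-tuned model and its small base counterpart to the large base model's logits; the `alpha` scale is an assumed knob.

    import numpy as np

    def transfer_adjusted_logits(large_base, small_rl, small_base, alpha=1.0):
        """Shift the large base model's next-token logits by the delta the
        small RL-tuned model learned relative to its own base version."""
        delta = np.asarray(small_rl) - np.asarray(small_base)
        return np.asarray(large_base) + alpha * delta

    # Toy usage with random logits over a vocabulary of size 6.
    rng = np.random.default_rng(1)
    adjusted = transfer_adjusted_logits(rng.normal(size=6),
                                        rng.normal(size=6),
                                        rng.normal(size=6))
    print("chosen token id:", int(np.argmax(adjusted)))
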
Jiacheng Liu (@liujc1998)

We enabled OLMoTrace for Tülu 3 models! 🤠

Matched spans are shorter than for OLMo models, because we can only search in Tülu's post-training data (the base model is Llama). Yet we thought it'd still bring some value.

Try it yourself on the Ai2 playground -- playground.allenai.org
Yu Meng @ ICLR'25 (@yumeng0818)

Excited to share our #ICML25 paper (led by Zhepei Wei) on accelerating LLM decoding!
⚡️ AdaDecode predicts tokens early from intermediate layers
🙅‍♂️ No drafter model needed
🪶 Just lightweight LM heads
✨ Output consistency with standard autoregressive decoding
Thread👇

Jiaxin Huang (@jiaxinhuang0229)

🚀🚀 Excited to share our new work on Speculative Decoding by Langlin Huang! We tackle a key limitation of draft models, which predict worse tokens at later positions, and present PosS, which generates high-quality drafts!
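
PosS itself isn't detailed in the tweet, so as background here is a minimal greedy-verification sketch of the standard speculative decoding loop it builds on: a cheap draft model proposes a few tokens, and the target model keeps the agreeing prefix and corrects the first mismatch. The stub models and the token-at-a-time interface are illustrative assumptions; real implementations verify all draft positions in one batched forward pass.

    def speculative_decode_step(ctx, draft_next, target_next, k=4):
        """One speculative decoding step (greedy verification): draft k tokens
        with the cheap model, then keep the longest prefix the target model
        agrees with, plus one corrected or bonus token from the target."""
        proposal = []
        for _ in range(k):                                # cheap drafting
            proposal.append(draft_next(ctx + proposal))
        accepted = []
        for tok in proposal:                              # verification
            expected = target_next(ctx + accepted)
            if tok == expected:
                accepted.append(tok)
            else:
                accepted.append(expected)                 # correct and stop
                break
        else:
            accepted.append(target_next(ctx + accepted))  # bonus token
        return ctx + accepted

    # Toy usage: stub models over a fixed "ground truth" continuation.
    truth = [3, 1, 4, 1, 5, 9, 2, 6]
    target_next = lambda c: truth[len(c)] if len(c) < len(truth) else 0
    draft_next = lambda c: 0 if len(c) == 2 else truth[len(c)]  # wrong at 3rd draft position
    print(speculative_decode_step([], draft_next, target_next, k=4))  # -> [3, 1, 4]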

Jiaxin Huang (@jiaxinhuang0229)

Thrilled to share this exciting work, R-Zero, from my student ChengSong Huang, where an LLM learns to reason from zero human-curated data! The framework includes the co-evolution of a "Challenger" that proposes difficult tasks and a "Solver" that solves them. Check out more details in the…