Jiaxin Huang (@jiaxinhuang0229)'s Twitter Profile
Jiaxin Huang

@jiaxinhuang0229

Assistant professor @WUSTL CSE.
LLM, NLP, ML, Data Mining. PhD from @IllinoisCS. Microsoft Research PhD Fellow.

ID: 1532511461133541376

Website: https://teapot123.github.io/ · Joined: 02-06-2022 23:56:32

23 Tweets

438 Followers

76 Following

Jiaxin Huang (@jiaxinhuang0229)

🎓 Just passed my PhD Thesis Defense! 🎉 Super lucky to have Prof. Jiawei Han as my advisor, and huge thanks to my Thesis Committee members Prof. Chengxiang Zhai, Prof. Tarek Abdelzaher, and Dr. Jianfeng Gao. Shoutout to my awesome co-authors for their unwavering support!

Jiaxin Huang (@jiaxinhuang0229)

Curious about efficient many-shot ICL with LLMs? Our new paper, led by ChengSong Huang, introduces LARA, which divides & reweights in-context examples to ensure
✅ Better Performance
✅ Improved Scalability
✅ No Need to Access Model Parameters
✅ Less Memory Usage
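
The tweet only names the high-level idea (dividing in-context examples into groups and reweighting them), so the snippet below is a hedged illustration of that general pattern, not LARA's actual algorithm: combine the next-token logits obtained from each group with a weighted sum. The function name, the default uniform weights, and the toy data are all assumptions.

    import numpy as np

    def combine_group_logits(group_logits, weights=None):
        """Weighted combination of next-token logits, one array per group of
        in-context examples (each array of vocabulary size V)."""
        logits = np.stack(group_logits)              # shape (num_groups, V)
        if weights is None:
            weights = np.ones(len(group_logits))     # default: uniform weights
        weights = np.asarray(weights, dtype=float)
        weights = weights / weights.sum()            # normalize to sum to 1
        return weights @ logits                      # combined logits, shape (V,)

    # Toy usage: three demonstration groups over a vocabulary of size 5.
    rng = np.random.default_rng(0)
    groups = [rng.normal(size=5) for _ in range(3)]
    combined = combine_group_logits(groups, weights=[0.5, 0.3, 0.2])
    print("next-token id:", int(np.argmax(combined)))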

Banghua Zhu (@banghuaz)

🔍 Which reward model characteristics best predict RLHF performance? We evaluated RMs & LLM-judges on:
- Human preference agreement on Chatbot Arena
- Accuracy in selecting correct code/math answers
- Correlation with Chatbot Arena rankings
Interesting finding: Lower-bound…

Jiaxin Huang (@jiaxinhuang0229)

🤨 Ever wonder why RLHF-trained LLMs are overconfident? 🚀 Check out our new work led by Jixuan Leng, revealing that reward models themselves are biased towards high-confidence responses! 😯 🥳 We introduce two practical solutions (PPO-M & PPO-C) to improve language model calibration.

Jiaxin Huang (@jiaxinhuang0229)

🚀 Exciting opportunity for LLM multi-agent researchers at the Agent Society Challenge at WWW 2025! Monetary prizes are $12,000 in total and top teams will be recommended to publish their results in the WWW Companion proceedings🥳 More details can be found here:

Jiaxin Huang (@jiaxinhuang0229)

Thrilled to share our recent work "Efficient Test-Time Scaling via Self-Calibration"! We introduce a smart way to boost LLM efficiency in test-time scaling without sacrificing accuracy 🧠! By using self-calibrated confidence scores, we enable early stopping in Best-of-N and…
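
The tweet names the ingredients (self-calibrated confidence scores and early stopping in Best-of-N) without implementation details, so here is a minimal sketch of confidence-gated Best-of-N with early stopping. The `sample` and `confidence` callables, the 0.9 threshold, and the toy stubs are placeholders, not the paper's actual components.

    import random

    def best_of_n_early_stop(sample, confidence, n=16, threshold=0.9):
        """Draw up to n candidate responses, stopping as soon as one is
        confident enough; otherwise return the most confident candidate."""
        best_resp, best_conf = None, float("-inf")
        for _ in range(n):
            resp = sample()              # one model response (stubbed below)
            conf = confidence(resp)      # self-assessed confidence in [0, 1]
            if conf > best_conf:
                best_resp, best_conf = resp, conf
            if conf >= threshold:        # confident enough: stop early
                break
        return best_resp

    # Toy usage with stubbed sampling and confidence functions.
    answers = ["42", "41", "42", "43"]
    pick = lambda: random.choice(answers)
    conf = lambda r: 0.95 if r == "42" else 0.40
    print(best_of_n_early_stop(pick, conf, n=8))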

Bill Doerrfeld (@doerrfeldbill)

LLMs can now trace their outputs to their training data. 🤯 I cover the implications of Ai2's new OLMoTrace feature on The New Stack today. thenewstack.io/llms-can-now-t…

Jiaxin Huang (@jiaxinhuang0229)

Can LVLMs solve crossword puzzles? Our evaluation of over 20 LLMs and LVLMs finds that LVLMs largely lag behind LLMs due to poor vertical word extraction. Reasoning LLMs (like o3-mini) outperform non-reasoning models, benefiting from cross-letter constraints!

Jiaxin Huang (@jiaxinhuang0229)

🚀I’ll be at #ICLR2025! Our group is presenting:

Apr 25: Reward Calibration in RLHF
Apr 26: Generative Joint Graph Language Modeling
Apr 27/28: Logit Arithmetic Approach for In-Context Learning (SLLM, Reasoning & Planning Workshop)

😆 Let’s chat about LLM research, PhD…
Bowen Jin (@bowenjin13)

Sorry to miss ICLR this year — but if you're interested in the 𝐥𝐨𝐧𝐠-𝐜𝐨𝐧𝐭𝐞𝐱𝐭 𝐋𝐋𝐌 𝐯𝐬. 𝐑𝐀𝐆 𝐝𝐞𝐛𝐚𝐭𝐞, don’t miss our poster! My amazing collaborator from Google will be there to chat and share insights. 📍 Hall 3 + Hall 2B #302 🕒 Thu, Apr 24 | 3:00–5:30

Ming Zhong (@mingzhong_)

I will be presenting our poster for the “Law of the Weakest Link” paper at ICLR today! If you're interested in this topic, feel free to stop by and chat!

📍 Location: Hall 3 + Hall 2B #257 
⏰ Time: Apr 25 | 10:00 AM – 12:30 PM SGT
Bill Yuchen Lin (@billyuchenlin)

Our paper was accepted at ICML 2025! If you're working on RL for reasoning, consider adding more logical puzzle data to your training and eval. Share your ideas for logical reasoning tasks for ZebraLogic v2 and interesting RL studies you want to see! Many thanks to my…

Yu Meng @ ICLR'25 (@yumeng0818)

Thrilled to be named to the Forbes 30 Under 30 Asia 2025 list! 🤩 Excited to keep pushing the boundaries of LLMs to tackle real-world challenges 🙌

Siru Ouyang (@siru_ouyang)

🚀 Introducing RAST: Reasoning Activation via Small Model Transfer!
✨ RAST adjusts key "reasoning tokens" at decoding time using insights from smaller RL-tuned models — no full RL tuning for large models!
⚡ Efficient & Performant, 🧠 Scalable & Easy, 📉 Up to 50% less GPU memory!
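
The tweet describes adjusting logits at decoding time with signals from a smaller RL-tuned model instead of RL-tuning the large model. One common way to realize that kind of transfer, shown below purely as a hedged illustration and not necessarily RAST's exact formulation, is to add the logit difference between a small RL-tuned model and its small base counterpart to the large base model's logits; the `alpha` scale is an assumed knob.

    import numpy as np

    def transfer_adjusted_logits(large_base, small_rl, small_base, alpha=1.0):
        """Shift the large base model's next-token logits by the delta the
        small RL-tuned model learned relative to its own base version."""
        delta = np.asarray(small_rl) - np.asarray(small_base)
        return np.asarray(large_base) + alpha * delta

    # Toy usage with random logits over a vocabulary of size 6.
    rng = np.random.default_rng(1)
    adjusted = transfer_adjusted_logits(rng.normal(size=6),
                                        rng.normal(size=6),
                                        rng.normal(size=6))
    print("chosen token id:", int(np.argmax(adjusted)))
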
Jiacheng Liu (@liujc1998)

We enabled OLMoTrace for Tülu 3 models! 🤠

Matched spans are shorter than for OLMo models, because we can only search in Tülu's post-training data (the base model is Llama). Yet we thought it'd still bring some value.

Try it yourself on the Ai2 playground -- playground.allenai.org
Yu Meng @ ICLR'25 (@yumeng0818)

Excited to share our #ICML25 paper (led by Zhepei Wei) on accelerating LLM decoding!
⚡️ AdaDecode predicts tokens early from intermediate layers
🙅‍♂️ No drafter model needed
🪶 Just lightweight LM heads
✨ Output consistency with standard autoregressive decoding
Thread👇

Jiaxin Huang (@jiaxinhuang0229)

🚀🚀 Excited to share our new work on Speculative Decoding by Langlin Huang! We tackle a key limitation of draft models, which predict worse tokens at later positions, and present PosS, which generates high-quality drafts!
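
PosS itself isn't detailed in the tweet, so as background here is a minimal greedy-verification sketch of the standard speculative decoding loop it builds on: a cheap draft model proposes a few tokens, and the target model keeps the agreeing prefix and corrects the first mismatch. The stub models and the token-at-a-time interface are illustrative assumptions; real implementations verify all draft positions in one batched forward pass.

    def speculative_decode_step(ctx, draft_next, target_next, k=4):
        """One speculative decoding step (greedy verification): draft k tokens
        with the cheap model, then keep the longest prefix the target model
        agrees with, plus one corrected or bonus token from the target."""
        proposal = []
        for _ in range(k):                                # cheap drafting
            proposal.append(draft_next(ctx + proposal))
        accepted = []
        for tok in proposal:                              # verification
            expected = target_next(ctx + accepted)
            if tok == expected:
                accepted.append(tok)
            else:
                accepted.append(expected)                 # correct and stop
                break
        else:
            accepted.append(target_next(ctx + accepted))  # bonus token
        return ctx + accepted

    # Toy usage: stub models over a fixed "ground truth" continuation.
    truth = [3, 1, 4, 1, 5, 9, 2, 6]
    target_next = lambda c: truth[len(c)] if len(c) < len(truth) else 0
    draft_next = lambda c: 0 if len(c) == 2 else truth[len(c)]  # wrong at 3rd draft position
    print(speculative_decode_step([], draft_next, target_next, k=4))  # -> [3, 1, 4]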

Jiaxin Huang (@jiaxinhuang0229)

Thrilled to share this exciting work, R-Zero, from my student ChengSong Huang, where an LLM learns to reason from zero human-curated data! The framework includes the co-evolution of a "Challenger" that proposes difficult tasks and a "Solver" that solves them. Check out more details in the…