Huaqing Zhang (@zhqwqwq)'s Twitter Profile
Huaqing Zhang

@zhqwqwq

third year undergraduate in Yao class, Tsinghua University

ID: 1774406709341224960

Joined: 31-03-2024 12:01:48

4 Tweets

24 Followers

83 Following

Huaqing Zhang (@zhqwqwq):

Excited to present our work on the optimization analysis of Chain-of-Thought at #ICLR2025! Come meet us this morning! Poster Presentation: 📆Saturday, April 26 🕰️10:00 AM – 12:30 PM CST ✨Hall 3 + Hall 2B, Poster #554

Zixuan Wang (@zzzixuanwang):

LLMs can solve complex tasks that require combining multiple reasoning steps. But when are such capabilities learnable via gradient-based training?

In our new COLT 2025 paper, we show that easy-to-hard data is necessary and sufficient!

arxiv.org/abs/2505.23683

🧵 below (1/10)
Zhiyuan Li (@zhiyuanli_):

Adaptive optimizers range from AdaGrad-Norm to Shampoo and full-matrix AdaGrad, with increasingly expressive preconditioners.

But does more adaptivity always translate to fewer steps to converge?

Our ICML 2025 paper answers negatively via a unified convergence analysis. 🧵1/6
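The "increasingly expressive preconditioners" in the thread can be made concrete. Below is a minimal NumPy sketch (not code from the paper; step sizes, `eps`, and function names are illustrative assumptions) of single update steps at the three levels mentioned: a scalar preconditioner (AdaGrad-Norm), a full d×d preconditioner (full-matrix AdaGrad), and a Kronecker-factored per-dimension preconditioner (Shampoo).

```python
import numpy as np

def adagrad_norm_step(W, G, state, lr=0.1, eps=1e-8):
    """AdaGrad-Norm: a single scalar preconditions the whole parameter."""
    state["b2"] = state.get("b2", 0.0) + np.sum(G * G)
    return W - lr * G / (np.sqrt(state["b2"]) + eps)

def full_matrix_adagrad_step(w, g, state, lr=0.1, eps=1e-8):
    """Full-matrix AdaGrad on a flat vector: accumulate sum of g g^T,
    precondition by its inverse square root (via eigendecomposition)."""
    d = g.size
    state["H"] = state.get("H", np.zeros((d, d))) + np.outer(g, g)
    vals, vecs = np.linalg.eigh(state["H"])
    P = vecs @ np.diag(1.0 / (np.sqrt(np.maximum(vals, 0.0)) + eps)) @ vecs.T
    return w - lr * P @ g

def shampoo_step(W, G, state, lr=0.1, eps=1e-8):
    """Shampoo on a matrix parameter: one Kronecker factor per dimension,
    each raised to the -1/4 power, so their product acts like an inverse
    square root of the full Kronecker-structured preconditioner."""
    m, n = G.shape
    state["L"] = state.get("L", eps * np.eye(m)) + G @ G.T
    state["R"] = state.get("R", eps * np.eye(n)) + G.T @ G
    def inv_quarter(M):  # M^{-1/4} via eigendecomposition
        vals, vecs = np.linalg.eigh(M)
        return vecs @ np.diag(np.maximum(vals, eps) ** -0.25) @ vecs.T
    return W - lr * inv_quarter(state["L"]) @ G @ inv_quarter(state["R"])
```

The sketch shows the expressivity hierarchy the thread refers to: one scalar, then O(d²) entries, with Shampoo sitting between as a structured (Kronecker-factored) approximation to the full matrix.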
Zhi Su (@zhisu22):

🏓🤖 Our humanoid robot can now rally over 100 consecutive shots against a human in real table tennis — fully autonomous, sub-second reaction, human-like strikes.