Ziteng Sun (@sziteng)'s Twitter Profile
Ziteng Sun

@sziteng

Responsible and efficient AI.
Topics: LLM efficiency; LLM alignment; Differential Privacy; Information Theory. Research Scientist @Google; PhD @Cornell

ID: 3020905377

Link: http://zitengsun.com | Joined: 06-02-2015 03:04:03

67 Tweets

428 Followers

388 Following

Ziteng Sun (@sziteng):

Inference-time procedures (e.g. Best-of-N, CoT) have been instrumental to the recent development of LLMs. The standard RLHF framework focuses only on improving the trained model. This creates a train/inference mismatch.

Can we align our model to better suit a given inference-time …
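For context on the inference-time procedure named above, here is a minimal, generic sketch of Best-of-N sampling. The callables generate and reward, and the choice of n, are hypothetical placeholders and not the authors' actual setup.

# Minimal sketch of Best-of-N sampling: draw n candidates and keep the one
# a reward model scores highest. `generate` and `reward` are hypothetical
# stand-ins for a policy-model sampling call and a learned reward model.
from typing import Callable, List

def best_of_n(prompt: str,
              generate: Callable[[str], str],
              reward: Callable[[str, str], float],
              n: int = 8) -> str:
    """Sample n candidate responses and return the highest-reward one."""
    candidates: List[str] = [generate(prompt) for _ in range(n)]
    scores = [reward(prompt, c) for c in candidates]
    best_idx = max(range(n), key=lambda i: scores[i])
    return candidates[best_idx]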
Beidi Chen (@beidichen):

⏰📢 After years of working on long-context efficiency, I've started to doubt whether it's truly necessary (many of you have probably noticed the decline of interest in long LLMs). Despite strong models like Gemini, short-context + retrieval often does the trick—faster, cheaper, and …
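For readers unfamiliar with the pattern, the sketch below is a generic illustration of "short-context + retrieval": embed the question, keep only the top-k most similar chunks, and answer with a short-context model over those chunks. The embed and llm_answer callables are hypothetical placeholders, not any specific system referenced in the tweet.

# Generic retrieve-then-answer sketch (plain dot-product similarity).
import numpy as np

def retrieve_then_answer(question, chunks, embed, llm_answer, k=4):
    q = np.asarray(embed(question), dtype=float)                  # query embedding
    sims = [float(np.dot(q, np.asarray(embed(c), dtype=float))) for c in chunks]
    top = sorted(range(len(chunks)), key=lambda i: sims[i], reverse=True)[:k]
    context = "\n\n".join(chunks[i] for i in top)                 # short retrieved context
    return llm_answer(question, context)                          # short-context model call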

Hongyang Zhang (@hongyangzh):

Jointly announcing EAGLE-3 with SGLang: setting a new record in LLM inference acceleration!
- 5x 🚀 over vanilla (on HF)
- 1.4x 🚀 over EAGLE-2 (on HF)
- A record of ~400 TPS on Llama 3.1 8B with a single H100 (on SGLang)
- 1.65x 🚀 in latency even for a large bs=64 (on SGLang)
- A new …

Nived Rajaraman (@nived_rajaraman):

Announcing the first workshop on Foundations of Post-Training (FoPT) at COLT 2025!

📝 Soliciting abstracts/posters exploring theoretical & practical aspects of post-training and RL with language models!
🗓️ Deadline: May 19, 2025
Ahmad Beirami @ ICLR 2025 (@abeirami):

Happening now at poster E-2804. 

Come talk to us about why reward calibration is key to alignment and how to do RLHF for test-time scaling.
Ahmad Beirami @ ICLR 2025 (@abeirami):

The main ingredient that led to GRPO's performance leap is the calibration of the reward/value via multiple rollouts per prompt.

Let me elaborate on what I mean by that and a cheaper way of doing it offline.
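To make "calibration via multiple rollouts per prompt" concrete, below is a minimal sketch of the group-relative baseline GRPO is commonly described as using: normalize each rollout's reward by the mean and standard deviation of its prompt's group. The full GRPO objective (clipped policy ratio, KL penalty) and the offline variant mentioned in the thread are omitted here; this is an illustration, not the author's exact method.

# Group-relative reward calibration: per-prompt z-scored rewards.
import numpy as np

def group_relative_advantages(rollout_rewards, eps=1e-6):
    """rollout_rewards: list with one list of rollout rewards per prompt."""
    advantages = []
    for rewards in rollout_rewards:
        r = np.asarray(rewards, dtype=float)
        advantages.append((r - r.mean()) / (r.std() + eps))  # calibrated within the group
    return advantages

# Example: two prompts, four rollouts each.
print(group_relative_advantages([[1.0, 0.0, 0.5, 1.0], [0.2, 0.8, 0.4, 0.6]]))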