Haoyu Zhao (@thomaszhao1998)'s Twitter Profile
Haoyu Zhao

@thomaszhao1998

PhD student @Princeton, Research Intern @MSFTResearch. Recently interested in theorem proving.

ID: 3334088338

Link: http://hyzhao.me

Joined: 19-06-2015 04:53:14

15 Tweets

53 Followers

50 Following

Abhishek Panigrahi (@abhishek_034)

ICML Conference **paper alert** Fine-tuning LLM on a task gives it new skill. Our “Skill localization” paper shows this skill lives in < 0.01% parameters; rest can be reverted to pre-trained values. 1/6 With Nikunj Saunshi, Haoyu Zhao, Sanjeev Arora. Link: arxiv.org/abs/2302.06600

Kaifeng Lyu (@vfleaking)

Fine-tuning can improve chatbots (e.g., Llama 2-Chat, GPT-3.5) on downstream tasks — but may unintentionally break their safety alignment. Our new paper: Adding a safety prompt is enough to largely mitigate the issue, but be cautious about when to add it! arxiv.org/abs/2402.18540

Sanjeev Arora (@prfsanjeevarora)

Quanta Magazine featured our work on emergence of skill compositionality (and its limitations) in LLMs among the CS breakthroughs of the year: tinyurl.com/5f5jvzy5. Work was done over 2023 at Google DeepMind and Princeton PLI. Key pieces: (i) mathematical framework for

Ori Press (@ori_press)

Do language models have algorithmic creativity? To find out, we built AlgoTune, a benchmark challenging agents to optimize 100+ algorithms like gzip compression, AES encryption and PCA. Frontier models struggle, finding only surface-level wins. Lots of headroom here!🧵⬇️
