Yifei He (@heyifei99)'s Twitter Profile
Yifei He

@heyifei99

CS PhD student @IllinoisCDS · Alumnus @UMichCSE & @sjtu1896 · Intern @Microsoft @AmazonScience

ID: 3427796031

Link: http://yifei-he.github.io · Joined: 17-08-2015 10:49:11

33 Tweets

164 Followers

444 Following

Yifei He (@heyifei99):

I will be at #EMNLP2024 in Miami to present this work. Check out our poster at poster session F on 11/14 (Thu) at 10:30-12:00 in Riverfront Hall!

Shunyu Yao (@shunyuyao12):

My mental model of a good paper:
1. Solves some questions
2. Opens up new questions or directions for future research

Reviewers nowadays:
- Ignores things in 1
- Asks why you have not done everything in 2

Yifei He (@heyifei99):

Check out one of my earlier works during PhD on how to tackle large distribution shift by learning from generated intermediate domains!

Cindy Zeng (@cindy2000_sh):

[1/N] #ML #LLM #ModelMerging Paper: arxiv.org/pdf/2502.01015 We build theory to explain why task arithmetic works, and propose Task Vector Bases, a scalable model editing method grounded in it. With Yifei He, @youweiqiu, Yifan, Hubert, myamada0, Han Zhao. 🧵 Dive in below:

Yifei He (@heyifei99):

The multilingual scaling law is accepted at #ACL2025! Check out how to compute optimal sampling ratios of languages to design your multilingual pretraining mixture for any model size!

Yifei He (@heyifei99):

🚀Excited to introduce WebSTAR: a scalable data synthesis and filtering pipeline for building Computer Use Agents (CUAs) without human annotations! 🛑What is limiting scalable offline training of CUAs? 1️⃣ Scarcity of high-quality trajectory data. 2️⃣ High cost of GUI interaction.