Yinglun Zhu
@yinglun122
Assistant Prof @UCRiverside. PhD @WisconsinCS. Research on Efficient ML, RL, and LLMs.
ID: 1598332829531873283
http://yinglunz.com · Joined 01-12-2022 15:07:13
41 Tweets
337 Followers
341 Following
Replying to Lucas Beyer (bl16): Thank you for your reply, big fan of your work! A large performance boost indeed comes from fixing evals (as in fig. 1), but TTM adds additional nontrivial gains, allowing SigLIP to outperform GPT-4.1. I think TTM can be deployed in certain cases, e.g., adapting models to…
Happening this Tuesday, 1:30 PST @ NeurIPS: Foundations of Imitation Learning: From Language Modeling to Continuous Control, a tutorial with Adam Block & Max Simchowitz.
The 10-digit addition transformer race is getting ridiculous and fun! Started with 6k params (Claude Code) vs 1.6k (Codex). We're now at 139 params hand-coded and 311 trained. I made AdderBoard to keep track:
🏆 Hand-coded:
139p: Wonderfall
177p: Xan Morice-Atkinson
🏆 Trained:
311p