Zeyuan Allen-Zhu (@ZeyuanAllenZhu) Twitter Tweets • TwiCopy

2 weeks ago

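A lovely drawing. Eugeo Zuberg also appears in Part 3.2. I had planned to use Alice Eugeo Zuberg, but Alice is too close to Anya. To be honest, before settling on "Physics of LM", I was considering "Alicization" as the series title.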

I shouldn't have said Common Crawl is 'junk'. Thanks to Common Crawl's CTO for correcting me. What we meant is that a lot of the knowledge in CC (e.g. the serial number of a random product) may not be useful. We synthetically generate data to mimic such knowledge, and we refer to that as junk.

Zeyuan Allen-Zhu

2 weeks ago

Incredibly honored to have worked with Avi as his postdoc. Avi's vision certainly goes beyond the theory of computation. He asked me in 2016 whether I believed gradient descent can solve everything. He had probably already envisioned AGI by that point. 👍

Zeyuan Allen-Zhu

1 month ago

Another example of how simple things work

Zeyuan Allen-Zhu

2 months ago

Did anyone notice: if a paper title has a period (or perhaps a colon) in it, I lose many citations. For instance, Quanquan Gu's Rephrase paper cites Part 3.2, but the citation isn't on Google Scholar. scholar.google.com/scholar?oi=bib…
Should I use Part 3A, 3B, 3C instead? Who else has cited our work?

Zeyuan Allen-Zhu

2 months ago

Amazing team to work with!

Zeyuan Allen-Zhu

3 months ago

Zeyuan Allen-Zhu

4 months ago

Truly heartbroken to see that nowadays we have to explicitly reiterate that calling for genocide (against any group) is violence and should be prohibited.

Zeyuan Allen-Zhu

4 months ago

Huge thanks to my coauthors, and especially the first author, Cathy Li, who is just amazing at handling this huge project. Before this, I thought my coauthors were just being crazy... I was wrong: LLMs (+ human designs) can break some cryptosystems.

Zeyuan Allen-Zhu

5 months ago

Many have started to talk about 'reread my request and try again'. For knowledge questions, we made it clear why 'try again' works. The knowledge is first loaded; then, in the repeated run, the model sees it and can manipulate that knowledge in context. See the examples in the figures, such as 'Tell me why.'
