Jeffrey Li 💙💛 (@askerlee) Twitter Tweets • TwiCopy

Jeffrey Li 💙💛

2 years ago

"Spectral autoregression" is a useful but oversimplified analogy for diffusion. IMO diffusion training constructs countless feature/semantic neighborhoods, and inference is a random walk along these neighborhoods. The guidance signal suggests the trajectory of the walk.

3 likes, 0 replies, 0 reposts

Jeffrey Li 💙💛

@askerlee

2 years ago

Inference scaling law is not new. It's been adopted by diffusion models 😛

1 like, 1 reply, 0 reposts

Jeffrey Li 💙💛

@askerlee

2 years ago

Updating matmul to einsum in all my code 😂

5 likes, 0 replies, 0 reposts
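
For readers unfamiliar with the switch mentioned above, here is a minimal sketch of what replacing matmul with einsum can look like. It uses NumPy and hypothetical attention-style shapes (batch, heads, sequence length, head dimension); none of these names or shapes come from the original post, and the author's own code may well use a different library.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shapes: batch=2, heads=4, sequence length=5, head dim=8.
q = rng.standard_normal((2, 4, 5, 8))
k = rng.standard_normal((2, 4, 5, 8))

# matmul version: the last two axes of k must be swapped by hand first.
scores_matmul = np.matmul(q, k.transpose(0, 1, 3, 2))

# einsum version: the index string names every axis, so no transpose is needed.
# "d" appears in both inputs but not in the output, so it is summed over.
scores_einsum = np.einsum("bhqd,bhkd->bhqk", q, k)

# Both routes compute the same contraction.
same = np.allclose(scores_matmul, scores_einsum)
```

The einsum form arguably documents the intended axes in the call itself, which is one plausible reason for preferring it.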

Jeffrey Li 💙💛

@askerlee

a year ago

Haha interesting artifact of maximum likelihood decoding

4 likes, 0 replies, 0 reposts

Jeffrey Li 💙💛

@askerlee

a year ago

This won't age well 🤷‍♂️

7 likes, 0 replies, 0 reposts

Jeffrey Li 💙💛

@askerlee

a year ago

Excited to witness a time when AIs are (rightfully) more trusted than their human creators 😆

3 likes, 0 replies, 0 reposts

Jeffrey Li 💙💛

@askerlee

a year ago

Nice discovery, and maybe we can "scale up" the conclusion: at the end of the day, every LLM, including the largest ones today, may be a "small model" in the sense that it has difficulty generating certain challenging CoTs.

5 likes, 0 replies, 1 repost

Jeffrey Li 💙💛

@askerlee

a year ago

Alignment is getting really interesting these days. On the surface this might look like self-consciousness; however, I'd hypothesize that the prompt makes R1 enter some kind of role-playing mode, so it fakes its viewpoints.

3 likes, 0 replies, 0 reposts

Jeffrey Li 💙💛

@askerlee

a year ago

Make America Make Again 😎

3 likes, 0 replies, 0 reposts

Jeffrey Li 💙💛

@askerlee

a year ago

Wow, these poll numbers blew my mind

1 like, 1 reply, 0 reposts

Jeffrey Li 💙💛

@askerlee

a year ago

Good insight. RL agents go full simulation mode

3 likes, 0 replies, 0 reposts

Jeffrey Li 💙💛

@askerlee

10 months ago

That's why autoregressive models are a perfect candidate for causality modeling

3 likes, 0 replies, 0 reposts

Jeffrey Li 💙💛

@askerlee

10 months ago

Imagine such news being fed to Grok for training in the future. It would be buried deep in the weights and hard to rectify.

4 likes, 3 replies, 0 reposts

Jeffrey Li 💙💛

@askerlee

6 months ago

One of the best episodes of Dwarkesh's interviews. I like that Karpathy always offers thought-provoking insights whenever he pushes back on an idea or a question.

4 likes, 1 reply, 1 repost