Dave (@dmvaldman) Twitter Tweets • TwiCopy

2 days ago

Great paper, arguing that emergent abilities are a function only of pretraining loss, not of model or dataset size.

i.e., if you (inefficiently) overtrain a small model to the loss of GPT-4, you'd get all the abilities of GPT-4.

arxiv.org/abs/2403.15796

swyx

@swyx

4 days ago

demos n chill 8 thread

Dave garmin dashboard “measuring my brain juice”

34 likes, 4 reposts

Dave

5 days ago

Fun thought experiment: what if the input into Sora wasn't text, but the motion sensor data of a robot?

It turns its head, and the scene rotates. It lifts its arm, and a hand comes into view, etc. Doesn't need eyes.

Dave

1 week ago

Phi-3 'paper' TLDR

20 likes, 1 repost

Dave

1 week ago

ORPO is the shampoo & conditioner 2 in 1 of RLHF

we've bundled too far

3 likes

Dave

1 week ago

If the outputs are the same, but the means are different, Yann would be so much happier.

Too bad no one else would care.

5 likes

Dave

1 week ago

I was so worried the big AI labs were no longer publishing their research and I'd be left behind.

But it turns out it's all still train big models on lots of data.

Dave

1 week ago

Technology is making us less conscious, but consciousness overall is increasing.

6 likes

Dave

1 week ago

An interesting AI math question: can you generate text with higher entropy than human text with an LLM? I'm looking at you 'Backdoors of Claude' people.

If so, how can a compression machine also be a decompression machine?

Dave

1 week ago

When I read something that changes my mind, I find it hard to believe that this was caused by a change in the strengths of my neurons. Am I wrong?

6 likes

Dave

1 week ago

This paper is an implementation of self-awareness masquerading as 'making quadratic attention more efficient'.

11 likes

Dave

2 weeks ago

I'm not fine-tuning! I'm reality constructing, belief propagating, personality incepting.

3 likes

FleetingBits

@fleetingbits

2 weeks ago

what happens when the means of production becomes the labor as well?

8 likes, 2 reposts

Dave

2 weeks ago

I was pretty happy with my next move for black. Desperate times.

3 likes

Dave

2 weeks ago

from me as you

Riley Goodside

@goodside

3 weeks ago

AI-generated sad girl with piano performs the text of the MIT License

Dave

3 weeks ago

300,000 years ago System 2 came out from System 1. But suddenly, a few years ago and to everyone's shock, System 1 came out of System 2! Now, there's a rush to build System 2 again. And then, sometime in the future, it will build a new System 1, on some distant planet, probably.

5 likes