Sholto Douglas (@_sholtodouglas)'s Twitter Profile
Sholto Douglas

@_sholtodouglas

Scaling Gemini @Deepmind - working towards intelligence too cheap to meter

ID:968053955845484545

26-02-2018 09:23:43

298 Tweets

14.8K Followers

857 Following

Corry Wang (@corry_wang)

1/ A followup thought: the more time I spend working in tech, the more inclined I’ve become towards believing in the “god of straight lines on charts”

Understanding the drivers of Moore’s law is hard, but blindly extrapolating Moore’s law was easy… and 100% worked for 50 years

Enrique Piqueras (@epiqueras1)

Heard this week:

“Somehow in the past week the MK turned into grad school. I feel like a TA again. What’s going on here? Is this a psyop to make me and JAX Matt nostalgic?” - JAX Roy

“Stack more layers” they said.
“Forget about your reproducing kernel Hilbert spaces” they said.…

Sholto Douglas (@_sholtodouglas)

Thanks Corry :)
If you could all read Corry’s internal strategy briefings they’d be some of the most read posts on here - what he can share in tweets is worth watching very carefully!

Aravind Srinivas (@AravSrinivas)

Sholto is a great example for people who are outside the mainstream AI community that they can come learn things quickly and have a massive impact. Google is lucky to have him.

rahul (@0interestrates)

sf is a magical place cause it’s just your boys hanging out talking about the similarities between cerebellum and the transformer residual stream and how much faster g*mini would be with 10x more compute and it turns into a whole podcast

Trenton Bricken (@TrentonBricken)

Dwarkesh Patel asked fantastic questions and Sholto Douglas was a wonderful co-guest.

I’m lucky to call them both friends and to have all our conversations.

I hope you find this conversation interesting!

Sholto Douglas (@_sholtodouglas)

One of the best parts of SF is hanging out with my good friends Dwarkesh Patel and Trenton Bricken.

Dwarkesh is the best interviewer in the world - and I hope this gives you a good feeling for what it’s like to be on the ground in the labs. It only gets crazier from here!

Dwarkesh Patel (@dwarkesh_sp)

Had so much fun chatting with my friends Trenton Bricken and Sholto Douglas.

No way to summarize it, except:

This is the best context dump out there on how LLMs are trained, what capabilities they're likely to soon have, and what exactly is going on inside them.

You would be…

Dwarkesh Patel (@dwarkesh_sp)

“There’s a future where the distinction between small and large models disappears.

And with long context, fine tuning might disappear.

You can imagine a future where you just have a dynamic bundle of compute.

And infinite context specializes your model.”

Sholto Douglas

Daniel Gross (@danielgross)

Given exponential increase in training costs, compute multipliers might become the most coveted secrets on earth. Some of those will be in torch.nn; many will be in silicon.

Dwarkesh Patel (@dwarkesh_sp)

Given that you need 100x more effective compute between model generations, if we don’t get AGI by GPT-7, will we just never get it?

Sholto Douglas: “GPT-4 costs, let's call it, $100 million. The $1B, $10B, and $100B run, all seem very plausible by private company standards. You…
