Saurabh Shah (@saurabh_shah2) Twitter Tweets • TwiCopy

Saurabh Shah

@saurabh_shah2

training olmos @allen_ai prev @Apple @Penn 🎤dabbler of things🎸 🐈‍⬛enjoyer of cats 🐈 and mountains🏔️he/him

ID: 1599170691714138114

Link: https://learnycurve.substack.com

Joined: 03-12-2022 22:36:25

957 Tweets

1.1K Followers

1.1K Following

Saurabh Shah

@saurabh_shah2

5 months ago

mostly agree but you shouldn't *only* make decisions based on asymptotes. The constants matter. e.g. Cursor has already created a ridiculous amount of value and that will continue. They'll exit, pivot, or start training their own models before the asymptotic trends kick in

Likes: 4, Replies: 1, Retweets: 0

Saurabh Shah

@saurabh_shah2

5 months ago

going backpacking with my friends. spreading the gospel. everyone will come out of the forest bitter lesson-pilled

Likes: 10, Replies: 4, Retweets: 0

Saurabh Shah

@saurabh_shah2

5 months ago

Hmmm 🤔

Likes: 2, Replies: 0, Retweets: 0

Saurabh Shah

@saurabh_shah2

5 months ago

Why is it important that there exist leading open models built in the west? "We should have open models that reflect western values" can feel vague and hand-wavy. Here's a deepseek screenshot to make it concrete. This is not a model you can do e.g. factuality research on

Likes: 10, Replies: 2, Retweets: 0

Saurabh Shah

@saurabh_shah2

5 months ago

i really didn't think the gpt-oss was gonna be this good holy moly

Likes: 9, Replies: 1, Retweets: 0

Saurabh Shah

@saurabh_shah2

5 months ago

got setup in 2 mins. good release.

Likes: 1, Replies: 0, Retweets: 0

Saurabh Shah

@saurabh_shah2

5 months ago

a second openAI open-weights model has hit hf

Likes: 4, Replies: 0, Retweets: 0

Saurabh Shah

@saurabh_shah2

5 months ago

holy shit they're gonna release 5 more open models

Likes: 10, Replies: 1, Retweets: 0

Saurabh Shah

@saurabh_shah2

4 months ago

Does Isomorphic Labs write anything in the public? Where can I learn more abt AI for drug discovery (or knowledge discovery in general)? Is anyone e.g. using alphafold as an RL env to derive rewards from?

Likes: 8, Replies: 2, Retweets: 0

Saurabh Shah

@saurabh_shah2

4 months ago

uv is good

Likes: 2, Replies: 0, Retweets: 0

Saurabh Shah

@saurabh_shah2

4 months ago

Time to finally learn what a protein actually is

Likes: 26, Replies: 1, Retweets: 0

Saurabh Shah

@saurabh_shah2

4 months ago

nooo dw your 7B dense is actually perfect. The big MoE's scare me 😅

Likes: 16, Replies: 1, Retweets: 0

Saurabh Shah

@saurabh_shah2

4 months ago

yayyy

Likes: 2, Replies: 0, Retweets: 0

Saurabh Shah

@saurabh_shah2

4 months ago

I intend to acquire cohere after its acquisition of perplexity after their acquisition of TikTok and google chrome

Likes: 11, Replies: 0, Retweets: 1

Saurabh Shah

@saurabh_shah2

4 months ago

life update: for those who don’t know, i joined Ai2 a few months ago to work on open source AGI. incredibly excited about what we’re building 🚀

Likes: 148, Replies: 2, Retweets: 3

Saurabh Shah

@saurabh_shah2

4 months ago

Introducing synth-bench -> how good is your LM at generating data for other LM's? Olmo best is weirdly good at this, apparently!

Likes: 16, Replies: 0, Retweets: 1

Saurabh Shah

@saurabh_shah2

4 months ago

david is one of the most thoughtful people I've met in general, and he's definitely the most thoughtful person I've met when it comes to how to interpret eval scores of language models. ty david + team for helping us make sound decisions!! (P.S. go follow David Heineman)

Likes: 8, Replies: 0, Retweets: 0

Saurabh Shah

@saurabh_shah2

4 months ago

The space of sequential ops is more expressive, but parallel is more efficient. Beauty of the transformer is it kinda defies this tradeoff

Likes: 7, Replies: 1, Retweets: 0