Tom Varsavsky (@tomvarsavsky)'s Twitter Profile
Tom Varsavsky

@tomvarsavsky

ML @Wayve | Previously an academic @UCL and @KingsImaging working on AI in NeuroImaging and COVID-19

ID: 281066401

Joined: 12-04-2011 15:19:50

262 Tweets

491 Followers

852 Following

Tom Varsavsky (@tomvarsavsky):

Important problem raised at the Transformer panel: we are using models that are too smart for problems that are too easy, e.g. a trillion-parameter model being used to add 2+2. How can we match the complexity of the problem to the complexity of the model? #GTC2024

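One way to act on this is complexity-based routing: send easy queries to a cheap small model and hard ones to a frontier model. A minimal sketch follows; the difficulty heuristic and model names are my own illustrative assumptions, not anything proposed on the panel.

# Hedged sketch of complexity-based routing. The heuristic and the model
# names below are illustrative assumptions only.
def estimate_difficulty(prompt: str) -> float:
    """Toy heuristic: long prompts and reasoning keywords score as harder."""
    score = min(len(prompt) / 500.0, 1.0)
    if any(w in prompt.lower() for w in ("prove", "derive", "plan", "analyse")):
        score += 0.5
    return min(score, 1.0)

def route(prompt: str) -> str:
    """Pick a model whose size roughly matches the problem's difficulty."""
    return "small-model" if estimate_difficulty(prompt) < 0.3 else "frontier-model"

print(route("What is 2+2?"))                                                      # -> small-model
print(route("Derive the gradient of multi-head attention and plan ablations."))  # -> frontier-model
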
Tom Varsavsky (@tomvarsavsky):

Another great piece by Air Street Capital on the economics of frontier LLMs: spend $100M on a jet and it will fly for 30 years; spend $100M training an LLM and it will last for 30 weeks. press.airstreet.com/p/alchemy-is-a…
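
The arithmetic behind that comparison, per week of useful life (the $100M figure and the lifetimes are the tweet's; the weekly amortisation framing is mine):

# Capex amortised per week of useful life, using the tweet's figures.
jet_cost = llm_cost = 100e6        # $100M each
jet_weeks = 30 * 52                # ~30 years of service
llm_weeks = 30                     # ~30 weeks before the frontier moves on

print(f"jet: ${jet_cost / jet_weeks:,.0f} per week")   # ≈ $64,103 per week
print(f"llm: ${llm_cost / llm_weeks:,.0f} per week")   # ≈ $3,333,333 per week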

Tom Varsavsky (@tomvarsavsky):

ChatGPT has close to 200M users because it became a household name. A well-marketed basic chat interface with chat histories, connected to Llama3 (which costs 1/20th the price to run) and priced at $5 per month, could take a decent market share away and have solid unit economics.
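
Rough back-of-envelope unit economics for that idea. Only the $5 price and the 1/20th cost ratio come from the tweet; per-user usage and the frontier-model price are assumed figures:

# Margin per user per month under the stated assumptions.
price_per_user = 5.00                            # $/month, from the tweet
tokens_per_user = 500_000                        # assumed monthly usage
frontier_cost_per_1k = 0.03                      # assumed frontier price, $/1k tokens
llama3_cost_per_1k = frontier_cost_per_1k / 20   # tweet's 1/20th ratio

serving_cost = tokens_per_user / 1_000 * llama3_cost_per_1k
print(f"serving cost ≈ ${serving_cost:.2f}/user/month")        # ≈ $0.75
print(f"gross margin ≈ ${price_per_user - serving_cost:.2f}")  # ≈ $4.25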

Tom Varsavsky (@tomvarsavsky):

Such exciting work 🀩 Anthropic introduces a completely new way to customise a pre-trained LLM at inference time by activating a feature with a known meaning. An open-source version on Llama3 would inspire new research and applications.
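
A toy illustration of the idea (steering behaviour at inference time by activating a feature direction). The actual work identifies interpretable sparse-autoencoder features inside a frontier model; the layer, direction, and scale below are placeholders on a random network:

import torch
import torch.nn as nn

torch.manual_seed(0)
hidden = 16
model = nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU(), nn.Linear(hidden, hidden))

# Stand-in for a learned feature direction with a known meaning.
feature_direction = torch.randn(hidden)
feature_direction = feature_direction / feature_direction.norm()
steering_scale = 5.0                                # how strongly to activate the feature

def steer(module, inputs, output):
    # Add the scaled feature direction to this layer's activations.
    return output + steering_scale * feature_direction

x = torch.randn(1, hidden)
baseline = model(x)

handle = model[0].register_forward_hook(steer)      # hook the first layer's output
steered = model(x)
handle.remove()

print((steered - baseline).norm())                  # the activated feature shifted the output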

Tom Varsavsky (@tomvarsavsky):

Exponentially lower cost per token at GPT-4-level performance is a very safe bet. Exponential improvement in intelligent capabilities is a much riskier bet. People seem to be indiscriminately betting on both of these trends when investing capital and building AI product roadmaps.
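
To make the asymmetry concrete (the 10x/year decline rate is an assumed figure, not a claim from the tweet):

# If cost per token at a fixed capability level falls ~10x per year,
# today's $1.00 query costs about a cent in two years, regardless of
# whether capabilities improve at all.
cost_today = 1.00
annual_decline = 10
print(cost_today / annual_decline ** 2)   # 0.01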

Tom Varsavsky (@tomvarsavsky):

Duplicated efforts on scaling have slowed the progress towards AGI that we could have made in 2020-24. We need more step-change breakthroughs (transformers, diffusion, CL, etc.), which so far have all come from open science.

Tom Varsavsky (@tomvarsavsky):

I find it remarkable that my 6-month-old daughter has better fine-motor grasping skills than the best robotic arms. All learned with little supervision and only a small amount of imitation.

Tom Varsavsky (@tomvarsavsky):

Leaving this as a prediction for the future: we've tapped out on exam-style evals; the next stage is agent evals. These will initially be run in controlled environments and eventually be measured simply by the revenue they can generate in the real world.