Tom Varsavsky (@tomvarsavsky)'s Twitter Profile
Tom Varsavsky

@tomvarsavsky

ML @Wayve | Previously an academic @UCL and @KingsImaging working on AI in NeuroImaging and COVID-19

ID: 281066401

Joined: 12-04-2011 15:19:50

262 Tweets

491 Followers

852 Following

Tom Varsavsky (@tomvarsavsky):

Important problem raised at the Transformer panel: we are using models that are too smart for problems that are too easy, e.g. a trillion-parameter model being used to add 2+2. How can we match the complexity of the problem to the complexity of the model? #GTC2024

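One way to act on this is complexity-based routing: send easy queries to a cheap small model and hard ones to a frontier model. A minimal sketch follows; the difficulty heuristic and model names are my own illustrative assumptions, not anything proposed on the panel.

# Hedged sketch of complexity-based routing. The heuristic and the model
# names below are illustrative assumptions only.
def estimate_difficulty(prompt: str) -> float:
    """Toy heuristic: long prompts and reasoning keywords score as harder."""
    score = min(len(prompt) / 500.0, 1.0)
    if any(w in prompt.lower() for w in ("prove", "derive", "plan", "analyse")):
        score += 0.5
    return min(score, 1.0)

def route(prompt: str) -> str:
    """Pick a model whose size roughly matches the problem's difficulty."""
    return "small-model" if estimate_difficulty(prompt) < 0.3 else "frontier-model"

print(route("What is 2+2?"))                                                      # -> small-model
print(route("Derive the gradient of multi-head attention and plan ablations."))  # -> frontier-model
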
Tom Varsavsky (@tomvarsavsky):

Another great piece by Air Street Capital on the economics of frontier LLMs: spend $100M on a jet and it will fly for 30 years; spend $100M training an LLM and it will last for 30 weeks. press.airstreet.com/p/alchemy-is-a…
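
The arithmetic behind that comparison, per week of useful life (the $100M figure and the lifetimes are the tweet's; the weekly amortisation framing is mine):

# Capex amortised per week of useful life, using the tweet's figures.
jet_cost = llm_cost = 100e6        # $100M each
jet_weeks = 30 * 52                # ~30 years of service
llm_weeks = 30                     # ~30 weeks before the frontier moves on

print(f"jet: ${jet_cost / jet_weeks:,.0f} per week")   # ≈ $64,103 per week
print(f"llm: ${llm_cost / llm_weeks:,.0f} per week")   # ≈ $3,333,333 per week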

Tom Varsavsky (@tomvarsavsky):

ChatGPT has close to 200M users because it became a household name. A well-marketed basic chat interface with chat histories, connected to Llama3 (which costs 1/20th the price to run) and priced at $5 per month, could take a decent market share away and have solid unit economics.
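
Rough back-of-envelope unit economics for that idea. Only the $5 price and the 1/20th cost ratio come from the tweet; per-user usage and the frontier-model price are assumed figures:

# Margin per user per month under the stated assumptions.
price_per_user = 5.00                            # $/month, from the tweet
tokens_per_user = 500_000                        # assumed monthly usage
frontier_cost_per_1k = 0.03                      # assumed frontier price, $/1k tokens
llama3_cost_per_1k = frontier_cost_per_1k / 20   # tweet's 1/20th ratio

serving_cost = tokens_per_user / 1_000 * llama3_cost_per_1k
print(f"serving cost ≈ ${serving_cost:.2f}/user/month")        # ≈ $0.75
print(f"gross margin ≈ ${price_per_user - serving_cost:.2f}")  # ≈ $4.25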

Tom Varsavsky (@tomvarsavsky):

Such exciting work 🀩 Anthropic introduces a completely new way to customise a pre-trained LLM at inference time by activating a feature with a known meaning. An open-source version on Llama3 would inspire new research and applications.
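
A toy illustration of the idea (steering behaviour at inference time by activating a feature direction). The actual work identifies interpretable sparse-autoencoder features inside a frontier model; the layer, direction, and scale below are placeholders on a random network:

import torch
import torch.nn as nn

torch.manual_seed(0)
hidden = 16
model = nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU(), nn.Linear(hidden, hidden))

# Stand-in for a learned feature direction with a known meaning.
feature_direction = torch.randn(hidden)
feature_direction = feature_direction / feature_direction.norm()
steering_scale = 5.0                                # how strongly to activate the feature

def steer(module, inputs, output):
    # Add the scaled feature direction to this layer's activations.
    return output + steering_scale * feature_direction

x = torch.randn(1, hidden)
baseline = model(x)

handle = model[0].register_forward_hook(steer)      # hook the first layer's output
steered = model(x)
handle.remove()

print((steered - baseline).norm())                  # the activated feature shifted the output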

Tom Varsavsky (@tomvarsavsky):

Exponentially lower cost per token at GPT-4-level performance is a very safe bet. Exponential improvement in intelligent capabilities is a much riskier bet. People seem to be indiscriminately betting on both of these trends when investing capital and building AI product roadmaps.
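
To make the asymmetry concrete (the 10x/year decline rate is an assumed figure, not a claim from the tweet):

# If cost per token at a fixed capability level falls ~10x per year,
# today's $1.00 query costs about a cent in two years, regardless of
# whether capabilities improve at all.
cost_today = 1.00
annual_decline = 10
print(cost_today / annual_decline ** 2)   # 0.01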

Tom Varsavsky (@tomvarsavsky):

Duplicated efforts on scaling have slowed the progress towards AGI that we could have made in 2020-24. We need more step-change breakthroughs (transformers, diffusion, CL, etc.), which so far have all come from open science.

Tom Varsavsky (@tomvarsavsky):

I find it remarkable that my 6-month-old daughter has better fine-motor grasping skills than the best robotic arms. All learned with little supervision and only a small amount of imitation.

Tom Varsavsky (@tomvarsavsky):

Leaving this as a prediction for the future: we've tapped out on exam-style evals; the next stage is agent evals. These will initially be run in controlled environments and eventually be measured simply by the revenue they can generate in the real world.