Bayesian (@bayesian0_0)'s Twitter Profile
Bayesian

@bayesian0_0

top 5 manifold forecaster manifold.markets/Bayesian mostly predicting ai development

ID: 1808407868200345602

Joined: 03-07-2024 07:50:28

131 Tweets

118 Followers

913 Following

Trevor Levin (@trevposts)'s Twitter Profile Photo

Back for my monthly few minutes on Twitter and the main thought I want to share is that it seems like AI progress in H1 2025 was much slower than I feared after the crazy o3 benchmark numbers in December, so I now put less weight on very short (e.g. AI 2027-type) timelines.

Sheryl Hsu (@sherylhsu02)'s Twitter Profile Photo

The model solves these problems without tools like Lean or coding; it just uses natural language, and it only has 4.5 hours. We see the model reason at a very high level: trying out different strategies, making observations from examples, and testing hypotheses.

Bayesian (@bayesian0_0)'s Twitter Profile Photo

thought this too, and with nat lang IMO gold and upcoming GPT-5 I am looking forward to seeing a really important new data point wrt this

Neel Nanda (@neelnanda5)'s Twitter Profile Photo

Speaking as a past IMO contestant, this is impressive but misleading: gold vs silver is meaningless, and 1 pt below gold vs borderline gold is noise. The impressive bit is using a general reasoning model, not a specialised system, and no verified reward. Peak AI maths is unchanged.

Bayesian (@bayesian0_0)'s Twitter Profile Photo

I srsly thought you could get to AGI just by scaling 2023 LLMs, and I still think so. I never thought that was likely to be the first arch that got us to AGI, because new algorithmic improvements are found constantly. A lot of miscommunication happens around likely vs possible.

Bayesian (@bayesian0_0)'s Twitter Profile Photo

what are the odds that gpt-5's pretraining base is bigger than gpt-5 is, and gpt-5-main is the product of a distillation, vs the pretraining base being the same size as gpt-5? i'd make a market if this was publicly verifiable but since it isn't i ask y'all
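For readers unfamiliar with the setup the question assumes: "distillation" here means training a smaller student model to imitate a larger teacher's output distribution. A minimal sketch of the standard soft-label loss, with all shapes and the temperature chosen purely for illustration (nothing here reflects OpenAI's actual training recipe):

```python
# Illustrative sketch of distillation: a smaller "student" trained to match
# a larger pretrained "teacher". Sizes and temperature are assumptions.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence between temperature-softened teacher and student
    next-token distributions (Hinton et al.-style soft-label loss)."""
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    # Scale by T^2 so gradient magnitudes stay comparable across temperatures.
    return F.kl_div(log_soft_student, soft_teacher,
                    reduction="batchmean") * temperature ** 2

# Toy check: a batch of 4 positions over a 10-token vocabulary.
student = torch.randn(4, 10)
teacher = torch.randn(4, 10)
print(distillation_loss(student, teacher))
```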

Daniel Eth (yes, Eth is my actual last name) (@daniel_271828)'s Twitter Profile Photo

Kinda feel like there were pretty similar steps in improvement for each of: GPT2 -> GPT3, GPT3 -> GPT4, and GPT4 -> GPT5. It’s just that most of the GPT4 -> GPT5 improvement was already realized by o3, and the step from there to GPT5 wasn’t that big.

Matthew Barnett (@matthewjbar)'s Twitter Profile Photo

Every consumer good has consumer surplus, so this explanation is too general to explain much about AI in particular. A better explanation for why AI isn't meaningfully showing up in GDP is that AI has simply had a relatively small impact on economic production so far.

Epoch AI (@epochairesearch)'s Twitter Profile Photo

The higher the FrontierMath difficulty tier, the lower GPT-5 scored. This suggests a correlation between what mathematicians find difficult and what makes problems harder for AI systems to solve.

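To make the claimed relationship concrete: with per-tier accuracies in hand, the negative association can be summarized with a rank correlation. A minimal sketch, using made-up tier scores rather than Epoch AI's actual published numbers:

```python
# Hypothetical illustration of the claimed trend: accuracy falling as the
# FrontierMath difficulty tier rises. The scores below are invented for
# the sketch; they are NOT Epoch AI's published results.
from scipy.stats import spearmanr

tiers = [1, 2, 3, 4]               # difficulty tiers (1 = easiest)
scores = [0.40, 0.25, 0.10, 0.02]  # assumed GPT-5 accuracy per tier

rho, p_value = spearmanr(tiers, scores)
print(f"Spearman rho = {rho:.2f} (p = {p_value:.3f})")
# A rho near -1 reflects the monotone negative relationship the tweet
# describes: tiers mathematicians find harder track lower model scores.
```
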
Teortaxes▶️ (DeepSeek Twitter 🐋 die-hard fan 2023 – ∞) (@teortaxestex)'s Twitter Profile Photo

I repeat that this is cope for boomers who didn't do the math. The US has enough capacity to power their AGI race. Where this isn't true, hyperscalers will complete private power plants soon enough. China doesn't have nearly enough compute to make use of the power advantage.

Jeffrey Ladish (@jeffladish)'s Twitter Profile Photo

Update your AI timelines based on how pretrain + RL scaling is going, not based on OpenAI's naming conventions. GPT-5 is just the model OpenAI decided to call 5. They could have called GPT-4.5 or o3 "GPT-5" if they wanted to
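Ladish's point is a plain Bayesian one: condition on the scaling evidence itself, not on the label attached to a release. A toy sketch of that update, with every probability an illustrative assumption rather than anyone's real forecast:

```python
# Minimal sketch of the update Ladish describes: revise P(short timelines)
# on scaling evidence, not on model names. All numbers are assumptions.

def bayes_update(prior, p_evidence_given_h, p_evidence_given_not_h):
    """Posterior P(H | E) from a prior and the two likelihoods."""
    numerator = prior * p_evidence_given_h
    denominator = numerator + (1 - prior) * p_evidence_given_not_h
    return numerator / denominator

prior_short_timelines = 0.30  # assumed prior on e.g. AGI-by-2027
# Evidence: pretrain + RL scaling gains looked modest this cycle.
p_modest_if_short = 0.25      # modest gains are unlikely on short timelines
p_modest_if_long = 0.70       # and fairly likely on long ones

posterior = bayes_update(prior_short_timelines,
                         p_modest_if_short, p_modest_if_long)
print(f"P(short timelines | modest scaling gains) = {posterior:.2f}")  # ~0.13
```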