Simon Frieder (@friederrrr) 's Twitter Profile
Simon Frieder

@friederrrr

ML Researcher @Oxford. AIMO Prize Manager. Mathematician and Computer Scientist.
friederrr.org

Opinions my own.

ID: 1611796888507846658

Joined: 07-01-2023 18:48:37

92 Tweets

121 Followers

38 Following

Bartosz Naskręcki (@nasqret):

GPT-5-Pro solved, in just 15 minutes (without any internet search), the presentation problem known as “Yu Tsumura’s 554th Problem.” arxiv.org/pdf/2508.03685 This is the first model to solve this task completely. I expect more such results soon — the model demonstrates a strong

Simon Frieder (@friederrrr):

When ML researchers from different groups find out they have the same idea, more often than not it becomes a race to see who finishes first, gets the credit, and becomes first author. When mathematicians find out they have the same idea, more often than not they join forces, with

Simon Frieder (@friederrrr):

It's time to design the next-gen math curriculum: a standard core of mathematics, but then a whole new slew of tools and approaches. First comes Lean (which makes it necessary to teach quite a bit of software engineering best practices, including git). Then comes a set of

Simon Frieder (@friederrrr):

Can you solve this Olympiad-level problem? This is the kind of problem LLMs have to solve to compete at the third AI Math Olympiad.

Simon Frieder (@friederrrr):

AIMO2 ran for 5 months and had a total of 16,000 entrants (people who registered but did not necessarily submit a model; those who did are known as "participants"). AIMO3 has been running for less than a day and already has 2,000 entrants. At this pace we might need to source more H100s :D

Simon Frieder (@friederrrr):

Longevity is a very interesting subject, at the intersection of biology & benchmarks. I recognize some ML names in this paper that did an analysis of health interventions at a scale never before achieved. Bryan Johnson take note ;) biorxiv.org/content/10.110…

Simon Frieder (@friederrrr):

One more nail in the coffin for a broken reviewing system. Who knows how many people silently used this backdoor to see who gave them bad scores -> potentially career-damaging. It would be much better to have a comment section in arXiv, this would solve 80% of the existing

Simon Frieder (@friederrrr):

A 5-point update on how the latest release of DeepSeek-Math-V2 affects the AIMO. In the meantime, working hard to get this 700GB monster to run on my 8x H100 to test it. Can't wait for Unsloth and some of the other quantization wizards to release smaller versions!

AIMO Prize (@aimoprize):

AIMO3 is full of surprises: week 2 (out of 21) just concluded. After a race in the first week that had us both biting our nails to see how quickly the leaderboard was rising and cheering for the progress of open-weight LLMs, the leaderboard suddenly ground to a halt.

Simon Frieder (@friederrrr):

2023: It's hard to devise an LLM that solves a math problem. 2025: It's hard to devise a math problem that stumps an LLM. (...this in the context of competitive math questions, but we'll also get to research-level math soon)