Dmitry Rybin (@dmitryrybin1) Twitter Tweets • TwiCopy

Dmitry Rybin

@dmitryrybin1


PhD at CUHK || ML for Math, Search, Planning || Grand First Prize at IMC || AI + Math

ID: 1521776776380239873

Link: https://www.linkedin.com/in/rybindmitry
Joined: 04-05-2022 09:00:53

402 Tweets

1.1K Followers

127 Following

Dmitry Rybin

@dmitryrybin1

4 months ago

UPD: This approach by Deep Think has an error that is not fixable. Thanks to Daniel Litt for pointing it out.

50 likes, 3 replies, 5 reposts

I almost always see LLM as:
1. context dependent distribution π(⋅ | context)
2. algorithm that transforms compute into correct answer with some probability, e.g. 2 min of H100 -> 60% success rate on task X
(more applicable to specific verifiable tasks in math or programming)

3 likes, 0 replies, 0 reposts
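
The "compute into success probability" view above can be illustrated with a minimal sketch. It assumes that each attempt is independent and has the same success rate, which the post does not claim; the function name `success_after_k_attempts` is hypothetical.

```python
def success_after_k_attempts(p_single: float, k: int) -> float:
    """Probability that at least one of k attempts succeeds.

    Assumption (not from the post): attempts are independent and each
    succeeds with the same probability p_single.
    """
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be in [0, 1]")
    if k < 0:
        raise ValueError("k must be non-negative")
    return 1.0 - (1.0 - p_single) ** k


# Using the post's illustrative figure of a 60% success rate per attempt.
for k in (1, 2, 4):
    print(k, round(success_after_k_attempts(0.6, k), 4))
```

Under these assumptions, spending more compute on repeated attempts raises the chance of at least one correct answer, which only helps on tasks where a correct answer can be verified.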

Dmitry Rybin

@dmitryrybin1

4 months ago

Connection that i found some day in high-school:
1 + 2 + 3 + ... = ζ(-1) = - 1/12
1 + 2 + 3 + ... + n = n(n + 1)/2
The area between n(n + 1)/2 and x-axis is exactly 1/12
This is also true for all other values ζ(-k)

28 likes, 1 reply, 0 reposts
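
The area claim above can be checked directly. The post does not state the interval; the computation below assumes it is [-1, 0], the stretch between the two roots of x(x + 1)/2, where the curve lies below the x-axis. The signed area is then:

```latex
\[
\int_{-1}^{0} \frac{x(x+1)}{2}\,dx
  = \left[\frac{x^{3}}{6} + \frac{x^{2}}{4}\right]_{-1}^{0}
  = 0 - \left(-\frac{1}{6} + \frac{1}{4}\right)
  = -\frac{1}{12}
  = \zeta(-1).
\]

% Same check for k = 3, where 1^3 + 2^3 + ... + n^3 = n^2 (n+1)^2 / 4:
\[
\int_{-1}^{0} \frac{x^{2}(x+1)^{2}}{4}\,dx
  = \frac{1}{4}\left[\frac{x^{5}}{5} + \frac{x^{4}}{2} + \frac{x^{3}}{3}\right]_{-1}^{0}
  = -\frac{1}{4}\left(-\frac{1}{5} + \frac{1}{2} - \frac{1}{3}\right)
  = \frac{1}{120}
  = \zeta(-3).
\]
```

So the unsigned area is 1/12, and the signed area carries the minus sign of ζ(-1).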

Daniel Litt

@littmath

4 months ago

QTing this because it contains a (somewhat unusual for me) public prediction—that an AI tool will more-or-less autonomously resolve some mildly interesting open conjecture within the next year.

158 likes, 7 replies, 12 reposts

Dmitry Rybin

@dmitryrybin1

4 months ago

I think METR eval provides a good mental model for LLM capabilities here: Generalist model like this (or actually system of agents) can solve any problems that take human experts ~1.5 hours. Within this framework, IOI Gold is not too surprising

11 likes, 0 replies, 0 reposts

Khurram Javed

@khurramjaved_96

3 months ago

I did my best work after I stopped submitting to the big conferences three years ago. Not everyone has that privilege so I understand why people still try. A new venue would suffer from the same problems if a large number of papers at that venue lead to high paying jobs and

106 likes, 0 replies, 10 reposts

Dmitry Rybin

@dmitryrybin1

3 months ago

In mathematics this is known as local-to-global properties. You have some photo or text where everything looks ok locally. You patch the parts together - and it’s broken ☹️ just like this Chessboard. Mathematicians use sheaves to describe when local data patched together

11 likes, 0 replies, 1 repost

Dmitry Rybin

@dmitryrybin1

3 months ago

Software R&D is being redefined in front of our eyes. SAT solvers were developed over many years and here they are drastically improved within ~3 months of playing with AlphaEvolve-style evolution.
BTW Nvidia is interested in SAT solvers because they can make and verify chip designs.

10 likes, 0 replies, 3 reposts

Dmitry Rybin

@dmitryrybin1

3 months ago

If you inspect the source code of Thinky Machines blog, you will find some hidden paragraphs and RL plots.
The only other AI lab that does this is OpenAI, e.g. LaTeX source of o1 System Card has hidden evals for Person Identification with o1.

12 likes, 0 replies, 0 reposts

Dmitry Rybin

@dmitryrybin1

2 months ago

If you are working in pure math or theoretical computer science: keep in mind that there is a $500B multi-million GPU supercomputer pointed at automating your research

1.1K likes, 87 replies, 55 reposts

Dmitry Rybin

@dmitryrybin1

2 months ago

GRPO is not frontier and is broken in so many ways i don’t even know where to start.
~50% of GRPO budget is wasted on too easy/too difficult tasks (advantage = 0).
This work fixes it:

363 likes, 4 replies, 26 reposts
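
The "advantage = 0" point above can be seen in a minimal sketch of group-relative advantages. It assumes binary rewards and the commonly described normalization (reward minus group mean, divided by group standard deviation); the function name `group_relative_advantages` and the `eps` value are hypothetical, and real implementations may differ in details such as which standard deviation they use.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each reward against its own sampling group.

    Assumption: advantage_i = (r_i - mean(group)) / (std(group) + eps).
    eps only guards against division by zero when all rewards are equal.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


too_easy = [1.0, 1.0, 1.0, 1.0]  # every sample solves the task
too_hard = [0.0, 0.0, 0.0, 0.0]  # no sample solves the task
mixed = [1.0, 0.0, 0.0, 1.0]     # some samples succeed, some fail

for name, group in (("too_easy", too_easy), ("too_hard", too_hard), ("mixed", mixed)):
    print(name, group_relative_advantages(group))
```

When every sample in a group gets the same reward, all advantages are zero and that group contributes no gradient signal, so the compute spent sampling it is wasted. Only groups with mixed outcomes produce non-zero advantages.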

Dmitry Rybin

@dmitryrybin1

2 months ago

Encountered Daniel Litt in the wild


305 likes, 2 replies, 5 reposts

Dmitry Rybin

@dmitryrybin1

2 months ago

Would you believe that I found exactly the same things with Gemini Deep Think on some Erdős problems 2 months ago? But I never thought updating an outdated database entry from ‘open’ to ‘solved’ lands you a job at a frontier lab.

351 likes, 5 replies, 6 reposts

Sebastien Bubeck

@sebastienbubeck

2 months ago

My posts last week created a lot of unnecessary confusion*, so today I would like to do a deep dive on one example to explain why I was so excited. In short, it’s not about AIs discovering new results on their own, but rather how tools like GPT-5 can help researchers navigate,

1.1K likes, 151 replies, 162 reposts

Dmitry Rybin

@dmitryrybin1

a month ago

If you are among the researchers with access to DeepThink, AlphaProof, and AlphaEvolve - please share your experience publicly. Understanding the capabilities of these tools is extremely important for the math research landscape.

30 likes, 1 reply, 1 repost