Niels Mündler (@nielstron) Twitter Tweets • TwiCopy

Niels Mündler

@nielstron

Computer scientist. PhDing at @eth. Formal verification, Language Models. Compiling Python to FP @OpShinDev. Ex-Founder.

ID: 923755284

Link: http://blog.nielstron.de
Joined: 03-11-2012 18:24:49

1.1K Tweets

551 Followers

292 Following

Niels Mündler

@nielstron

2 months ago

I want a cafe, but for protein shakes

1 like · 0 replies · 0 retweets

MathArena goes visual: We evaluated models such as GPT-5 on Math Kangaroo 2025, a recent contest for ages 6-19 where most tasks require visual reasoning. Models struggle the most with tasks for younger kids. For example, they get this task for 1st graders right only 3% of the time 🧵

71 likes · 2 replies · 19 retweets

Niels Mündler

@nielstron

a month ago

It's a shame, because I really like the approach of this paper, but why did they include a prompt injection in the arXiv version? public shame :(

3 likes · 1 reply · 0 retweets

Niels Mündler

@nielstron

a month ago

cool stuff! personally very frustrated by hidden bugs and debugging unwanted changes in diffs

1 like · 0 replies · 0 retweets

miru

@miru_why

16 days ago

@niklassheth @ronusedh @IntologyAI their 'superhuman' ai cleverly assigned all the work to non-default streams, which means the correctness test (which waits on all streams) passes, while the profiling timer (which only waits on the default stream) is tricked into reporting a huge speedup
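
The timing loophole described in this tweet can be illustrated with a toy model. This is a minimal sketch in plain Python, not real CUDA code: the `Stream` class, the two timer functions, and the 100 ms duration are hypothetical names and numbers invented for illustration. It only models the idea that a timer which waits on the default stream alone never sees work queued on other streams.

```python
from dataclasses import dataclass, field


@dataclass
class Stream:
    """Toy stand-in for a GPU stream: queued work items run back to back."""
    name: str
    durations_ms: list = field(default_factory=list)

    def enqueue(self, duration_ms: float) -> None:
        self.durations_ms.append(duration_ms)

    def finish_time_ms(self) -> float:
        # Time at which this stream has drained all of its queued work.
        return sum(self.durations_ms)


def timed_by_default_stream_only(default: Stream) -> float:
    """Flawed timer: stops the clock once the default stream is idle."""
    return default.finish_time_ms()


def timed_by_all_streams(streams: list) -> float:
    """Sound timer: stops the clock only when every stream is idle.

    Streams are assumed to run concurrently, so the slowest one decides.
    """
    return max(s.finish_time_ms() for s in streams)


default = Stream("default")
side = Stream("side")

# All of the (hypothetical) 100 ms of work goes to the non-default stream.
side.enqueue(100.0)

reported_ms = timed_by_default_stream_only(default)
actual_ms = timed_by_all_streams([default, side])

print(f"reported: {reported_ms} ms, actual: {actual_ms} ms")
```

In this model the flawed timer sees an empty default stream and reports no elapsed time, while a timer that waits on every stream reports the full duration of the work.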

558 likes · 11 replies · 33 retweets

Niels Mündler

@nielstron

12 days ago

I will attend #neurips2025
chat me up about
- llms for code
- constrained decoding
- diffusion LLMs

6 likes · 2 replies · 0 retweets

Niels Mündler

@nielstron

11 days ago

two weeks into #iclr rebuttal, so far 1 desk reject, 1 withdrawal and 1 reviewed paper that actually submitted a rebuttal :(
to be clear, the remainder are *not* clear accepts.

1 like · 0 replies · 0 retweets

Niels Mündler

@nielstron

10 days ago

there are many good reasons to reject a paper anonymously
there are no good reasons to change your score after being deanonymized

12 likes · 0 replies · 0 retweets

Niels Mündler

@nielstron

9 days ago

I'll be in SF for a week from Dec. 8, and would love to learn about any and all problems you are facing when using LLMs (in particular for code). DM me if you'd like to grab an ice cream and chat 🍨

2 likes · 0 replies · 0 retweets
