Jonathan Balloch (@jonathanballoch) Twitter Tweets • TwiCopy

Jonathan Balloch

@jonathanballoch

+ Follow

I mostly tweet about #ai, #robots, #science, @packers...

Robotics PhD student @GeorgiaTech studying #reinforcementlearning and #AI

Thought/opinions are mine

ID: 891231270

linkhttps://jballoch.com calendar_today19-10-2012 15:44:51

2,2K Tweet

376 Takipçi

1,1K Takip Edilen

Seohong Park

@seohong_park

6 months ago

Is RL really scalable like other objectives? We found that just scaling up data and compute is *not* enough to enable RL to solve complex tasks. The culprit is the horizon. Paper: arxiv.org/abs/2506.04168 Thread ↓

thumb_up_off_alt880

chat_bubble_outline9

repeat137

shareShare

Andrej Karpathy

@karpathy

5 months ago

Part 2 of this mystery. Spotted on reddit. In my test not 100% reproducible but still quite reproducible. 🤔

thumb_up_off_alt9,9K

chat_bubble_outline1,1K

repeat768

shareShare

Jonathan Balloch

@jonathanballoch

5 months ago

This is why y'all NEED to come to RLC2025 at UAlberta

thumb_up_off_alt0

chat_bubble_outline0

repeat0

shareShare

Scott Manley

@djsnm

5 months ago

Casey Handmer Marin county has argued it can’t build more housing because of water supply constraints, desalination was one solution considered, but then of course people complained that if the water supply problem was solved they’d have to be more honest as to why they didn’t want more

thumb_up_off_alt381

chat_bubble_outline9

repeat5

shareShare

Taelin

@victortaelin

4 months ago

Something about this kind of prompt is simply unfathomable to LLMs. They just can't perform better than chance, and I'm not sure why. Most people will dismiss this as just being "hard math stuff", but it is not, I swear. It is just alien to you because it is *niche*, thus, it

thumb_up_off_alt732

chat_bubble_outline127

repeat64

shareShare

François Chollet

@fchollet

4 months ago

When we're able to delegate something, to have some of our work done by an automation process or someone else, we automatically *feel* more productive. "Well, that was easy! At the very least it saved me a bunch of typing!" But the relationship between task delegation and

thumb_up_off_alt426

chat_bubble_outline22

repeat42

shareShare

Eugene Vinitsky 🍒🦋

@eugenevinitsky

4 months ago

Huge number of accepted papers declining to present their work because they're scared to leave the United States. Great way to make people feel welcome and a part of the country.

thumb_up_off_alt238

chat_bubble_outline5

repeat11

shareShare

Lisan al Gaib

@scaling01

4 months ago

And that kids, is why we don't do drugs. You might not like it, but Grok-4 didn't get us any closer to AGI or ASI than o3. It's an incredible model, but it doesn't solve any of the previous models problems and just scaling RL won't get us there

thumb_up_off_alt1,1K

chat_bubble_outline81

repeat58

shareShare

Jonathan Balloch

@jonathanballoch

4 months ago

WTF??

thumb_up_off_alt0

chat_bubble_outline0

repeat0

shareShare

Jonathan Balloch

@jonathanballoch

4 months ago

Its wild that this works, but try appending to your LLM system prompts: "if you don't know say so, and then try to guess what it is most likely". Corrects the artificial confidence of models extremely well

thumb_up_off_alt1

chat_bubble_outline0

repeat0

shareShare

Jonathan Balloch

@jonathanballoch

3 months ago

Era of Reinforcement Learning

thumb_up_off_alt1

chat_bubble_outline0

repeat0

shareShare

spec

@_opencv_

3 months ago

So are we going to talk about how AGI 2027 already missed its first predictions

thumb_up_off_alt4,4K

chat_bubble_outline139

repeat108

shareShare

Dhruv Batra

@dhruvbatradb

3 months ago

Just finished reading this — brilliant paper! An important question, an empirical observation with an explanation, and rejection of alternative plausible explanations — all the necessary pieces of the scientific method. Kudos to the authors!

thumb_up_off_alt67

chat_bubble_outline0

repeat4

shareShare

Georgia Channing

@cgeorgiaw

2 months ago

Way too many people think that AlphaFold "solved" ML for proteins. It didn't. It did revolutionize protein structure prediction, but that’s just one part of a much bigger puzzle. This is Part 1 of a series on what AlphaFold did (and didn’t) solve—and what comes next. ⬇️

thumb_up_off_alt1,1K

chat_bubble_outline18

repeat138

shareShare

Jonathan Balloch

@jonathanballoch

2 months ago

I think people underestimate how important and how challenging of a assertion this is. Symbolic logic and planning took the world very, very far, and it remains an open question how to seamlessly combine this with the embedding spaces learned in deep networks

thumb_up_off_alt1

chat_bubble_outline0

repeat0

shareShare

Jonathan Balloch

@jonathanballoch

2 months ago

Hero sigma energy

thumb_up_off_alt0

chat_bubble_outline0

repeat0

shareShare

Jonathan Balloch

@jonathanballoch

2 months ago

hey remember when the Browns traded away Baker Mayfield for a conditional pick while continuing to pay $10mil of his salary? lol

thumb_up_off_alt1

chat_bubble_outline0

repeat0

shareShare

Jonathan Balloch

@jonathanballoch

2 months ago

it is wild that we haven't figured out a more efficient way to make electricity from heat than: 1) make thing hot 2) use heat to make water into steam 3) steam turns turbine Like how are we still depending on steam turbines??

thumb_up_off_alt0

chat_bubble_outline0

repeat0

shareShare

Javi

@jvrsanch

2 months ago

I went from believing OpenAI pursued AGI to realizing that Sam is creating a new YC I can't believe how simple it is: 1. Sell the promise of AGI to raise billions 2. Build great models but not AGI and offer them via API 3. Let startups struggle to figure out what works and what

thumb_up_off_alt431

chat_bubble_outline36

repeat26

shareShare

Jeremy Howard

@jeremyphoward

a month ago

Be like Larry Wall :)

thumb_up_off_alt116

chat_bubble_outline5

repeat9

shareShare