Core Francisco Park (@corefpark)'s Twitter Profile
Core Francisco Park

@corefpark

Physics of Intelligence @ Harvard Physics.

Currently working on: Agents

ID: 1660722451414654977

Link: http://cfpark00.github.io · Joined: 22-05-2023 19:01:57

144 Tweets

725 Followers

511 Following

Puneesh Deora (@puneeshdeora)'s Twitter Profile Photo

🚨 New paper drop! 🚨 🤔 When a transformer sees a sequence that could be explained by many rules, which rule does it pick? It chooses the simplest sufficient one! 🧵👇

Daniel Wurgaft (@danielwurgaft)'s Twitter Profile Photo

🚨New paper! We know models learn distinct in-context learning strategies, but *why*? Why generalize instead of memorize to lower loss? And why is generalization transient? Our work explains this & *predicts Transformer behavior throughout training* without its weights! 🧵 1/

Michael Albergo (@msalbergo)'s Twitter Profile Photo

Dear NeurIPS Conference -- it seems OpenReview is down entirely, and we cannot submit reviews for tonight's review deadline. Please share if you are having a similar issue. #neurips2025

Fenil Doshi (@fenildoshi009)'s Twitter Profile Photo

🧵 What if two images have the same local parts but represent different global shapes purely through part arrangement? Humans can spot the difference instantly! The question is can vision models do the same? 1/15

K Srinivas Rao (@sriniously)'s Twitter Profile Photo

The real developer moat isn't coding anymore. LLMs can pump out functions faster than most of us can type. The moat is in the spaces between the code. It's knowing why your database is slow when the logs show nothing obvious. It's understanding that the "simple" feature request

Core Francisco Park (@corefpark)'s Twitter Profile Photo

- 8000 USD / Mtok
- Input: 10 tok/s
- Output: 2 tok/s
- Latency: 10 mins ~ 2 weeks
- 12h downtime per day

Integrating this agent into a multi-agent system is challenging......

Yongyi Yang (@yongyiyang7)'s Twitter Profile Photo

What drives in-context learning in LLMs? New paper: Provable Low-Frequency Bias of In-Context Learning of Representations. We show LLMs have a low-frequency bias when learning representations in context, offering a theoretical answer to several previously open questions. 🧵👇

Ekdeep Singh Lubana (@ekdeepl)'s Twitter Profile Photo

Super excited to be joining Goodfire! I'll be scaling up the line of work our group started at Harvard: making predictive accounts of model representations by assuming a model behaves optimally (i.e., good old rational analysis from cogsci!)

Cas (Stephen Casper) (@stephenlcasper)'s Twitter Profile Photo

Pontificating about a system's 'intentions' doesn't shed any light on the technical problem of eliciting its capabilities. It just confuses people in a characteristically AI-safety-community way.

Prime Intellect (@primeintellect)'s Twitter Profile Photo

Introducing the Environments Hub.

RL environments are the key bottleneck to the next wave of AI progress, but big labs are locking them down.

We built a community platform for crowdsourcing open environments, so anyone can contribute to open-source AGI.