Tim Scarfe (@ecsquendor)'s Twitter Profile
Tim Scarfe

@ecsquendor

CTO @XRAIGlass. Ex-Principal ML engineer @Microsoft. Ph.D in machine learning. CEO @MLStreetTalk pod

ID: 2282571910

Link: https://www.youtube.com/c/MachineLearningStreetTalk

Joined: 14-01-2014 23:39:44

1.1K Tweets

8.8K Followers

1.1K Following

Kenneth Stanley (@kenneth0stanley)

It’s become popular these days in AI to celebrate that we now have “two dimensions of scaling!” But having more than one dimension only highlights that intelligence is multidimensional and the number of dimensions is not necessarily only 2 either.

Petar Veličković (@petarv_93)

A great interview of Federico Barbero by Tim Scarfe (for Machine Learning Street Talk), discussing our NeurIPS'24 paper. Check it out to learn more about why Transformers need Glasses! 👓 youtube.com/watch?v=FAspMn…

Stanislav Fort (@stanislavfort)

Isn't the Strong Model Collapse paper basically impossible to be correct since synthetic data is a huge part of frontier model training already?

> results show that even the smallest fraction of synthetic data (e.g., as little as 1% [...]) can still lead to model collapse

???
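
To make the claim being questioned concrete, here is a toy sketch of recursive training with a small synthetic fraction; it is an illustrative assumption of this write-up, not the paper's actual setup. Each generation refits a Gaussian to a training set that is 99% fresh real data and 1% samples drawn from the previous generation's fitted model.

```python
# Toy sketch of the "small synthetic fraction" setting under debate:
# each generation refits a Gaussian on a mix of real data and samples
# drawn from the previous generation's model. Illustrative only.
import numpy as np

rng = np.random.default_rng(0)
TRUE_MU, TRUE_SIGMA = 0.0, 1.0
N, GENERATIONS, P_SYNTH = 1_000, 20, 0.01   # 1% synthetic data per generation

mu, sigma = TRUE_MU, TRUE_SIGMA             # generation-0 "model"
for gen in range(1, GENERATIONS + 1):
    n_synth = int(P_SYNTH * N)
    real = rng.normal(TRUE_MU, TRUE_SIGMA, size=N - n_synth)
    synth = rng.normal(mu, sigma, size=n_synth)   # data from the previous model
    data = np.concatenate([real, synth])
    mu, sigma = data.mean(), data.std()           # refit the "model"
    print(f"gen {gen:2d}: mu={mu:+.3f}  sigma={sigma:.3f}")
```

In this toy, a 1% fraction barely perturbs the estimates over 20 generations; the paper's claim concerns much richer models and regimes than this illustration.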

Transluce (@transluceai)

We tested a pre-release version of o3 and found that it frequently fabricates actions it never took, and then elaborately justifies these actions when confronted.

We were surprised, so we dug deeper 🔎🧵(1/)

x.com/OpenAI/status/…

Kenneth Stanley (@kenneth0stanley)

Awesome to see a keynote on open-endedness at #ICLR - way to go Tim Rocktäschel! You have the right message at the right time and I appreciate the callout in the abstract. I wish I was there to see this. Open-endedness is the next frontier for AI as the benchmark race loses its allure.

Sara Hooker (@sarahookr)

It is critical for scientific integrity that we trust our measure of progress.

lmarena.ai has become the go-to evaluation for AI progress.

Our release today demonstrates the difficulty in maintaining fair evaluations on lmarena.ai, despite best intentions.

Maxwell Ramstead (@mjdramstead)

To all those interested in the free energy principle and active inference: I'm thrilled to announce that I will be hosting a monthly Ask Me Anything (AMA) session on the free energy principle, active inference, and Bayesian mechanics. The event will be open to all 1/2

Sara Hooker (@sarahookr)

Following the release of our recent work, we have spent considerable time engaging with lmarena.ai over the last week. The organizers had concerns about the correctness of our work on the reliability of chatbot arena rankings.

Neel Nanda (@neelnanda5)

After supervising 20+ papers, I have highly opinionated views on writing great ML papers. When I entered the field I found this all frustratingly opaque

So I wrote a guide on turning research into high-quality papers with scientific integrity! Hopefully still useful for NeurIPS

Sakana AI (@sakanaailabs)

Introducing The Darwin Gödel Machine: AI that improves itself by rewriting its own code

sakana.ai/dgm

The Darwin Gödel Machine (DGM) is a self-improving agent that can modify its own code. Inspired by evolution, we maintain an expanding lineage of agent variants,
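
The tweet describes the shape of the method: keep an expanding archive (lineage) of agent variants, sample a parent, let it propose an edit to its own code, score the child, and add it back to the archive. Below is a minimal hedged sketch of that loop, not Sakana's implementation; `propose_self_edit` and `evaluate` are hypothetical stand-ins for an LLM-driven code editor and a coding benchmark.

```python
# Minimal sketch of a Darwin-Gödel-Machine-style loop as described above:
# an ever-expanding archive of agent variants, where each child is produced
# by modifying a parent's own source code. Placeholders throughout.
import random
from dataclasses import dataclass

@dataclass
class Agent:
    source: str    # the agent's own code
    score: float   # benchmark performance

def propose_self_edit(source: str) -> str:
    """Placeholder: in the real system, the agent itself rewrites its code."""
    return source + f"\n# self-modification {random.randint(0, 9999)}"

def evaluate(source: str) -> float:
    """Placeholder: run the agent on held-out coding tasks, return a score."""
    return random.random()

seed = "# seed agent"
archive = [Agent(source=seed, score=evaluate(seed))]

for step in range(100):
    parent = random.choice(archive)                # open-ended: any ancestor can be a parent
    child_src = propose_self_edit(parent.source)
    archive.append(Agent(source=child_src, score=evaluate(child_src)))  # lineage only grows

best = max(archive, key=lambda a: a.score)
print(f"{len(archive)} variants explored; best score {best.score:.3f}")
```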

François Chollet (@fchollet)

Engineering is rarely the application of a well-understood theory. Most of the time it's a two-way dialogue, forcing theory to become more robust, more nuanced, or even to be discarded and rebuilt. But sometimes there's no theory at all, just a bag of poorly understood tricks

vitrupo (@vitrupo)

Terence Tao says today's AIs pass the eye test -- but fail miserably on the smell test. They generate proofs that look flawless. But the mistakes are subtle, and strangely inhuman. “There's a metaphorical mathematical smell.. it's not clear how to get AI to duplicate that.”

Melanie Mitchell (@melmitchell1)

New paper: "Large Language Models & Emergence: A Complex Systems Perspective" (D. Krakauer, J. Krakauer, M. Mitchell). We look at claims of "emergent capabilities" & "emergent intelligence" in LLMs from perspective of what emergence means in complexity science. ⬇️

Ndea (@ndea)

New robotics paper that combines symbolic search + neural learning to build compositional models that generalize to new tasks. A neural grammar for a planning programming language.

Andrew Ilyas (@andrew_ilyas)

“How will my model behave if I change the training data?”

Recent(-ish) work w/ Logan Engstrom: we nearly *perfectly* predict ML model behavior as a function of training data, saturating benchmarks for this problem (called “data attribution”).
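
The "data attribution" framing asks how a model's output would change if it were trained on a different subset of the data. One common surrogate approach in this literature fits a linear map from training-example inclusion indicators to the model's output on a fixed target example; the sketch below illustrates that framing only and is not the authors' method. The toy logistic-regression base model, the subset-sampling scheme, and all names are assumptions for illustration.

```python
# Toy sketch of data attribution via a linear surrogate: learn a map from
# "which training examples were included" to the model's output on a fixed
# target example. Illustrative only; not the authors' actual method.
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X @ rng.normal(size=5) + 0.3 * rng.normal(size=200) > 0).astype(int)
x_target = rng.normal(size=(1, 5))        # the example whose prediction we attribute

def train_and_predict(mask: np.ndarray) -> float:
    """Train on the masked-in subset, return P(y=1) for the target example."""
    clf = LogisticRegression(max_iter=1000).fit(X[mask], y[mask])
    return clf.predict_proba(x_target)[0, 1]

# Collect (subset, output) pairs from many random ~50% subsets of the data.
masks, outputs = [], []
for _ in range(300):
    mask = rng.random(len(X)) < 0.5
    if mask.sum() > 10 and len(set(y[mask])) == 2:   # need both classes present
        masks.append(mask.astype(float))
        outputs.append(train_and_predict(mask))

# Linear surrogate: inclusion indicators -> model output on the target example.
surrogate = LinearRegression().fit(np.array(masks), np.array(outputs))
print("per-example influence estimates (first 5):", surrogate.coef_[:5].round(3))
```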