Guillermo Barbadillo (@guille_bar) 's Twitter Profile
Guillermo Barbadillo

@guille_bar

In a quest to understand intelligence

Talking about AI in Spanish on TERTULia: ironbar.github.io/tertulia_intel…

ID: 961945656544948225

Link: https://www.linkedin.com/in/guillermobarbadillo/
Joined: 09-02-2018 12:51:31

452 Tweets

1.1K Followers

237 Following

Kyle Corbitt (@corbtt) 's Twitter Profile Photo

Big news: we've figured out how to make a *universal* reward function that lets you apply RL to any agent with:
 - no labeled data
 - no hand-crafted reward functions
 - no human feedback!
A 🧵 on RULER
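A minimal sketch of the general idea as described in the thread: an LLM judge scores a group of rollouts for the same task relative to each other, and those scores are used as the RL reward. The `call_llm` helper and the prompt below are hypothetical placeholders, not the actual RULER API.

```python
import json

def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for any chat-completion call."""
    raise NotImplementedError

def score_rollouts(task: str, rollouts: list[str]) -> list[float]:
    """Ask an LLM judge to score a group of agent rollouts for the same task.

    No labels, hand-crafted reward, or human feedback: the judge only compares
    the rollouts against each other and returns a score in [0, 1] for each.
    """
    prompt = (
        f"Task: {task}\n\n"
        + "\n\n".join(f"Rollout {i}:\n{r}" for i, r in enumerate(rollouts))
        + "\n\nScore each rollout from 0 to 1 by how well it completes the task. "
        "Answer with a JSON list of numbers, one per rollout."
    )
    return [float(s) for s in json.loads(call_llm(prompt))]

# The scores can then be used as group-relative rewards, e.g. in a GRPO-style update.
```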

Guillermo Barbadillo (@guille_bar) 's Twitter Profile Photo

Nice paper that in my opinion goes in the right direction to solve ARC. It generates python code to tackle the ARC tasks and combines search and learning in a virtuous cycle. I have summarized the results in the following plot.

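A rough sketch of that search-and-learning cycle, under my own assumptions about how the paper works (the model methods `sample_program` and `finetune` are hypothetical): sample candidate Python programs, keep the ones that reproduce every demonstration pair, and use the verified programs as fine-tuning data for the next round.

```python
def compile_program(code: str):
    """Exec a candidate program and return its transform() function, or None if invalid."""
    namespace = {}
    try:
        exec(code, namespace)
        return namespace.get("transform")
    except Exception:
        return None

def solve_task(task: dict, model, n_candidates: int = 128) -> list[str]:
    """Search half: sample candidate programs and keep those that fit every demo pair."""
    demos = task["train"]  # list of {"input": grid, "output": grid} pairs
    solutions = []
    for _ in range(n_candidates):
        code = model.sample_program(demos)   # hypothetical: LLM writes a transform()
        transform = compile_program(code)
        try:
            if transform and all(transform(d["input"]) == d["output"] for d in demos):
                solutions.append(code)
        except Exception:
            pass
    return solutions

def search_and_learn(tasks: list[dict], model, rounds: int = 3):
    """Learning half: verified programs become fine-tuning data for the next round."""
    for _ in range(rounds):
        verified = {t["id"]: sols for t in tasks if (sols := solve_task(t, model))}
        model.finetune(verified)             # hypothetical fine-tuning call
    return model
```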
Guillermo Barbadillo (@guille_bar) 's Twitter Profile Photo

As far as I understand, this is another case of test-time training, since they use example pairs from both the training and evaluation sets. I'm not sure whether the hierarchical architecture is necessary, or whether we could get similar results with other models.
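For context, a minimal sketch of test-time training as described here (illustrative only; `train_step` and `predict` are hypothetical model methods): before predicting the test output, a copy of the model is briefly fine-tuned on that task's own example pairs.

```python
import copy

def predict_with_ttt(base_model, task, steps: int = 50):
    """Test-time training: adapt a copy of the model on this task's demo pairs,
    then predict the test outputs with the adapted copy."""
    model = copy.deepcopy(base_model)       # keep the base weights untouched
    demos = task["train"]                   # {"input": grid, "output": grid} pairs
    for _ in range(steps):
        for d in demos:
            model.train_step(d["input"], d["output"])   # hypothetical gradient step
    return [model.predict(t["input"]) for t in task["test"]]
```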

Guillermo Barbadillo (@guille_bar) 's Twitter Profile Photo

Giotto.ai is the first team to score above 20% on the ARC25 challenge. Congratulations! We're still far from the 85% goal, but there's time left since the competition ends in November.

Cristóbal Valenzuela (@c_valenzuelab) 's Twitter Profile Photo

Really nice demo of what Runway Aleph can do for complex changes in environments while adding accurate dynamic elements like snow on the shoulders or splashing water as the characters move.

Peter Gostev (@petergostev) 's Twitter Profile Photo

I quite like how well ARC Prize shows the distribution of GPT-5 variant capabilities, from 1.5% (GPT-5 Nano, Minimal) to 65.7% (GPT-5 High).

Some other things that seem interesting:
 - 'Thinking' really matters for GPT-5: 6% for 'Minimal' to 65.7% for 'High'. The difference is
Guillermo Barbadillo (@guille_bar) 's Twitter Profile Photo

Is a masked diffusion model one of the secrets behind the best score on ARC-AGI-2 so far? Or are they just trolling us :)

Diffusion models might have an edge over autoregressive ones since they can capture a more global view of the grids.
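To illustrate the "global view" point: a masked diffusion model predicts every masked cell in parallel conditioned on the whole grid and re-masks the least confident ones for the next step, while an autoregressive model only conditions on the cells generated so far. A toy sketch, where `predict` is a hypothetical model call returning per-cell values and confidences:

```python
import numpy as np

def masked_diffusion_decode(grid: np.ndarray, predict, steps: int = 8):
    """Toy iterative demasking: `grid` holds colour ids with -1 for masked cells.
    `predict(grid)` returns (values, confidences) for every cell, always
    conditioned on the whole grid rather than a left-to-right prefix."""
    grid = grid.copy()
    for step in range(steps, 0, -1):
        masked = grid == -1
        if not masked.any():
            break
        values, conf = predict(grid)               # global conditioning
        conf = np.where(masked, conf, -np.inf)     # only consider masked cells
        k = max(1, int(masked.sum() / step))       # unmask a fraction each step
        flat = np.argsort(conf, axis=None)[-k:]    # most confident masked cells
        rows, cols = np.unravel_index(flat, grid.shape)
        grid[rows, cols] = values[rows, cols]
    return grid
```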
Luma AI (@lumalabsai) 's Twitter Profile Photo

This is Ray3. The world’s first reasoning video model, and the first to generate studio-grade HDR. Now with an all-new Draft Mode for rapid iteration in creative workflows, and state-of-the-art physics and consistency. Available now for free in Dream Machine.