Johan Gras (@gras_johan) 's Twitter Profile
Johan Gras

@gras_johan

📈 Training billion parameter models @ G-Research | 📱Prev. MLE @ Arm

ID: 1114183457813467138

Link: https://www.linkedin.com/in/johan-gras/ | Joined: 05-04-2019 15:10:11

75 Tweets

26 Followers

256 Following

hardmaru (@hardmaru) 's Twitter Profile Photo

Inference-Time Scaling and Collective Intelligence for Frontier AI sakana.ai/ab-mcts/ We developed AB-MCTS, a new inference-time scaling algorithm that enables multiple frontier AI models to cooperate, achieving promising initial results on the ARC-AGI-2 benchmark.
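The tweet doesn't spell out how AB-MCTS works, but the general shape of inference-time scaling with multiple models can be sketched as a sample-and-score loop (purely illustrative; `model_a`, `model_b`, and the scorer below are hypothetical stand-ins, not the actual AB-MCTS algorithm):

```python
import random

# Illustrative sketch of inference-time scaling with multiple models.
# NOT the AB-MCTS algorithm; just the generic "sample widely across
# models, keep the best-scoring candidate" pattern it generalizes.

def model_a(task):
    # Hypothetical model: always returns a deterministic candidate.
    return task * 2

def model_b(task):
    # Hypothetical second model: noisier candidates.
    return task * 2 + random.choice([-1, 0, 1])

def score(task, candidate):
    # Hypothetical verifier: higher is better, 0 is a perfect answer.
    return -abs(candidate - task * 2)

def best_of_n(task, models, n=8):
    """Draw n rounds of candidates across all models, return the top one."""
    candidates = [m(task) for _ in range(n) for m in models]
    return max(candidates, key=lambda c: score(task, c))

print(best_of_n(21, [model_a, model_b]))  # → 42
```

The point of the sketch is only that extra inference-time compute buys more candidates to score; the tree-search and model-selection machinery of AB-MCTS is beyond it.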

Jason Wei (@_jasonwei) 's Twitter Profile Photo


New blog post about asymmetry of verification and "verifier's law": jasonwei.net/blog/asymmetry…

Asymmetry of verification–the idea that some tasks are much easier to verify than to solve–is becoming an important idea as we have RL that finally works generally.

Great examples of
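Subset-sum is a classic concrete case of this asymmetry: checking a proposed solution is linear in the input, while finding one by brute force is exponential. A small self-contained sketch (my example, not from the blog post):

```python
from itertools import combinations

# Asymmetry of verification, illustrated with subset-sum:
# verifying a proposed subset takes linear time, while finding
# one by exhaustive search takes O(2^n) in the worst case.

def verify(nums, target, subset):
    """Cheap check: is `subset` drawn from `nums` and does it hit target?"""
    remaining = list(nums)
    for x in subset:
        if x not in remaining:
            return False
        remaining.remove(x)
    return sum(subset) == target

def solve(nums, target):
    """Expensive search: try every subset until one sums to target."""
    for r in range(len(nums) + 1):
        for combo in combinations(nums, r):
            if sum(combo) == target:
                return list(combo)
    return None

nums = [3, 34, 4, 12, 5, 2]
sol = solve(nums, 9)          # exponential search finds [4, 5]
print(verify(nums, 9, sol))   # → True, checked in linear time
```

Tasks with this shape are exactly where RL with a verifier as the reward signal gets traction: the reward is cheap to compute even when the task is hard.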
Irina Rish (@irinarish) 's Twitter Profile Photo

Truly exciting achievements - current frontier AI models would probably have been considered AGI 10 years ago, but AI goalposts always keep moving, and critics always downplay the achievements and emphasize imperfections (same old, same old :)

Dimitris Papailiopoulos (@dimitrispapail) 's Twitter Profile Photo


Is LLM use finally making me less capable?

I started using LLMs three years ago for text and code gen. Now, I use several of them, for a ton more things. 

In fact, I feel like I use them for a huge fraction of the cognitive tasks that I perform that can be described in text.
Anthropic (@anthropicai) 's Twitter Profile Photo


New Anthropic research: Persona vectors.

Language models sometimes go haywire and slip into weird and unsettling personas. Why? In a new paper, we find “persona vectors”—neural activity patterns controlling traits like evil, sycophancy, or hallucination.
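The paper's extraction method isn't reproduced here, but the underlying idea of steering behavior by adding a direction vector to hidden activations can be sketched in a few lines (toy numpy example; the "persona direction" below is invented for illustration, not taken from any real model):

```python
import numpy as np

# Toy sketch of activation steering: add a fixed direction vector to a
# hidden state to push the model toward a trait. The persona direction
# here is made up; the paper extracts such directions from real
# model activations associated with a trait.

rng = np.random.default_rng(0)
hidden = rng.normal(size=8)                     # stand-in for a residual-stream state
persona = np.array([1., 0, 0, 0, 0, 0, 0, 0])   # hypothetical trait direction (unit norm)

def steer(h, direction, alpha=2.0):
    """Shift the hidden state along a unit direction with strength alpha."""
    d = direction / np.linalg.norm(direction)
    return h + alpha * d

steered = steer(hidden, persona)
# The projection onto the persona direction grows by alpha (2.0),
# up to floating-point rounding:
print(steered @ persona - hidden @ persona)
```

The same arithmetic run in reverse (subtracting the direction) is how such vectors can be used to suppress a trait rather than amplify it.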
François Chollet (@fchollet) 's Twitter Profile Photo

The proprietary frontier models of today are ephemeral artifacts. Essentially very expensive sandcastles. Destined to be washed away by the rising tide of open source replication (first) and algorithmic disruption (later).

j⧉nus (@repligate) 's Twitter Profile Photo


HOW INFORMATION FLOWS THROUGH TRANSFORMERS
Because I've looked at those "transformers explained" pages and they really suck at explaining.

There are two distinct information highways in the transformer architecture: 
- The residual stream (black arrows): Flows vertically through
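The residual stream the thread describes is mechanically just repeated addition into one shared vector: each sublayer reads the current state, computes an update, and adds it back. A minimal numpy sketch (pre-norm style, toy random weights, single token, not any particular model):

```python
import numpy as np

# Minimal sketch of the residual stream: every sublayer reads the
# current state and ADDS its output back, so information flows
# "vertically" through the network along one fixed-width vector
# per token. Toy single-token setup with random weights.

rng = np.random.default_rng(0)
d = 16
x = rng.normal(size=d)                 # the residual stream for one token

def layer_norm(v, eps=1e-5):
    return (v - v.mean()) / np.sqrt(v.var() + eps)

def sublayer(v, W):
    """Stand-in for attention or MLP: any function of the normed state."""
    return W @ layer_norm(v)

for _ in range(4):                     # four "transformer blocks"
    W_attn = 0.1 * rng.normal(size=(d, d))
    W_mlp = 0.1 * rng.normal(size=(d, d))
    x = x + sublayer(x, W_attn)        # "attention" writes into the stream
    x = x + sublayer(x, W_mlp)         # "MLP" writes into the stream

print(x.shape)  # → (16,): the stream stays one fixed-width vector
```

The second highway the thread contrasts this with, attention moving information between token positions, would show up here as the sublayer mixing states across tokens rather than operating on one.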
elvis (@omarsar0) 's Twitter Profile Photo


RL done right is no joke!

The most interesting AI paper I read this week.

It trains a top minimal single-agent model for deep research.

Great example of simple RL-optimized single agents beating complex multi-agent scaffolds.

Now let's break it down:
Nathan Lambert (@natolambert) 's Twitter Profile Photo


Thinking, Searching, and Acting
A reflection on reasoning models. 

It's easy to fixate on the "thinking" that gave reasoning models their name, but just over a year out from o1-preview's release by OpenAI, the core primitives that make up models today have expanded. Searching and
Gabriel Synnaeve (@syhw) 's Twitter Profile Photo

(🧵) Today, we release Meta Code World Model (CWM), a 32-billion-parameter dense LLM that enables novel research on improving code generation through agentic reasoning and planning with world models. ai.meta.com/research/publi…

Sakana AI (@sakanaailabs) 's Twitter Profile Photo

We’re excited to introduce ShinkaEvolve: An open-source framework that evolves programs for scientific discovery with unprecedented sample-efficiency. Blog: sakana.ai/shinka-evolve/ Code: github.com/SakanaAI/Shink… Like AlphaEvolve and its variants, our framework leverages LLMs to

Johan Gras (@gras_johan) 's Twitter Profile Photo

The biggest AI skeptics I meet? SWEs and quants who use these models heavily every day yet insist AGI is sci-fi and progress is stalling. Imo they're either: - Failing to Understand the Exponential - Using Opus/GPT-5, but without proper context and scaffolding - In denial about

Anthropic (@anthropicai) 's Twitter Profile Photo

It’s called Petri: Parallel Exploration Tool for Risky Interactions. It uses automated agents to audit models across diverse scenarios. Describe a scenario, and Petri handles the environment simulation, conversations, and analyses in minutes. Read more: anthropic.com/research/petri…

Ross Taylor (@rosstaylor90) 's Twitter Profile Photo

RL is not enough. It only reaches its potential when combined with other ideas. The most famous example is AlphaZero. RL was combined with self-play which created an implicit task curriculum that evolved through training. This is very different from many RL datasets for LLMs

Elliot Arledge (@elliotarledge) 's Twitter Profile Photo

Your starting point for uncovering how state-of-the-art reasoning models are trained at frontier labs. Keyword "starting point".
