Ryan Smith (@rnsmith49) Twitter Tweets • TwiCopy

Abir HARRASSE

@aharrasse1906

3 months ago

Paper alert 🚨: LLMs build a shared multilingual latent space for meaning, decoding into languages only later. 🌍 Performance gaps come from tokenizer bias & weaker late-layer circuits, not missing concepts. We show this mechanistically with Cross-Layer Transcoders. 🧵👇

Likes: 55 · Replies: 3 · Reposts: 14

Ryan Smith

@rnsmith49

3 months ago

Help us build ARES, and maybe train some fun models too!

Likes: 2 · Replies: 0 · Reposts: 0

Martian

@withmartian

3 months ago

We’re excited for the first round of the Martian Interpretability Prize! The proposal deadline was this week, and we’ve had a ton of applications. Can’t wait to look through these and get back to everyone who applied. More rounds coming up 🙂

Likes: 19 · Replies: 3 · Reposts: 5

Josh Greaves

@joshua_gre63805

3 months ago

We'll be presenting the ARES roadmap at office hours tomorrow at 2pm PT. If you're interested in agents | RL | interp and want to contribute to open source, send me a DM for more info. github.com/withmartian/ar…

Likes: 31 · Replies: 0 · Reposts: 6

Shriyash Upadhyay

@shriyashku

3 months ago

Thinking of training an RL model specialized for OpenClaw🦞 using ARES. So it gets better performance and much lower token costs. Is this something folks would be interested in using? 100 likes and I put up an endpoint

Likes: 14 · Replies: 3 · Reposts: 2

Ryan Smith

@rnsmith49

3 months ago

Really great to see the tooling here converging on an interface!

Likes: 9 · Replies: 0 · Reposts: 0

Ryan Smith

@rnsmith49

3 months ago

ARES 🤝 Mech Interp: We’re excited that you can now easily dig into model internals across long-horizon tasks with ARES! We’ve built some nice integrations with TransformerLens, with even better support for hooks coming soon!

Likes: 9 · Replies: 1 · Reposts: 0

Ryan Smith

@rnsmith49

3 months ago

This is one of the biggest takeaways for me - model internals change over steps after interacting with the environment! I love this other figure Narmeen Oozeer made that shows this too - a matrix of cosine similarity between optimal steering vectors at each step:

Likes: 10 · Replies: 1 · Reposts: 0
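
For readers who haven't seen this kind of figure, here is a minimal sketch of how a step-by-step cosine similarity matrix can be computed. The toy `steering_vectors` values and the helper names `cosine_similarity` and `similarity_matrix` are made up for illustration; this is not the code behind the figure mentioned in the post, and it is not part of ARES.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similarity_matrix(vectors):
    """Pairwise cosine similarity; entry [i][j] compares step i with step j."""
    return [[cosine_similarity(u, v) for v in vectors] for u in vectors]


# Hypothetical toy steering vectors, one per agent step (made-up values).
steering_vectors = [
    [1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0],
    [0.0, 1.0, 0.0],
]

matrix = similarity_matrix(steering_vectors)
for row in matrix:
    print(" ".join(f"{value:.2f}" for value in row))
```

Values near 1 off the diagonal would mean the steering direction stays stable across steps, while values that fall off with distance from the diagonal would suggest the direction drifts as the agent interacts with its environment.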

Martian

@withmartian

3 months ago

A new ARES tutorial from Narmeen Oozeer: Getting started in long-horizon interp. When do agents fail to accurately model their environment? How do we fix them? And how can you run these experiments on your own machine?

Likes: 21 · Replies: 1 · Reposts: 3

Ryan Smith

@rnsmith49

3 months ago

Verification is easier than Generation - the same is true for Code, and I am really excited to see the pipeline of: Code Review getting more robust evals → better code review tools → better coding tools

Likes: 13 · Replies: 0 · Reposts: 0

Ryan Smith

@rnsmith49

2 months ago

Coding tools are getting way better, but also way more convincing to humans (or me at least). We definitely need to continue building robust evals for code review so we know we are keeping the "LGTM effect" in check

Likes: 10 · Replies: 1 · Reposts: 0

Shriyash Upadhyay

@shriyashku

2 months ago

Our code review tracker caught the release of Claude Code Review before Anthropic announced it. Greptile v4 hit #1 on CRB. The tracker caught it before their announcement. The data is predicting something new from @Devin Review in the near future. Here's how. 🧵

Likes: 21 · Replies: 3 · Reposts: 5

Ashley Zhang

@ashleyzhang110

2 months ago

I've been playing around with eval-ing AI code review tools at work. We track 22 different ones. Greptile V4 had the single biggest improvement I've ever measured. Recall increased 47% in relative terms, from 38.7% → 56.9%

Likes: 60 · Replies: 5 · Reposts: 9
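
A quick check of the recall figures quoted above, as a small sketch. The variable and function names are made up for illustration; only the two recall values come from the post. It separates the gain in percentage points from the relative gain, since the two are easy to confuse.

```python
def relative_change(before, after):
    """Fractional change from `before` to `after` (0.5 means +50%)."""
    return (after - before) / before


# Recall values quoted in the post, in percent.
recall_before = 38.7
recall_after = 56.9

absolute_gain = recall_after - recall_before  # in percentage points
relative_gain = relative_change(recall_before, recall_after)

print(f"absolute: {absolute_gain:.1f} points, relative: {relative_gain:.0%}")
```

So the quoted 47% matches the relative improvement, while the absolute gain is about 18 percentage points.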

Fazl Barez

@fazlbarez

2 months ago

If this policy is not revoked, I won’t be reviewing/ACing for #NeurIPS. Science requires open exchange of ideas! When participation gets shaped by geopolitics, it ends up reflecting power structures, not merit. It narrows what science can be, and powerful nations get full control!

Likes: 249 · Replies: 5 · Reposts: 19

Ryan Smith

@rnsmith49

2 months ago

Personally, I love the shift in the industry of optimizing for two separate workflows - fast and iterative, vs slow and offline. And I'm always a sucker for seeing data back up intuition 😅

Likes: 9 · Replies: 1 · Reposts: 0

Ryan Smith

@rnsmith49

a month ago

I unironically love the Lotka-Volterra equations, so yes I may spend my Wednesday morning diving into the code of an April Fools’ post

Likes: 11 · Replies: 0 · Reposts: 0

CodeRabbit

@coderabbitai

a month ago

Open Source Models you have been sleeping on! x.com/i/broadcasts/1…

Likes: 18 · Replies: 1 · Reposts: 4

Ryan Smith

@rnsmith49

a month ago

I've been so excited to see Mech Interp techniques actually generalize to improving real-world applications, and it feels like we're finally at that point!

Likes: 10 · Replies: 0 · Reposts: 0

Ryan Smith

@rnsmith49

a month ago

It's pretty wild to think that a year ago CLI-first code tools didn't really exist, and now here we are measuring the entire E2E effectiveness of LLMs in software development

Likes: 11 · Replies: 0 · Reposts: 0

CodeRabbit

@coderabbitai

24 days ago

Think OpenAI and Anthropic own the AI game? Open-source models already handle 50%+ of real production requests! New pod episode with our own Erfan & Shriyash Upadhyay from Martian on the DeepSeek moment, vendor lock-in risk, and the fine-tuning cheat code.

Likes: 35 · Replies: 2 · Reposts: 5