Scott Condron (@_scottcondron) 's Twitter Profile
Scott Condron

@_scottcondron

Helping build AI/ML dev tools at @weights_biases. I post about machine learning, data visualisation, and software tools.

ID: 982132042845401088

Link: https://www.scottcondron.com/ · Joined: 06-04-2018 05:45:00

2.2K Tweets

5.5K Followers

1.1K Following

W&B Weave (@weave_wb) 's Twitter Profile Photo

Your RL run just spiked at step 89! But do you know why? We’re fixing that. Today we’re launching W&B Weave Traces to give you a step-by-step look into your agent’s decisions. This is the first drop from our fresh new integration with OpenPipe. More RL magic is incoming.

Boris Dayma 🖍️ (@borisdayma) 's Twitter Profile Photo

Interesting Muon experiment 🤓 Learning rate of Adam parameters (embeddings/gains) does not matter so much (here from 1e-4 to 1e-2). It’s more about Muon LR

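As context for the split this experiment is probing: Muon is typically applied only to the 2D hidden weight matrices, while embeddings, gains/norm scales, and biases are handed to Adam(W) with their own learning rate. Below is a hedged sketch of that parameter-group split, assuming a Muon optimizer class with a PyTorch-style constructor; the import, exact signature, and hyperparameter values are illustrative assumptions, not Boris's setup.

import torch
from muon import Muon  # assumed import; depends on which Muon implementation you use

def build_optimizers(model: torch.nn.Module, muon_lr=0.02, adam_lr=3e-4):
    # Muon usually covers only the 2D hidden weight matrices; everything else
    # (embeddings, gains/norm scales, biases) is routed to AdamW.
    muon_params, adam_params = [], []
    for name, p in model.named_parameters():
        if not p.requires_grad:
            continue
        if p.ndim == 2 and "embed" not in name:
            muon_params.append(p)
        else:
            adam_params.append(p)
    return [
        Muon(muon_params, lr=muon_lr, momentum=0.95),  # assumed signature
        torch.optim.AdamW(adam_params, lr=adam_lr),    # the LR the tweet says matters less
    ]
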
Fred Jonsson (@enginoid) 's Twitter Profile Photo

Shawn Lewis, W&B Weave, Weights & Biases: have to say kudos on this. I added a bunch of weave.op decorators around the code to debug a perf issue in my GRPO rollouts, and I got flamegraphs! It's great for inspecting all the env interactions.
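For anyone wanting to reproduce this pattern, here is a minimal sketch using the weave Python SDK's weave.init and @weave.op decorator. The project name, environment step, and policy callable are hypothetical placeholders; the nested ops are what give you the per-call timing breakdown (the flamegraph-style trace view) across env interactions.

import weave

weave.init("grpo-rollout-debugging")  # hypothetical project name

@weave.op()  # each call is traced with inputs, outputs, and timing
def env_step(state: dict, action: int) -> dict:
    # ... real environment logic would go here ...
    return {"state": state, "reward": 0.0, "done": False}

@weave.op()
def rollout(policy, max_steps: int = 64) -> list[dict]:
    # Nested ops appear as a call tree in Weave, which is what surfaces
    # timing hot spots across the rollout's env interactions.
    state, transitions = {"t": 0}, []
    for _ in range(max_steps):
        action = policy(state)            # hypothetical policy callable
        result = env_step(state, action)  # traced child call
        transitions.append(result)
        if result["done"]:
            break
        state = result["state"]
    return transitions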

Scott Condron (@_scottcondron) 's Twitter Profile Photo

Great to see people trying out Weave traces in W&B model training runs to inspect agent rollouts. Would love to chat with more people trying this out and to hear about any other rollout visualizations you’d like to see on top of this.

Kyle Corbitt (@corbtt) 's Twitter Profile Photo

🚀 Big launch from OpenPipe: We just launched Serverless RL — train agents faster and cheaper with zero infra headaches.

Compared to running your own GPUs, Serverless RL is:
- 40% cheaper
- 28% faster wall‑clock
- instantly deployed to prod via Weights & Biases Inference

🚀 Big launch from <a href="/OpenPipeAI/">OpenPipe</a>: We just launched Serverless RL — train agents faster and cheaper with zero infra headaches.

Compared to running your own GPUs, Serverless RL is:
 - 40% cheaper
 - 28% faster wall‑clock
 - instantly deployed to prod via <a href="/weights_biases/">Weights & Biases</a> Inference
Weights & Biases (@weights_biases) 's Twitter Profile Photo

RL X-mas came early. 🎄 For too long, building powerful AI agents with Reinforcement Learning has been blocked by GPU scarcity and complex infrastructure. That ends today. Introducing Serverless RL from wandb, powered by CoreWeave! We're making RL accessible to all.

Scott Condron (@_scottcondron) 's Twitter Profile Photo

Evals versus no evals is a pretty silly debate; the answer is always "just enough evals." "Enough" means you’re getting a strong enough signal to make the next iteration worthwhile. If you can get enough signal by putting it in front of users and gathering implicit/explicit

Scott Condron (@_scottcondron) 's Twitter Profile Photo

I appreciate the "Research use cases" section of OpenAI's Apps SDK.

They've clearly learned that AI projects fail when teams don't:
- start with clear user goals
- prototype against real prompts
- align scope before building tools

It’s applicable to almost any AI app and a

Shreya Shankar (@sh_reya) 's Twitter Profile Photo

100% agree with Andrew on the unreasonable effectiveness of error analysis, and that it’s a bit different for GenAI and agents. Turns out there is a structured and well-established framework to help with error analysis in GenAI—grounded theory! Hamel and I go into detail and do

W&B Weave (@weave_wb) 's Twitter Profile Photo

Stop juggling tabs to test your prompts! 🥵 The W&B Weave Playground is your new home for iterating on and comparing LLMs. And did you know... you can now generate images right in the Playground? Just search "image" in the model dropdown!

W&B Weave (@weave_wb) 's Twitter Profile Photo

The Princeton University lab built their agent evaluation harness (HAL) using W&B Weave. They're leveraging Weave to automatically track, monitor, and unify telemetry across different LLM providers and frameworks for consistent, in-depth evaluation.

The <a href="/Princeton/">Princeton University</a> University lab built their agent evaluation harness (HAL) using W&amp;B Weave.

They're leveraging Weave to automatically track, monitor, and unify telemetry across different LLM providers and frameworks for consistent, in-depth evaluation.
W&B Weave (@weave_wb) 's Twitter Profile Photo

New QoL update for wandb Weave. We've added Quick Filters for your evals. Now, you can use the filter dropdown to search by name or dataset. This makes it easier to identify comparable evaluations and analyze performance trends across datasets.
