Shawn Lewis (@shawnup) Twitter Tweets • TwiCopy

Shawn Lewis

9 months ago

This is the improvement Claude Code needed to be great. Just keep going! I don’t care how you do it or what the context looks like. Looking forward to trying it.

Likes: 19 · Replies: 1 · Reposts: 0

Shawn Lewis

@shawnup

9 months ago

Hmm…. Should I do a swebench run with this?

Likes: 19 · Replies: 5 · Reposts: 1

Shawn Lewis

@shawnup

8 months ago

Why am I debugging AI’s code instead of AI debugging my code?

Likes: 26 · Replies: 5 · Reposts: 3

Shawn Lewis

@shawnup

8 months ago

Does discouraging external linking on X make it a better model training corpus?

Likes: 0 · Replies: 1 · Reposts: 0

Shawn Lewis

@shawnup

8 months ago

Legend

Likes: 2 · Replies: 0 · Reposts: 0

CoreWeave is the first cloud provider to submit MLPerf Inference v5.0 results for @NVIDIA GB200 GPUs, achieving a 4X per-chip performance improvement over H200 GPUs. We are committed to delivering the fastest and most efficient AI infrastructure. hubs.la/Q03fBklc0

Likes: 113 · Replies: 6 · Reposts: 17

Bespoke Labs

@bespokelabsai

8 months ago

OpenAI’s o4 just showed that multi-turn tool use is a huge deal for AI agents. Today, we show how to do the same with your own agents, using RL and open-source models. We used GRPO on only 100 high quality questions from the BFCL benchmark, and post-trained a 7B Qwen model to

Likes: 380 · Replies: 21 · Reposts: 50

Shawn Lewis

@shawnup

8 months ago

I got my first startup office space in SF from Aneel for free back in 2007. Highly recommended. He's an awesome host!

Likes: 3 · Replies: 0 · Reposts: 0

Scott Condron

@_scottcondron

7 months ago

New Evals API

I’m excited to share a new API for logging evals with W&B Weave.

EvaluationLogger
- log_prediction
- log_score
- log_summary

Our design goal for this API was to get out of your way and build the most flexible eval API out there, inspired by wandb.log, which our

Likes: 16 · Replies: 3 · Reposts: 8

Kyle Corbitt

@corbtt

7 months ago

🚀 Meet ART·E—our open-source RL-trained email research agent that searches your inbox and answers questions more accurately, faster, and cheaper than o3. Let's go deeper on how we built it. 🧵

Likes: 948 · Replies: 38 · Reposts: 117

Yiping Wang

@ypwang61

7 months ago

We only need ONE example for RLVR on LLMs to achieve significant improvement on math tasks!

📍RLVR with one training example can boost:
- Qwen2.5-Math-1.5B: 36.0% → 73.6%
- Qwen2.5-Math-7B: 51.0% → 79.2%
on MATH500.

📄 Paper: arxiv.org/abs/2504.20571

Likes: 413 · Replies: 14 · Reposts: 84

Chen Goldberg

@goldbergchen

7 months ago

We’ve officially completed our acquisition of Weights & Biases, and I couldn’t be more excited. Combining CoreWeave high-performance AI cloud with W&B’s incredible developer tools unlocks new levels of innovation for our customers. Together, we’re building the next-gen AI cloud

Likes: 49 · Replies: 4 · Reposts: 3

Fastino

@fastinoai

7 months ago

BIG NEWS: Fastino raises $17.5M Seed to launch TLMs – Task-Specific Language Models that beat GPT on accuracy and latency. Led by jon chu at Khosla Ventures + joined by George K. Mathew at Insight Partners, agracias at @valorep, Scott Johnston (ex-Docker CEO), and Lukas Biewald (CEO of