Freddie Vargus (@freddie_v4)'s Twitter Profile
Freddie Vargus

@freddie_v4

CTO & co-founder @quotientai
Research @cohere_labs

past: evals @github Copilot, data @quantopian

Tico 🇨🇷🇺🇸

ID: 614740050

Link: https://github.com/quotient-ai/judges
Joined: 21-06-2012 23:20:04

524 Tweets

799 Followers

1.1K Following

Julia Neagu (@juliaaneagu)'s Twitter Profile Photo

Most teams only find out their AI is broken when someone complains or churns. Your agents shouldn’t fail silently. We’re launching Quotient AI Detections: a system to catch agent mistakes, identify how they happened, and automatically fix them.

Julia Neagu (@juliaaneagu)'s Twitter Profile Photo

If you're building AI apps and flying blind, we can help.

→ Sign up: app.quotientai.co
→ Grab $250 in credits with the ElevenLabs AI Engineer Pack: aiengineerpack.com
→ Join our Discord: discord.com/invite/YeJzANp…

Let us help you understand how your agents fail,

Julia Neagu (@juliaaneagu)'s Twitter Profile Photo

We just launched Quotient AI Detections: the first system that helps teams catch AI failures before their users do. As part of the launch, we partnered with ElevenLabs to offer coupons through the AI Engineer Pack:

→ 1,000,000 extra logs
→ 10,000 free detections
→ $250+

Mingxuan (Aldous) Li (@itea1001)'s Twitter Profile Photo

HypoEval evaluators (github.com/ChicagoHAI/Hyp…) are now incorporated into judges from Quotient AI — check it out at github.com/quotient-ai/ju…!
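For context, the LLM-as-judge pattern behind evaluator libraries like judges boils down to prompting a model with a grading rubric and parsing its verdict. Below is a minimal sketch of that pattern using the OpenAI client; the rubric text, model choice, and judge function are illustrative assumptions, not the judges library's actual API.

```python
# Minimal LLM-as-judge sketch (illustrative; not the judges library's API).
# Assumptions: OpenAI Python client, a hypothetical binary rubric, gpt-4o-mini.
from openai import OpenAI

client = OpenAI()

RUBRIC = (
    "You are an evaluator. Given a QUESTION and an ANSWER, reply with "
    "exactly one word: correct or incorrect."
)

def judge(question: str, answer: str) -> bool:
    """Ask a model to grade an answer; return True if judged correct."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": RUBRIC},
            {"role": "user", "content": f"QUESTION: {question}\nANSWER: {answer}"},
        ],
        temperature=0,
    )
    verdict = response.choices[0].message.content.strip().lower()
    return verdict == "correct"

print(judge("What is 2 + 2?", "4"))  # -> True, model permitting
```

Libraries like judges wrap this loop with tested rubrics (including research-derived ones such as HypoEval's) so you don't have to write and validate the prompt yourself.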

Julia Neagu (@juliaaneagu)'s Twitter Profile Photo

detections go brrr

One week in, Quotient AI Detections has processed 20M+ tokens, analyzed tens of thousands of logs, and caught thousands of hallucinations across real AI production apps.

Still a long way to go, but we're committed to giving builders SOTA AI monitoring.

John Berryman (@jnbrymn)'s Twitter Profile Photo

You probably knew you could turn an LLM into a classifier. But the basic approach returns a hard classification – yes or no; good or bad.

In this post I'll show you a simple technique to make your LLM classifiers more nuanced – they tell you HOW good or HOW bad something is. 👇
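
One common way to get that graded signal (a sketch of the general idea, which may or may not match the linked post's exact technique): cap the reply at a single yes/no token and read the token logprobs, turning the hard label into a probability. The prompt, model name, and soft_classify helper below are assumptions for illustration.

```python
# Soft LLM classifier sketch: score = P("yes") from token logprobs,
# instead of a hard yes/no label. Names and prompt are illustrative.
import math
from openai import OpenAI

client = OpenAI()

def soft_classify(text: str) -> float:
    """Return the model's probability that the review is positive."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Answer with exactly one word: yes or no."},
            {"role": "user", "content": f"Is this product review positive? {text}"},
        ],
        max_tokens=1,
        temperature=0,
        logprobs=True,
        top_logprobs=5,
    )
    # Logprobs of the candidate first tokens, converted to probabilities.
    top = response.choices[0].logprobs.content[0].top_logprobs
    probs = {t.token.strip().lower(): math.exp(t.logprob) for t in top}
    p_yes = probs.get("yes", 0.0)
    p_no = probs.get("no", 0.0)
    # Renormalize over the two labels so scores are comparable across inputs.
    return p_yes / (p_yes + p_no) if (p_yes + p_no) > 0 else 0.5

print(soft_classify("Great battery life, terrible screen."))  # e.g. ~0.6
```

The payoff is a continuous score you can threshold, rank by, or monitor over time, rather than a binary verdict.
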
Freddie Vargus (@freddie_v4)'s Twitter Profile Photo

so who’s the bad node in the dependency graph right now? supabase, aws, gcp, cursor, cloudflare, azure… surely something upstream?

Julia Neagu (@juliaaneagu)'s Twitter Profile Photo

“You want your model hitting milestones, not minefields.” Most AI eval talk is hand-wavy. This isn’t. Freddie Vargus (Quotient AI CTO) gets into the weeds: how to actually test tool use, avoid minefields, and build agents that don’t break. Check out the recording 👇
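
To give a flavor of what testing tool use can look like (a generic sketch, not the method from the talk): replay a prompt against a model with a tool schema attached, then assert on the tool call it emits. The get_weather tool, model choice, and assertions below are hypothetical.

```python
# Tool-use test sketch: check that the model calls the right tool with
# the right arguments for a known prompt. Schema and names are illustrative.
import json
from openai import OpenAI

client = OpenAI()

TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

def test_weather_tool_call():
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "What's the weather in Oslo?"}],
        tools=TOOLS,
    )
    calls = response.choices[0].message.tool_calls or []
    assert len(calls) == 1, "expected exactly one tool call"
    assert calls[0].function.name == "get_weather"
    args = json.loads(calls[0].function.arguments)
    assert args.get("city", "").lower() == "oslo"

test_weather_tool_call()
```

Running a suite of cases like this against every prompt or model change is one concrete way to catch tool-use regressions before they reach production.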