Varun Joshi (@varjoshi)'s Twitter Profile
Varun Joshi

@varjoshi

eng @PatronusAI | prev @substackinc, @awscloud, @uchicago

ID: 2354327713

21-02-2014 06:00:23

5 Tweets

133 Followers

718 Following

PatronusAI (@patronusai)

1/ Introducing Lynx - the leading hallucination detection model 🚀👀 - Beats GPT-4o on hallucination tasks - Open source, open weights, open data - Excels in real-world domains like medicine and finance We are excited to launch Lynx with Day 1 integration partners: NVIDIA,

PatronusAI (@patronusai)

1/ Introducing the Patronus API: powerful AI evaluation models to accelerate your AI development 🚀 - 20% more accurate than ragas on hallucination detection - Beats Perspective and Llama Guard on safety tasks by 28% and 11% - Excels in practical domains like finance and

PatronusAI (@patronusai)

1/ Ever tried to remember the name of a movie you’ve seen – you can picture the scenes clearly, but the movie name won’t come to you? Introducing BLUR: the first agent benchmark for tip-of-the-tongue search and reasoning 🔥 We benchmarked SOTA agents and found that the

PatronusAI (@patronusai)

1/ 🔥🔥 Big news: We’re launching Percival, the first AI agent that can evaluate and fix other AI agents! 🤖 Percival is an evaluation agent that doesn’t just detect failures in agent traces — it can fix them. Percival outperformed SOTA LLMs by 2.9x on the TRAIL dataset,