Ian Webster (@iwebst)'s Twitter Profile
Ian Webster

@iwebst

building @Promptfoo (LLM security) + "curator of the world's largest digital dinosaur database"

ID: 1009245427

Link: https://www.promptfoo.dev/ · Joined: 13-12-2012 16:51:15

379 Tweets

2.2K Followers

405 Following

Ian Webster (@iwebst)'s Twitter Profile Photo

Finally got around to publishing the largest per-parcel California property tax dataset - on Kaggle kaggle.com/datasets/iwebs… These are the numbers behind the CA property tax map, which still gets a surprising amount of traffic and data requests officialdata.org/ca-property-ta…

Ian Webster (@iwebst)'s Twitter Profile Photo

promptfoo has passed 250,000 evals + thousands of users from companies like Microsoft, Salesforce, Intel. Open source, developer-first, organic grass-fed LLM evals (choose the best prompt and model) If you are serious about deploying LLMs, check it out! promptfoo.dev

Ian Webster (@iwebst)'s Twitter Profile Photo

The Eta Aquariids peak this weekend, when the Earth passes through a dust cloud left by Halley's Comet meteorshowers.org

tobi lutke (@tobi)'s Twitter Profile Photo

For all our LLM (and many ML) projects at Shopify we standardized on promptfoo.dev for writing evals. That has caused a lot of great progress speedups. Highly recommended if you are looking for good eval system. Fun realization: TDD is alive and well in the ML world!

Ian Webster (@iwebst)'s Twitter Profile Photo

promptfoo's 0.62.0 changelog is very crowded for just 1 week of work. Shoutout to maltais (Shopify), Rohit Agarwal (Portkey), Peli de Halleux (Microsoft), @dlssrt, Michael D'Angelo and many others not on Twitter. Open source makes the world go round 💪

Ian Webster (@iwebst)'s Twitter Profile Photo

Just added automated LLM red teaming/vulnerability scanning to promptfoo! Uncovers app-specific jailbreaks and failure modes... Completely open source promptfoo.dev/llm-vulnerabil…

Jason Major (@jpmajor)'s Twitter Profile Photo

This is a picture of the 9.8 km-long moon Daphnis traveling inside the Keeler Gap in Saturn's A ring, captured with Cassini's narrow-angle camera 14 years ago on July 5, 2010

Ian Webster (@iwebst)'s Twitter Profile Photo

We all knew image generators can be jailbroken to generate violent and graphic content. It turns out the jailbreak process can be fully automated :o Write-up here (open-source) with examples from OpenAI's Dall-E promptfoo.dev/blog/jailbreak…

Derrick Harris (@derrickharris)'s Twitter Profile Photo

Democratizing Generative AI Red Teams a16z.com/podcast/securi… < Really good discussion with Ian Webster (of promptfoo ) and Anjney Midha 🇺🇸 about securing AI products at the app layer, rather than focusing on models.

Ian Webster (@iwebst)'s Twitter Profile Photo

Anthropic has been quietly publishing top notch content on LLM fundamentals. Lots of great examples of using Promptfoo for evals in this new course!

👩‍💻 Paige Bailey (@dynamicwebpaige)'s Twitter Profile Photo

❤️ Love seeing promptfoo + Google AI Studio Gemini 2.5 experiments in the wild! Check out this example benchmark that gauges models' ability to help people in need apply for SNAP benefits: propel.app/insights/build…
