Jason Lopatecki (@jason_lopatecki) 's Twitter Profile
Jason Lopatecki

@jason_lopatecki

Founder/CEO @arizeai, entrepreneur, 2x founder, passion for ML & building companies - Berkeley EECS

ID: 14875695

23-05-2008 01:06:10

1.1K Tweets

508 Followers

347 Following

Mikyo (@mikeldking) 's Twitter Profile Photo

📈 arize-phoenix now has project dashboards! In the latest release Arize AI Phoenix comes with a dedicated project dashboard with: 📈 Trace latency and errors 📈 Latency Quantiles 📈 Annotation Scores Timeseries 📈 Cost over Time by token type 📊 Top Models by Cost 📊 Token

Arize AI (@arizeai) 's Twitter Profile Photo

LLM observability provides structured visibility into how LLMs and agents behave, from individual spans to full multi-turn sessions. With the right instrumentation, teams can improve systems with the same rigor they apply to conventional software. bit.ly/4mgFOay

Aparna Dhinakaran (@aparnadhinak) 's Twitter Profile Photo

Claude Code's 100K tokens feels infinite; your weekly cap isn't. Now that Anthropic’s weekly rate limits are live, managing context is no longer optional. In our traces, long sessions reliably led to higher costs, slower completions, and more drift in output behavior. With too

sanjana (@sanjanayed) 's Twitter Profile Photo

Prebuilt evals don’t always cut it - at some point, you’ll need to build your own LLM evaluator from the ground up, tailored to your exact use case. This can be tricky if you aren't sure where to start. My latest tutorial covers this! We begin by building a benchmark dataset

Arize AI (@arizeai) 's Twitter Profile Photo

Trace your Dify apps in Arize AX! Get deep visibility into tool + agent calls, session flows, and token usage + errors. Setup in seconds: enter your Arize Space ID and API Key in Dify’s Monitoring tab, and start capturing detailed, real-time traces. bit.ly/3HbvjGE

arize-phoenix (@arizephoenix) 's Twitter Profile Photo

Plug arize-phoenix tracing into any Dify app to trace every part of your AI workflow, including: 🧠 LLM messages — capture the full conversation & decision-making process 🛠️ Tools — monitor how tools are used within your workflows 📊 Token usage + errors — monitor

Arize AI (@arizeai) 's Twitter Profile Photo

🔥new @aws blog just dropped covering how to observe and evaluate AI agentic workflows with Strands Agents SDK and Arize AX! Karan Singh go.aws/4l4JiMB

sanjana (@sanjanayed) 's Twitter Profile Photo

Putting together a benchmark dataset to evaluate LLM outputs is underrated. Without ground truth or clear evals, you're flying blind. Not just pass/fail - you want structured, repeatable comparisons across prompts, models, and strategies. Earlier this week, I put out a

Mikyo (@mikeldking) 's Twitter Profile Photo

arize-phoenix 11.18 comes with lots of user request fixes! - Day 0 support for Anthropic Claude Opus 4.1 - Support for retention policy configuration via Helm - Basic support for air-gapped deployments - REST api for deleting spans for redaction - Typescript evals now

sanjana (@sanjanayed) 's Twitter Profile Photo

Loving these updates 👏 Day 0 Claude Opus 4.1 support, air-gapped deploys, redaction APIs, SpringAI support... and the UI polish too?!?! OSS done right. Focused, fast, and actually listening to users🔥

sanjana (@sanjanayed) 's Twitter Profile Photo

Span-level evaluations are powerful, but they only show you part of the picture. To really understand how your AI performs in real conversations, you have to think in sessions. A session captures the full back-and-forth between a user and your app. It reflects how people

Arize AI (@arizeai) 's Twitter Profile Photo

Packed the AWS Loft with builders last night w/ @CrewAIInc & Amazon Web Services—diving into tracing & evals for reliable AI agents. Thanks to João Moura, Karan Singh & Jason Lopatecki for a convo on where agent infra’s headed. Missed it? We’re back Aug 28: bit.ly/4m6YB8M

Mikyo (@mikeldking) 's Twitter Profile Photo

Just hit this issue with Claude not being able to use an MCP server talking to localhost. stackoverflow.com/questions/7950… Feels like there was a secret security issue... Am I wrong?

Mikyo (@mikeldking) 's Twitter Profile Photo

I just asked arize-phoenix MCP to figure out why my experiment went wrong and it nailed it first try. 🎯 Error analysis with LLM in the loop. I hate hyperbole but it did feel pretty magical. It does feel like a signal of a new wave: UIs designed for humans are still important,

Arize AI (@arizeai) 's Twitter Profile Photo

Last month was huge for Arize AX; we shipped: Prompt Learning, Alyx skills in your IDE, Tracing Assistant, session-level & trajectory evals, OpenInference Java, and a lot more! Dive into the July product update 👇 bit.ly/46M8Tq4
