Priyan Jindal (@priyanjindal)'s Twitter Profile
Priyan Jindal

@priyanjindal

AI Engineer at Arize AI

ID: 1914416021731205121

21-04-2025 20:29:02

2 Tweets

3 Followers

8 Following

Priyan Jindal (@priyanjindal)

🚀 Just released a lightweight, end-to-end demo for self-hosting Phoenix + your app on AWS with CloudFormation! 🔧 Leverage our pre-built CFN templates to spin up VPCs, ECS/Fargate, Secrets Manager, ALBs—and stream OpenTelemetry traces into Phoenix in minutes. Dive in 👇
Priyan Jindal (@priyanjindal)

o3 plays pokémon! only took it 400 hours. how could you do it faster? @lukasagross from OpenAI talks about best practices for building functional multi agent systems. Arize AI #arizeobserve
Mikyo (@mikeldking)

🔧 arize-phoenix mcp gets a phoenix-support tool for Cursor / Anthropic Claude / windsurf! You can now click the add to cursor button on phoenix and get a continuously updating MCP server config directly integrated into your IDE. @arizeai/[email protected] also comes

Aman Khan (@_amankhan)

It was pretty amazing to have a small role on this one and see the team execute at such a detailed level - shoutout to Andy Dai (researcher), Julia Gomes, Priyan Jindal, Jason Lopatecki, and Aparna Dhinakaran for rigorously testing an early idea that made sense intuitively and

Priyan Jindal (@priyanjindal)

Can’t stop talking about prompt learning at events lately. It’s awesome seeing people light up when they realize their agents can optimize themselves through autonomous prompt updates.

Big thank you to Rootly and Google DeepMind for putting together this incredible event!!
Priyan Jindal (@priyanjindal)

awesome. evals are crucial when building agents, the same way testing is crucial when building software. more education in this space = more people building better evals.

Priyan Jindal (@priyanjindal)

I worked on a lot of the eval building and experimentation for this new Prompt Learning release - feel free to ask me anything!

arize-phoenix (@arizephoenix)

LLM pipelines and agents fail in complex ways. This isn't just limited to bad generations. It includes broken retrievals, inefficient prompts, or latency spikes that are hard to trace.  

With so much changing across the space, we figured it was time for a fresh LLM Ops tutorial
Priyan Jindal (@priyanjindal)

was testing my MMLU setup with gpt-4o-mini and accidentally only passed it the first multiple-choice option for every question. it still scored ~70%. 

maybe MMLU leaked into gpt-4o-mini training data, but still interesting to see

Used Phoenix to run MMLU
Priyan Jindal (@priyanjindal)

x.com/aparnadhinak/s…

My biggest takeaways from our awesome results:
- Even coding agents, with intricate architectures and tools, can see HUGE improvements just through optimizing their prompts!
- Writing rules for coding agents also requires evals. Otherwise, how do we know