Priyan Jindal (@priyanjindal) Twitter Tweets • TwiCopy

Priyan Jindal

8 months ago

🚀 Just released a lightweight, end-to-end demo for self-hosting Phoenix + your app on AWS with CloudFormation! 🔧 Leverage our pre-built CFN templates to spin up VPCs, ECS/Fargate, Secrets Manager, ALBs—and stream OpenTelemetry traces into Phoenix in minutes. Dive in 👇

8 likes · 1 reply · 2 reposts

Priyan Jindal

@priyanjindal

6 months ago

o3 plays pokémon! only took it 400 hours. how could you do it faster? @lukasagross from OpenAI talks about best practices for building functional multi agent systems. Arize AI #arizeobserve

4 likes · 0 replies · 1 repost

Mikyo

@mikeldking

5 months ago

🔧 arize-phoenix mcp gets phoenix-support tool for Cursor / Anthropic Claude / Windsurf! You can now click the add to cursor button on phoenix and get a continuously updating MCP server config directly integrated into your IDE. @arizeai/[email protected] also comes

16 likes · 0 replies · 7 reposts

Aman Khan

@_amankhan

5 months ago

It was pretty amazing to have a small role on this one and see the team execute at such a detailed level - shoutout to Andy Dai (researcher), Julia Gomes, Priyan Jindal, Jason Lopatecki and Aparna Dhinakaran for rigorously testing an early idea that made sense intuitively and

4 likes · 0 replies · 1 repost

Priyan Jindal

@priyanjindal

5 months ago

Can’t stop talking about prompt learning at events lately. It’s awesome seeing people light up when they realize their agents can optimize themselves through autonomous prompt updates. Big thank you to Rootly and Google DeepMind for putting together this incredible event!!

16 likes · 0 replies · 3 reposts

Priyan Jindal

@priyanjindal

4 months ago

awesome. evals are crucial when building agents, the same way testing is crucial when building software. more education in this space = more people building better evals.

2 likes · 0 replies · 0 reposts

Priyan Jindal

@priyanjindal

4 months ago

I worked on a lot of the eval building and experimentation for this new Prompt Learning release - feel free to ask me anything!

6 likes · 0 replies · 0 reposts

Priyan Jindal

@priyanjindal

4 months ago

Wanna see how Claude Code works? We got it all 😎😎😎😎😎😎

1 like · 0 replies · 0 reposts

arize-phoenix

@arizephoenix

2 months ago

LLM pipelines and agents fail in complex ways. This isn't just limited to bad generations. It includes broken retrievals, inefficient prompts, or latency spikes that are hard to trace. With so much changing across the space, we figured it was time for a fresh LLM Ops tutorial

4 likes · 1 reply · 2 reposts

Priyan Jindal

@priyanjindal

2 months ago

was testing my MMLU setup with gpt-4o-mini and accidentally only passed it the first multiple-choice option for every question. it still scored ~70%. maybe MMLU leaked into gpt-4o-mini training data, but still interesting to see. Used Phoenix to run MMLU

1 like · 0 replies · 0 reposts

Priyan Jindal

@priyanjindal

2 months ago

x.com/aparnadhinak/s…
My biggest takeaways from our awesome results:
- Even coding agents, with intricate architectures and tools, can see HUGE improvements just through optimizing their prompts!
- Writing rules for coding agents also requires evals. Otherwise, how do we know

2 likes · 0 replies · 0 reposts