matt turk (@turkmatthew)'s Twitter Profile
matt turk

@turkmatthew

Data Science @cleanlabAI. Prev: Investing & ML @goodwatercap, Quant/ML @coinbase & @goldmansachs, prop trading, EECS @ucberkeley

ID: 517916836

Link: https://signal.nfx.com/investors/matt-turk · Joined: 07-03-2012 20:35:17

1.1K Tweets

611 Followers

1.1K Following

Cleanlab (@cleanlabai)

Evaluation models for RAG aim to detect incorrect responses in real time, but can they actually do so without any ground-truth answers/labels?

Just published: a benchmark across six RAG applications comparing popular evaluation models: LLM-as-a-Judge, Prometheus, Lynx, HHEM, TLM.
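For readers who want to try this kind of reference-free evaluation themselves, here is a minimal LLM-as-a-Judge sketch in Python. It assumes an OpenAI-compatible client; the judge model, prompt wording, and 0–1 scoring scale are illustrative choices, not the setup used in the benchmark, and the other evaluators (Prometheus, Lynx, HHEM, TLM) each have their own APIs.

```python
# Minimal LLM-as-a-Judge sketch for reference-free RAG evaluation.
# The model name and prompt wording are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

JUDGE_PROMPT = """You are grading a RAG system's answer.
Context:
{context}

Question:
{question}

Answer:
{answer}

Is the answer fully supported by the context? Reply with a single number
from 0 (unsupported/incorrect) to 1 (fully supported/correct)."""

def judge_response(context: str, question: str, answer: str) -> float:
    """Return a reference-free correctness score for one RAG response."""
    completion = client.chat.completions.create(
        model="gpt-4o-mini",  # any judge model works here
        messages=[{"role": "user", "content": JUDGE_PROMPT.format(
            context=context, question=question, answer=answer)}],
        temperature=0,
    )
    try:
        return float(completion.choices[0].message.content.strip())
    except ValueError:
        return 0.0  # unparseable judgments are treated as failures
```
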
Curtis G. Northcutt (@cgnorthcutt)

Tomorrow I'm spilling the secrets as to how several Fortune 500 @cleanlabai customers are solving the hardest problem in AI -- producing accurate, compliant, safe, fully automated AI agent responses -- at the AI User Group Conference in SF.

Stop by and get your hands dirty and …
matt turk (@turkmatthew)

If you use NVIDIA NeMo Guardrails for LLM app reliability, try integrating our Cleanlab Trustworthy Language Model: developers can add additional safeguards against hallucinations and untrustworthy responses when building LLM-based applications.
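A rough Python sketch of what wiring this together might look like. The output-rail flow name ("cleanlab trustworthiness") and the model settings are assumptions based on the integration announcement; check the NeMo Guardrails and Cleanlab docs for the exact configuration.

```python
# Hedged sketch: enabling a Cleanlab trustworthiness check as an output rail
# in NVIDIA NeMo Guardrails. The flow name below is assumed, not confirmed.
from nemoguardrails import LLMRails, RailsConfig

YAML_CONFIG = """
models:
  - type: main
    engine: openai
    model: gpt-4o-mini

rails:
  output:
    flows:
      - cleanlab trustworthiness   # assumed flow name for the TLM check
"""

config = RailsConfig.from_content(yaml_content=YAML_CONFIG)
rails = LLMRails(config)

response = rails.generate(messages=[
    {"role": "user", "content": "What is our refund policy?"}
])
print(response["content"])  # low-trust answers are handled by the output rail
```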

matt turk (@turkmatthew)

We at Cleanlab reproduced, contained, and fixed Cursor's rogue AI support bot with automated AI safety software, an incident that sits atop a growing list of customer-support AI agent meltdowns causing serious damage to customer trust. Reach out if you care about …

Cleanlab (@cleanlabai)

Cleanlab is now integrated into langfuse.com's observability platform!

We're adding real-time trust scores to LLM outputs to quickly surface the most problematic responses for Langfuse users.
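As a rough illustration of what this looks like from the application side, here is a hedged Python sketch using the v2-style Langfuse SDK. The `score_trustworthiness()` helper is a hypothetical stand-in for Cleanlab's scoring call, and the score name is just an example; the actual integration may wire this up automatically.

```python
# Hedged sketch: attaching a real-time trust score to an LLM interaction in
# Langfuse. Assumes the v2-style Langfuse Python SDK; score_trustworthiness()
# is a hypothetical placeholder for Cleanlab's scoring call.
from langfuse import Langfuse

langfuse = Langfuse()  # reads LANGFUSE_* keys from the environment

def score_trustworthiness(prompt: str, response: str) -> float:
    """Placeholder for a Cleanlab trustworthiness score in [0, 1]."""
    raise NotImplementedError

prompt, llm_response = "What plans include SSO?", "SSO is included in the Team plan."

trace = langfuse.trace(name="support-query", input=prompt, output=llm_response)
trace.score(
    name="trust_score",  # surfaces next to the trace in the Langfuse UI
    value=score_trustworthiness(prompt, llm_response),
    comment="Cleanlab trustworthiness score",
)
```
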
AWS Developers (@awsdevelopers)

🔍 Unlock generative AI success with quality data! Join #AWS & Cleanlab for an exclusive workshop at SFO Gen AI Loft on May 9, 2025.

Learn to build & scale production-ready AI solutions from experts. For developers & decision-makers.

Register now! 👉 go.aws/3SeOxwU
MLflow (@mlflow)

Curious about how to systematically evaluate and improve the trustworthiness of your LLM applications? 🤔 Check out how Cleanlab's Trustworthy Language Model (TLM) integrates with #MLflow!

TLM analyzes both prompts and responses to flag potentially untrustworthy outputs, no …
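A hedged sketch of how such trust scores could be logged to MLflow for comparison across runs. The `get_trust_score()` helper is a hypothetical placeholder for the TLM call; the official MLflow integration may expose this differently (for example, as an evaluation metric).

```python
# Hedged sketch: logging per-response trustworthiness scores to MLflow so that
# untrustworthy outputs can be compared across runs. get_trust_score() is a
# hypothetical stand-in for Cleanlab's TLM scoring call.
import mlflow

def get_trust_score(prompt: str, response: str) -> float:
    """Placeholder for Cleanlab TLM: higher means more trustworthy."""
    raise NotImplementedError

eval_set = [
    ("What is the capital of France?", "Paris."),
    ("Summarize our Q3 revenue.", "Q3 revenue was $12M."),  # possibly hallucinated
]

with mlflow.start_run(run_name="tlm-trust-eval"):
    scores = [get_trust_score(p, r) for p, r in eval_set]
    for i, score in enumerate(scores):
        mlflow.log_metric("trust_score", score, step=i)   # per-example score
    mlflow.log_metric("mean_trust_score", sum(scores) / len(scores))
```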
matt turk (@turkmatthew)

You can now use Cleanlab with LlamaIndex 🦙 to make your production AI agents trustworthy and actually root-cause why certain responses are untrustworthy (knowledge gaps/poor retrieval, bad data, hallucination, etc.).
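A hedged sketch of this workflow with LlamaIndex: query an index, score the answer, and inspect the retrieved sources when trust is low to see whether retrieval or generation is at fault. The `score_trustworthiness()` helper and the 0.5 threshold are illustrative assumptions, not the official integration.

```python
# Hedged sketch: scoring a LlamaIndex query-engine response and inspecting the
# retrieved sources to help root-cause low-trust answers (poor retrieval vs.
# hallucination). score_trustworthiness() is a hypothetical placeholder.
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

def score_trustworthiness(question: str, answer: str) -> float:
    """Placeholder for a Cleanlab trustworthiness score in [0, 1]."""
    raise NotImplementedError

documents = SimpleDirectoryReader("docs/").load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine(similarity_top_k=3)

question = "Does the Pro plan include audit logs?"
response = query_engine.query(question)
trust = score_trustworthiness(question, str(response))

if trust < 0.5:
    # Low trust: check whether retrieval even surfaced relevant context.
    for node in response.source_nodes:
        print(node.score, node.node.get_content()[:200])
```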

Alex 👋 (@dubs408)

Based on last preseason odds it's not even close too lmao

Pacers +6600
'15 Warriors +2800
'11 Mavs +2000
'19 Raptors +1850
'23 Nuggets +1800
'04 Pistons +1500

LangChain (@langchainai)

🛑Prevent Hallucinated Responses

Our integration with Cleanlab allows developers to catch agent failures in real time.

To make this more concrete, they put together a blog post and a tutorial showing how to do this for a customer-support agent.

Blog: cleanlab.ai/blog/prevent-h…
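
For a flavor of the pattern, here is a hedged LangChain sketch that guards a support chain and swaps in a fallback answer when the trust score is low. The `score_trustworthiness()` helper, the 0.7 threshold, and the prompt are illustrative assumptions; see the linked blog and tutorial for the actual integration.

```python
# Hedged sketch: guarding a LangChain chain so low-trust answers are replaced
# with a safe fallback before reaching the user. score_trustworthiness() is a
# hypothetical placeholder for Cleanlab's real-time scoring call.
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnableLambda
from langchain_openai import ChatOpenAI

def score_trustworthiness(question: str, answer: str) -> float:
    """Placeholder for a real-time trustworthiness score in [0, 1]."""
    raise NotImplementedError

prompt = ChatPromptTemplate.from_template(
    "You are a customer-support agent. Answer: {question}"
)
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

def guard(inputs: dict) -> str:
    # Generate an answer, then gate it on the trust score before returning.
    answer = (prompt | llm).invoke(inputs).content
    if score_trustworthiness(inputs["question"], answer) < 0.7:
        return "I'm not sure about that; let me connect you with a human agent."
    return answer

guarded_chain = RunnableLambda(guard)
print(guarded_chain.invoke({"question": "Can I get a refund after 90 days?"}))
```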