Transluce (@transluceai)'s Twitter Profile
Transluce

@transluceai

Open and scalable technology for understanding AI systems.

ID: 1844990000754196482

Link: http://transluce.org

Joined: 12-10-2024 06:34:33

90 Tweets

7.7K Followers

10 Following

Sayash Kapoor (@sayashk)

Agent benchmarks lose *most* of their resolution because we throw out the logs and only look at accuracy. I’m very excited that HAL is incorporating Transluce’s Docent to analyze agent logs in depth. Peter’s thread is a simple example of the type of analysis this enables,