Anthropic

@anthropicai

We're an AI safety and research company that builds reliable, interpretable, and steerable AI systems. Talk to our AI assistant Claude at Claude.ai.

ID: 1353836358901501952

Link: http://anthropic.com

Joined: 25-01-2021 22:45:28

872 Tweets

515,515K Followers

35 Following

Anthropic

@anthropicai

2 months ago

Our interpretability team recently released research that traced the thoughts of a large language model. Now we’re open-sourcing the method. Researchers can generate “attribution graphs” like those in our study, and explore them interactively.

Likes: 4,4K

Replies: 103

Reposts: 576
