Zach Furman (@furmanzach)'s Twitter Profile
Zach Furman

@furmanzach

Singular learning theory and AI alignment research. Previously embedded SWE, aerospace, and physics.

ID: 1591503649729036288

Link: http://zachfurman.com · Joined: 12-11-2022 18:50:39

12 Tweets

111 Followers

169 Following

Aran Komatsuzaki (@arankomatsuzaki):

Eliciting Latent Predictions from Transformers with the Tuned Lens

Analyzes transformers from the perspective of iterative inference, seeking to understand how model predictions are refined layer by layer.

repo: github.com/AlignmentResea…
abs: arxiv.org/abs/2303.08112
Nora Belrose (@norabelrose):

Ever wonder how a language model decides what to say next?

Our method, the tuned lens (arxiv.org/abs/2303.08112), can trace an LM’s prediction as it develops from one layer to the next. It's more reliable and applies to more models than prior state-of-the-art. 🧵
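For a concrete picture of what the tuned lens does mechanically, here is an illustrative sketch (not the authors' implementation; that lives in the linked repo). It decodes each layer's hidden state through a per-layer affine "translator" followed by the model's final layer norm and unembedding. With the identity-initialized, untrained translators used here, it reduces to the classic logit-lens baseline; the tuned lens trains these translators to match the model's final-layer distribution.

```python
# Illustrative sketch of the tuned-lens idea, NOT the authors' code.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # assumption: any causal LM that exposes hidden states
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).eval()

d, n_layers = model.config.hidden_size, model.config.n_layer

# One affine translator per layer. The tuned lens trains these to minimize
# KL against the model's final-layer distribution; initialized to the
# identity here, the sketch degenerates to the logit-lens baseline.
translators = []
for _ in range(n_layers):
    lin = torch.nn.Linear(d, d)
    torch.nn.init.eye_(lin.weight)
    torch.nn.init.zeros_(lin.bias)
    translators.append(lin)

inputs = tok("The capital of France is", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs, output_hidden_states=True)
    # Decode the last position's hidden state at every layer.
    for layer, h in enumerate(out.hidden_states[1:], start=1):
        h_last = translators[layer - 1](h[0, -1])  # affine translator
        # ln_f / lm_head use GPT-2 naming; other architectures differ.
        logits = model.lm_head(model.transformer.ln_f(h_last))
        print(f"layer {layer:2d}: {tok.decode([logits.argmax().item()])!r}")
```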
Daniel Murfet (@danielmurfet):

Timaeus is a new research organization dedicated to making fundamental breakthroughs in technical AI alignment using deep ideas from mathematics and the sciences. Led by Jesse Hoogland, Consistently Candid Alex, Stan van Wingerden, and myself. lesswrong.com/posts/nN7bHuHZ… [1/n]
Jesse Hoogland (@jesse_hoogland):

1/8 How do transformers learn? In our new work, we find that transformers develop in-context learning in discrete stages that can be automatically discovered. 🧵

arxiv.org/abs/2402.02364

Joint work w/ george, Matthew Farrugia-Roberts, Liam Carroll, Susan Wei, Daniel Murfet
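As a rough illustration of "stages that can be automatically discovered": the paper draws on singular learning theory (the local learning coefficient) to locate stage boundaries, but the toy sketch below only shows the generic shape of the idea, segmenting any per-checkpoint scalar wherever its smoothed slope changes sign. All names and thresholds are placeholders, not the paper's method.

```python
# Toy illustration only, not the paper's stage-discovery procedure.
import numpy as np

def stage_boundaries(values: np.ndarray, window: int = 5, tol: float = 1e-3):
    """Indices where the smoothed slope of `values` flips sign."""
    smooth = np.convolve(values, np.ones(window) / window, mode="valid")
    boundaries, last_sign = [], 0
    for i, s in enumerate(np.diff(smooth)):
        if abs(s) < tol:  # ignore near-flat regions
            continue
        sign = 1 if s > 0 else -1
        if last_sign and sign != last_sign:
            boundaries.append(i)
        last_sign = sign
    return boundaries

# Synthetic three-stage curve: rise, fall, rise (breaks at 100 and 200).
t = np.arange(100)
curve = np.concatenate([0.01 * t, 1.0 - 0.005 * t, 0.5 + 0.02 * t])
print(stage_boundaries(curve))  # two boundaries near indices 100 and 200
```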
george (@georgeyw_):

1/ How do attention heads form?

With our new approach, we show that attention heads have distinct developmental signatures. These signatures reveal how heads develop distinct functional roles specialized to different subsets of data. In the process, we discover a new circuit.
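One hypothetical way to picture a "developmental signature" (a stand-in, not the method from the tweet): track a scalar statistic per head across training checkpoints and cluster the resulting trajectories by shape, so that heads with similar developmental profiles group together. The data, statistic, and cluster count below are all placeholders.

```python
# Hypothetical sketch: cluster per-head developmental trajectories.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
n_checkpoints, n_layers, n_heads = 50, 2, 8

# Placeholder trajectories; in practice, e.g. each head's mean attention
# entropy on a fixed evaluation batch at every training checkpoint.
traj = rng.normal(size=(n_layers * n_heads, n_checkpoints)).cumsum(axis=1)

# Standardize each trajectory so clustering compares shape, not scale.
traj = (traj - traj.mean(axis=1, keepdims=True)) / traj.std(axis=1, keepdims=True)

labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(traj)
for head, lab in enumerate(labels):
    layer, idx = divmod(head, n_heads)
    print(f"layer {layer} head {idx}: cluster {lab}")
```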