Arya (@aryagxr)'s Twitter Profile
Arya

@aryagxr

22 | engineering | 📚🧎‍♀️

ID: 818438092222709762

Link: http://aryagxr.com

Joined: 09-01-2017 12:43:41

72 Tweets

19 Followers

173 Following

here's an update on RLing wordle,

my RFT run is all set up for bootstrapping from an initial set of wordle prompts and feedback history.

what's next is to intuit how different config params improve reasoning + speedups, (topK, temperature, beta, etc) but for now I have code that

+1 for this. you can really tell that a lot of work has gone into turning off sycophancy in gpt5, and I’m really liking it. (i noticed emojis have reduced too!)

<experimenting>
the purple run is what loose clipping and dense rewards are doing for me,
clip=1.0, lr=1e-4, weight_decay=1.0

and thanks to Anish (@athreesh) for some generous brev compute credits!

really fun read.
we need evals that probe language models to output “what it sees”, not “what it thinks it sees”.
also a very cool interp approach to visualize the learned world model.

best pipeline to learn something new is to find blogs/papers to read, implement a tiny version from scratch, document it, hack on existing oss