Writer (@get_writer)'s Twitter Profile
Writer

@get_writer

Writer is where the world’s leading enterprises orchestrate AI-powered work | Dream Big, Build Fast | Fueled by our Palmyra LLMs

ID: 329933902

Link: http://writer.com

Joined: 05-07-2011 21:01:10

1.1K Tweets

7.7K Followers

35 Following

New research from the Writer team: We trained LLMs to improve their reasoning by rewarding the skill of self-reflection using reinforcement learning (GRPO) on a model’s reflection tokens, not its output.

The result:
👉 Improved performance on verifiable tasks (up to 34.7% on