Tanishq Mathew Abraham, Ph.D. (@iscienceluvr)'s Twitter Profile
Tanishq Mathew Abraham, Ph.D.

@iscienceluvr

CEO @SophontAI |
PhD at 19 (2023) |
Founder, ex CEO @MedARC_AI |
ex Research Director Stability AI |
Biomed. engineer @ 14 |
TEDx talk➡bit.ly/3tpAuan

ID: 441465751

Link: https://tanishq.ai

Joined: 20-12-2011 03:45:50

16.16K Tweets

75.75K Followers

1.1K Following

Reinforcement Learning with Verifiable Rewards Implicitly Incentivizes Correct Reasoning in Base LLMs

"The Pass@K metric itself is a flawed measure of reasoning, as it credits correct final answers that probably arise from inaccurate or incomplete chains of thought (CoTs). To