Saurabh Srivastava (@_saurabh)'s Twitter Profile
Saurabh Srivastava

@_saurabh

Building the next stage of AI @ Essential AI
Previously: 2x YC (W15, S18); PhD + Postdoc in Program Synthesis

ID: 17696643

28-11-2008 02:32:24

171 Tweets

929 Followers

721 Following

More than 50% of the reported reasoning abilities of LLMs might not be true reasoning.

How do we evaluate models trained on the entire internet? I.e., what novel questions can we ask of something that has seen all written knowledge? Below: new eval, results, code, and paper.