Dheeraj Mekala
@mekaladheeraj
Ph.D. student at @UCSanDiego. Research Scientist Intern at Llama Research @MetaAI
Previously FAIR, @msftresearch, @AmazonScience, @iitkanpur
Data! Data! Data!
ID: 1003938762382905344
http://dheeraj7596.github.io/ 05-06-2018 09:57:08
732 Tweets
1.1K Followers
374 Following
LLMs act sub-optimally in decisions due to greediness, frequency bias, and a knowing-doing gap. A classic Google DeepMind paper. Shows why LLM agents make poor decisions and how reinforcement learning fine-tuning fixes a chunk of it. Tic-tac-toe win rate jumps 15% to 75% after
Wanna upgrade your agent game? With AI at Meta, we're releasing 2 incredibly cool artefacts: - GAIA 2: assistant evaluation with a twist (new: adaptability, robustness to failure & time sensitivity) - ARE, an agent research environment to empower all! huggingface.co/blog/gaia2