Muqeeth
@muqeeth10
Researching strategic behaviors of LLMs in social dilemmas. Grad Student @Mila_Quebec, Former RE @MITIBMLab | MS @unccs | RA @iitdelhi | BTech @iitmadras
ID: 864049704924884992
http://muqeeth.github.io 15-05-2017 09:27:57
19 Tweets
140 Followers
383 Following
Zero rewards after tons of RL training? 😞 Before using dense rewards or incentivizing exploration, try changing the data. Adding easier instances of the task can unlock RL training. 🔓📈 To learn more, check out our blog post here: spiffy-airbus-472.notion.site/What-Can-You-D…. Keep reading 🧵 (1/n)