emily mcmilin (@micmylin) 's Twitter Profile
emily mcmilin

@micmylin

RL and world models for coding at FAIR

ID: 18296286

Link: https://www.linkedin.com/in/emilymcmilin/ | Date: 22-12-2008 00:22:05

143 Tweets

648 Followers

600 Following

emily mcmilin (@micmylin) 's Twitter Profile Photo

Now accepted at #AAAI24. I started single author, indie research as a ~jr SWE, recently transitioned from hardware engineering. Learned so much along the way, with help from MLC's Rosanne Liu & Jason Yosinski, Cohere4AI's Jen Iofinova & Sara Hooker, HF's Sasha Luccioni, PhD 🦋🌎✨🤗 and anon peers. Grateful.

emily mcmilin (@micmylin) 's Twitter Profile Photo

Thanks to all who stopped by my poster last night @RealAAAI. If you are interested in talking more about causality and LLMs here at #AAAI24 or beyond, please reach out!

Udacity (@udacity) 's Twitter Profile Photo

💡 Interested in learning more about LLM fundamentals? In the video below, Udacity instructor Emily McMilin explains what the Transformer model is & walks you through the difference between Encoder and Decoder model architectures. bit.ly/44f0eJn #genAI #generativeAI

emily mcmilin (@micmylin) 's Twitter Profile Photo

Our research showing how task underspecification can cause spurious correlations & hallucinations, from BERT to GPT-3.5 is now available as AAAI 24 proceedings: ojs.aaai.org/index.php/AAAI… Video: underline.io/lecture/92119-… Arxiv extended to GPT-4 Turbo Preview: arxiv.org/abs/2210.00131

Yuxiang Wei (@yuxiangwei9) 's Twitter Profile Photo

Software agents can self-improve via self-play RL Introducing Self-play SWE-RL (SSR): training a single LLM agent to self-play between bug-injection and bug-repair, grounded in real-world repositories, no human-labeled issues or tests. 🧵
