Kyle Lo @ ICLR 2024
@kylelostat
#nlproc #hci Leading Data Research for OLMo @allen_ai, he/him, https://t.co/5Hm9cx3mC1
ID:1080639531429183488
http://kyleclo.com 03-01-2019 01:38:36
411 Tweets
2.1K Followers
1.1K Following
🌟Several dataset releases deserve a mention for their incredible data measurement work 🌟
➡️ The Pile (arxiv.org/abs/2101.00027) Leo Gao Stella Biderman
➡️ ROOTS (arxiv.org/abs/2303.03915) Hugo Laurençon++
➡️ Dolma (arxiv.org/abs/2402.00159) Luca Soldaini 🎀 Kyle Lo
14/
truly cursed timeline 😵‍💫
ACLRollingReview reviews due Mar 20
Conference on Language Modeling abstract deadline Mar 22
ACLRollingReview reviews released Mar 26
Conference on Language Modeling submission deadline Mar 29
ACLRollingReview rebuttal period closes Mar 30
can LMs help us write expository answers to scientific research questions?
excited to share our work led by Fangyuan Xu. we recruited NLP folks to work with an LM to answer research questions and logged successes/failures in sustained interaction traces🦉
New Resource: Foundation Model Development Cheatsheet for best practices
We compiled 250+ resources & tools for:
🔭 sourcing data
🔍 documenting & audits
🌴 environmental impact
☢️ risks & harms eval
🌍 release & monitoring
With experts from EleutherAI, Allen Institute for AI,…
We just uploaded detailed Weights & Biases training logs for the OLMo 7B run: wandb.ai/ai2-llm/OLMo-7…
This is a cleaned-up version from the actual run, so the wall clock times don't make sense, but all the other information is there!
excited to share our contribution to open science of language models!
🐈‍⬛ all our data, weights, ckpts, code, etc
🐈 covers data curation, pretraining, adaptation, evaluation, etc
check out more deets in Luca Soldaini 🎀's thread, technical reports out on arXiv shortly 😆