Colin Raffel (@colinraffel)'s Twitter Profile
Colin Raffel

@colinraffel

nonbayesian parameterics, sweet lessons, and random birds.
Friend of @srush_nlp

ID:837133583558987776

http://www.colinraffel.com
02-03-2017 02:52:54

1.5K Tweets

30.2K Followers

654 Following

Aran Komatsuzaki (@arankomatsuzaki)

🚀 Introducing Pile-T5!

🔗 We (EleutherAI) are thrilled to open-source our latest T5 model trained on 2T tokens from the Pile using the Llama tokenizer.

✨ Featuring intermediate checkpoints and a significant boost in benchmark performance.

Work done by Lintang Sutawika, me…

Adam Roberts (@ada_rob)

I love music most when it’s live, in the moment, and expressing something personal.

This is why I’m psyched about the new “DJ mode” we developed for MusicFX: aitestkitchen.withgoogle.com/tools/music-fx…

It’s an infinite AI jam that you control 🎛️. Try mixing your unique 🌀 of instruments, genres,…

Alon Albalak (@AlbalakAlon)

{UCSB|AI2|UW|Stanford|MIT|UofT|Vector|Contextual AI} present a survey on 🔎 Data Selection for LLMs 🔍

Training data is a closely guarded secret in industry 🤫. With this work we narrow the knowledge gap, advocating for open, responsible, collaborative progress.
arxiv.org/abs/2402.16827

Shachar Don-Yehiya (@Shachar_Don)

Crowd-sourcing human feedback for open-source LLMs? 💬🤖

Let's make it happen together! 💪

chromewebstore.google.com/detail/sharelm…

With ♻️ Leshem Choshen ♻️ and Omri Abend

Colin Raffel (@colinraffel)

I'll be at NeurIPS supporting my collaborators who are presenting arxiv.org/abs/2306.01708, arxiv.org/abs/2305.16264, arxiv.org/abs/2302.00674, and neurips.cc/virtual/2023/p…. Find me to chat about decentralizing/democratizing/de-risking ML!

Derek Tam (@dtredsox13)

New preprint! Introducing MaTS, a new framework for merging individual task models into a multitask model by matching them in their task subspace.

Work done w/ Mohit Bansal and Colin Raffel

📄 arxiv.org/abs/2312.04339
💾 github.com/r-three/mats
🧵 ⬇️

Colin Raffel (@colinraffel)

New blog post where I argue that 'large language model development' can be considered a new subfield that grew out of deep learning, NLP, etc., and reflect on what to do when your field of study gives birth to a new one: craffel.github.io/blog/language-…

Colin Raffel (@colinraffel)

Also, I am 1000% hiring PhD students this round! If you want to work on
- open models
- collaborative/decentralized training
- building models like OSS
- coordinating model ecosystems
- mitigating risks
you should definitely apply! Deadline is Friday 😬
web.cs.toronto.edu/graduate/how-t…

Prateek Yadav (@prateeky2806)

Presenting ComPEFT 🗜!

We compress parameter updates to facilitate efficient communication of expert models for compositional generalization. ComPEFT improves performance 📈 while reducing storage/communication costs 📉

buff.ly/49Qaryo
With ♻️ Leshem Choshen ♻️, Colin Raffel, and Mohit Bansal
🧵

Haikang Deng (@HaikangDeng)

Introducing RAD, a cheap and efficient method for using an auxiliary reward model for controlling text generation that can match the performance of methods that update the LM.

📝 arxiv.org/abs/2310.09520
💾 github.com/haikangdeng/RAD
🧵 ⬇️

1/

Jaan Lı 李 PhD (e/des in CDMX) (@thejaan)

Looking for a full-time role in AI / large language models (research/eng potentially focused on health).

Know anyone to chat with?

Please RT, forward my CV (jaan.io/cv), DM, or connect me. I've built large language models and have 1000+ citations (NeurIPS, ICML, AISTATS).

Derek Tam (@dtredsox13)

Our work on Data Augmentation for Learning from Limited Data has been accepted to TACL! We are presenting it at ACL 2023 on Wed 11:00-12:30 in Session 7.
Paper: transacl.org/index.php/tacl…
Poster + Video: virtual2023.aclweb.org/paper_T4291.ht…

With Jiaao Chen, Colin Raffel, Mohit Bansal, and Diyi Yang

Brian Lester (@blester125)

We just pushed a new Git-Theta update adding support for the (very impressive) safetensors library from our friends at Hugging Face!

Git-Theta's plug-in system meant that we spent more time waiting on CI/CD than actually adding support (I'll get off my soapbox now 🧼📦).
