Colin Raffel (@colinraffel) Twitter Tweets • TwiCopy

Colin Raffel

@colinraffel

nonbayesian parameterics, sweet lessons, and random birds.
Friend of @srush_nlp

ID:837133583558987776

http://www.colinraffel.com
02-03-2017 02:52:54

1.5K Tweets

30.2K Followers

654 Following

Aran Komatsuzaki

@arankomatsuzaki

1 week ago

🚀 Introducing Pile-T5!

🔗 We (EleutherAI) are thrilled to open-source our latest T5 model trained on 2T tokens from the Pile using the Llama tokenizer.

✨ Featuring intermediate checkpoints and a significant boost in benchmark performance.

Work done by Lintang Sutawika, me…

547 likes

Adam Roberts

1 month ago

I love music most when it’s live, in the moment, and expressing something personal.

This is why I’m psyched about the new “DJ mode” we developed for MusicFX: aitestkitchen.withgoogle.com/tools/music-fx…

It’s an infinite AI jam that you control 🎛️. Try mixing your unique 🌀 of instruments, genres,…

443 likes

Alon Albalak

1 month ago

{UCSB|AI2|UW|Stanford|MIT|UofT|Vector|Contextual AI} present a survey on🔎Data Selection for LLMs🔍

Training data is a closely guarded secret in industry🤫with this work we narrow the knowledge gap, advocating for open, responsible, collaborative progress
arxiv.org/abs/2402.16827

304 likes

Shachar Don-Yehiya

3 months ago

Crowd-sourcing human feedback for open-source LLMs? 💬🤖

Let's make it happen together! 💪

chromewebstore.google.com/detail/sharelm…

W ♻️ Leshem Choshen ♻️ Omri Abend

86 likes

Adam Roberts

4 months ago

T5 Reunion!

(Noam Shazeer was replaced by a sentinel token)

255 likes

Colin Raffel

4 months ago

I'll be at #NeurIPS2023 supporting my collaborators who are presenting arxiv.org/abs/2306.01708, arxiv.org/abs/2305.16264, arxiv.org/abs/2302.00674, and neurips.cc/virtual/2023/p…. Find me to chat about decentralizing/democratizing/de-risking ML!

147 likes

Derek Tam

4 months ago

New preprint! Introducing MaTS - a new framework for merging individual task models into a multitask model by matching them in their task subspace

Work done w/ Mohit Bansal Colin Raffel

📄 arxiv.org/abs/2312.04339
💾 github.com/r-three/mats
🧵 ⬇️

65 likes

Colin Raffel

4 months ago

New blog post where I argue that 'large language model development' can be considered a new subfield that grew out of deep learning, NLP, etc. and reflect on what to do when your field of study gives birth to a new one: craffel.github.io/blog/language-…

502 likes

Colin Raffel

4 months ago

Also, I am 1000% hiring PhD students this round! If you want to work on
- open models
- collaborative/decentralized training
- building models like OSS
- coordinating model ecosystems
- mitigating risks
you should definitely apply! Deadline is Friday 😬
web.cs.toronto.edu/graduate/how-t…

458 likes

Prateek Yadav

5 months ago

Presenting ComPEFT 🗜!

We compress parameter updates to facilitate efficient communication of expert models for compositional generalization. ComPEFT improves perf. 📈, while reducing storage/communication costs 📉

buff.ly/49Qaryo
♻️ Leshem Choshen ♻️ Colin Raffel Mohit Bansal
🧵

229 likes

Haikang Deng

6 months ago

Introducing RAD, a cheap and efficient method for using an auxiliary reward model for controlling text generation that can match the performance of methods that update the LM.

📝arxiv.org/abs/2310.09520
💾github.com/haikangdeng/RAD
🧵⬇️

1/

160 likes

Jaan Lı 李 PhD (e/des in CDMX)

7 months ago

Looking for a full-time role in AI / large language models (research/eng potentially focused on health).

Know anyone to chat with?

Please RT/forward my CV (jaan.io/cv)/DM/connect me. I've built large language models, have 1000+ citations (NeurIPS, ICML, AISTATS).

188 likes

Alec Jacobson

7 months ago

The 100 Most Influential People Named Al

49 likes

Derek Tam

9 months ago

Our work on Data Augmentation for Learning from Limited Data has been accepted to #TACL ! We are presenting it at #ACL2023 on Wed 11:00-12:30 in Session 7.
Paper: transacl.org/index.php/tacl…
Poster + Video: virtual2023.aclweb.org/paper_T4291.ht…

Jiaao Chen Colin Raffel Mohit Bansal Diyi Yang

36 likes

Brian Lester

10 months ago

We just pushed a new update adding support for the (very impressive) safetensors library from our friends at Hugging Face!

Git-Theta's plug-in system meant that we spent more time waiting on CI/CD than actually adding support (I'll get off my soapbox now 🧼📦).

21 likes
