Adam Ibrahim (@ai_phd) 's Twitter Profile
Adam Ibrahim

@ai_phd

ID: 1140745246877372421

Link: https://www.adamibrahim.fr/ · Joined: 17-06-2019 22:17:15

73 Tweets

534 Followers

430 Following

Mats L. Richter @ ICLR2024 (@m_l_richter) 's Twitter Profile Photo

Rarely been so excited about a paper. Our model has a quality level higher than Stable Diffusion 2.1 at a fraction (less than 12%) of the training cost, less than 20% of the carbon footprint, and it is twice as fast at inference too! That's what I call a leap forward.

Reyhane Askari (@reyhaneaskari) 's Twitter Profile Photo

(1/8) The great success of diffusion models such as Stable Diffusion, DALLE & Emu has raised questions about the use of synthetic data for classification. Our work, "Feedback-guided Data Synthesis for Imbalanced Classification," addresses this question: arxiv.org/abs/2310.00158
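
To make the "feedback-guided" idea concrete, here is a minimal sketch of one plausible reading of the title: generate synthetic candidates for an under-represented class with a diffusion model, then use the current classifier's feedback to decide which samples to keep. The `generator` and `classifier` callables and the loss-based selection rule are illustrative assumptions, not the paper's actual method; see arxiv.org/abs/2310.00158 for the real procedure.

```python
# Illustrative sketch only: `generator` and `classifier` are hypothetical
# placeholders, not the paper's API. The feedback signal here (per-sample
# loss under the current classifier) is one simple choice among many.
import torch
import torch.nn.functional as F

def select_synthetic(generator, classifier, minority_class: int,
                     n_candidates: int = 64, n_keep: int = 16) -> torch.Tensor:
    """Generate candidates for a minority class and keep the ones the
    current classifier finds hardest (highest loss)."""
    with torch.no_grad():
        images = generator(minority_class, n_candidates)   # (N, C, H, W)
        logits = classifier(images)                        # (N, num_classes)
        labels = torch.full((n_candidates,), minority_class)
        losses = F.cross_entropy(logits, labels, reduction="none")
    hardest = losses.topk(n_keep).indices
    return images[hardest]
```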

Irina Rish (@irinarish) 's Twitter Profile Photo

@PranshuRanjan1 Sarvam AI The Hi-NOLIN Hindi model will be presented by our Nolano.ai team (Tejas Vaidhya, Ayush Kaushal) and collaborators from our CERC-AAI team (Kshitij Gupta, Benjamin Thérien, Adam Ibrahim) at #NeurIPS2023 this Friday, at this workshop: sites.google.com/mila.quebec/6t…

Adam Ibrahim (@ai_phd) 's Twitter Profile Photo

Looking forward to seeing you at the #NeurIPS2023 #NeurIPS23 ENLSP workshop (rooms 206-207), where we'll have a poster about this work at 16:15!

Quentin Anthony (@quentinanthon15) 's Twitter Profile Photo

State-space models (SSMs) like Mamba and mixture-of-experts (MoE) models like Mixtral both seek to reduce the computational cost to train/infer compared to transformers, while maintaining generation quality.

Learn more in our paper: zyphra.com/blackmamba
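
A minimal sketch of the top-1 expert routing that gives MoE layers their compute savings: each token is processed by a single expert, so per-token FLOPs stay roughly flat as the number of experts (and total parameters) grows. This is a generic illustration, not BlackMamba's or Mixtral's actual implementation (Mixtral, for instance, routes each token to the top 2 experts).

```python
# Generic top-1 mixture-of-experts layer; an illustration of the routing
# mechanism, not code from the BlackMamba paper.
import torch
import torch.nn as nn

class Top1MoE(nn.Module):
    def __init__(self, d_model: int, d_ff: int, n_experts: int):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(),
                          nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, d_model). Each token goes through ONE expert, so
        # per-token compute does not grow with n_experts.
        probs = self.router(x).softmax(dim=-1)   # (tokens, n_experts)
        weight, idx = probs.max(dim=-1)          # top-1 routing decision
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            mask = idx == e
            if mask.any():
                out[mask] = weight[mask].unsqueeze(-1) * expert(x[mask])
        return out

# Usage: moe = Top1MoE(d_model=64, d_ff=256, n_experts=8); y = moe(torch.randn(10, 64))
```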
Aran Komatsuzaki (@arankomatsuzaki) 's Twitter Profile Photo

Mila presents Simple and Scalable Strategies to Continually Pre-train Large Language Models

Shows efficient updates to LLMs using simple strategies, matching the results of full re-training with far less compute

arxiv.org/abs/2403.08763
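
A hedged sketch of the kind of simple strategies the paper studies: re-warming and re-decaying the learning rate when continual pre-training begins on new data, plus replaying a fraction of the previous data to mitigate forgetting. The schedule shape and replay fraction below are illustrative choices, not the paper's exact settings.

```python
# Illustrative continual-pretraining utilities; hyperparameters are
# placeholders, not the paper's settings.
import math
import random

def rewarmed_cosine_lr(step: int, warmup_steps: int, total_steps: int,
                       max_lr: float, min_lr: float) -> float:
    """Linearly re-warm from min_lr to max_lr, then cosine re-decay."""
    if step < warmup_steps:
        return min_lr + (max_lr - min_lr) * step / warmup_steps
    t = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return min_lr + 0.5 * (max_lr - min_lr) * (1 + math.cos(math.pi * t))

def sample_batch(new_data, old_data, replay_fraction: float = 0.05):
    """Mix in a small fraction of the previous dataset ("replay") to
    counter the distribution shift toward the new data."""
    source = old_data if random.random() < replay_fraction else new_data
    return random.choice(source)
```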
AK (@_akhaliq) 's Twitter Profile Photo

Simple and Scalable Strategies to Continually Pre-train Large Language Models

Large language models (LLMs) are routinely pre-trained on billions of tokens, only to start the process over again once new data becomes available. A much more efficient solution is to continually pre-train these models, saving significant compute compared to re-training.
Adam Ibrahim (@ai_phd) 's Twitter Profile Photo

Here is the full paper of the continual pretraining project I worked on last year. I encourage you to check it out if you pretrain LLMs (in particular, I recommend starting with the takeaways in Section 2 and the Table of Contents at the start of the appendix).

Rylan Schaeffer (@rylanschaeffer) 's Twitter Profile Photo

❤️‍🔥❤️‍🔥Excited to share our new paper ❤️‍🔥❤️‍🔥

**Why Has Predicting Downstream Capabilities of Frontier AI Models with Scale Remained Elusive?**

w/ Hailey Schoelkopf, Brando Miranda, Gabriel Mukobi, Varun Madan, Herbie Bradley, Adam Ibrahim, Stella Biderman, Sanmi Koyejo

arxiv.org/abs/2406.04391

1/N
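
One way to picture the question: downstream multiple-choice metrics are derived from per-token log-likelihoods through a chain of transformations (normalization over answer choices, then an argmax), and each step can weaken the statistical link to scale. The sketch below is an illustrative reconstruction of that chain, not code from the paper.

```python
# Illustrative only: computes a smooth metric (probability mass on the
# correct choice) and a hard metric (accuracy) from the same log-likelihoods.
import numpy as np

def multiple_choice_metrics(logprobs_correct, logprobs_incorrect):
    """logprobs_correct: (n_questions,); logprobs_incorrect: (n_questions, k)."""
    lp_all = np.concatenate([logprobs_correct[:, None], logprobs_incorrect], axis=1)
    probs = np.exp(lp_all - lp_all.max(axis=1, keepdims=True))
    probs /= probs.sum(axis=1, keepdims=True)
    prob_mass_correct = probs[:, 0].mean()          # smooth transformation
    accuracy = (lp_all.argmax(axis=1) == 0).mean()  # hard argmax on top
    return prob_mass_correct, accuracy
```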
Rylan Schaeffer (@rylanschaeffer) 's Twitter Profile Photo

Excited to announce our paper ⬇️ was selected as an **Outstanding** paper at the TiFA Workshop at ICML 🔥🔥🔥 What did the paper show? Let's try to summarize it in a single tweet!! 1/3