Adam Ibrahim (@ai_phd) 's Twitter Profile
Adam Ibrahim

@ai_phd

ID: 1140745246877372421

Link: https://www.adamibrahim.fr/ · Joined: 17-06-2019 22:17:15

73 Tweets

534 Followers

430 Following

Mats L. Richter @ ICLR2024 (@m_l_richter) 's Twitter Profile Photo

Rarely been so excited about a paper. Our model has a quality level higher than Stable Diffusion 2.1 at a fraction (less than 12%) of the training cost, less than 20% of the carbon footprint, and it is twice as fast at inference too! That's what I call a leap forward.

Reyhane Askari (@reyhaneaskari) 's Twitter Profile Photo

(1/8) The great success of diffusion models such as Stable Diffusion, DALLE & Emu has raised questions about the use of synthetic data for classification. Our work, "Feedback-guided Data Synthesis for Imbalanced Classification," addresses this question: arxiv.org/abs/2310.00158
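
To make the "feedback-guided" idea concrete, here is a minimal sketch of one plausible reading of the title: generate synthetic candidates for an under-represented class with a diffusion model, then use the current classifier's feedback to decide which samples to keep. The `generator` and `classifier` callables and the loss-based selection rule are illustrative assumptions, not the paper's actual method; see arxiv.org/abs/2310.00158 for the real procedure.

```python
# Illustrative sketch only: `generator` and `classifier` are hypothetical
# placeholders, not the paper's API. The feedback signal here (per-sample
# loss under the current classifier) is one simple choice among many.
import torch
import torch.nn.functional as F

def select_synthetic(generator, classifier, minority_class: int,
                     n_candidates: int = 64, n_keep: int = 16) -> torch.Tensor:
    """Generate candidates for a minority class and keep the ones the
    current classifier finds hardest (highest loss)."""
    with torch.no_grad():
        images = generator(minority_class, n_candidates)   # (N, C, H, W)
        logits = classifier(images)                        # (N, num_classes)
        labels = torch.full((n_candidates,), minority_class)
        losses = F.cross_entropy(logits, labels, reduction="none")
    hardest = losses.topk(n_keep).indices
    return images[hardest]
```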

Irina Rish (@irinarish) 's Twitter Profile Photo

@PranshuRanjan1 Sarvam AI The Hi-NOLIN Hindi model will be presented by our Nolano.ai team (Tejas Vaidhya, Ayush Kaushal) and collaborators from our CERC-AAI team (Kshitij Gupta, Benjamin Thérien, Adam Ibrahim) at #NeurIPS2023 this Friday, at this workshop: sites.google.com/mila.quebec/6t…

Adam Ibrahim (@ai_phd) 's Twitter Profile Photo

Looking forward to seeing you at the #NeurIPS2023 #NeurIPS23 ENLSP workshop (rooms 206-207), where we'll have a poster about this work at 16:15!

Quentin Anthony (@quentinanthon15) 's Twitter Profile Photo

State-space models (SSMs) like Mamba and mixture-of-experts (MoE) models like Mixtral both seek to reduce the computational cost to train/infer compared to transformers, while maintaining generation quality.

Learn more in our paper: zyphra.com/blackmamba
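
A minimal sketch of the top-1 expert routing that gives MoE layers their compute savings: each token is processed by a single expert, so per-token FLOPs stay roughly flat as the number of experts (and total parameters) grows. This is a generic illustration, not BlackMamba's or Mixtral's actual implementation (Mixtral, for instance, routes each token to the top 2 experts).

```python
# Generic top-1 mixture-of-experts layer; an illustration of the routing
# mechanism, not code from the BlackMamba paper.
import torch
import torch.nn as nn

class Top1MoE(nn.Module):
    def __init__(self, d_model: int, d_ff: int, n_experts: int):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(),
                          nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, d_model). Each token goes through ONE expert, so
        # per-token compute does not grow with n_experts.
        probs = self.router(x).softmax(dim=-1)   # (tokens, n_experts)
        weight, idx = probs.max(dim=-1)          # top-1 routing decision
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            mask = idx == e
            if mask.any():
                out[mask] = weight[mask].unsqueeze(-1) * expert(x[mask])
        return out

# Usage: moe = Top1MoE(d_model=64, d_ff=256, n_experts=8); y = moe(torch.randn(10, 64))
```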
Aran Komatsuzaki (@arankomatsuzaki) 's Twitter Profile Photo

Mila presents Simple and Scalable Strategies to Continually Pre-train Large Language Models

Shows efficient updates to LLMs using simple strategies, matching the results of full re-training with far less compute

arxiv.org/abs/2403.08763
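
A hedged sketch of the kind of simple strategies the paper studies: re-warming and re-decaying the learning rate when continual pre-training begins on new data, plus replaying a fraction of the previous data to mitigate forgetting. The schedule shape and replay fraction below are illustrative choices, not the paper's exact settings.

```python
# Illustrative continual-pretraining utilities; hyperparameters are
# placeholders, not the paper's settings.
import math
import random

def rewarmed_cosine_lr(step: int, warmup_steps: int, total_steps: int,
                       max_lr: float, min_lr: float) -> float:
    """Linearly re-warm from min_lr to max_lr, then cosine re-decay."""
    if step < warmup_steps:
        return min_lr + (max_lr - min_lr) * step / warmup_steps
    t = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return min_lr + 0.5 * (max_lr - min_lr) * (1 + math.cos(math.pi * t))

def sample_batch(new_data, old_data, replay_fraction: float = 0.05):
    """Mix in a small fraction of the previous dataset ("replay") to
    counter the distribution shift toward the new data."""
    source = old_data if random.random() < replay_fraction else new_data
    return random.choice(source)
```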
AK (@_akhaliq) 's Twitter Profile Photo

Simple and Scalable Strategies to Continually Pre-train Large Language Models

Large language models (LLMs) are routinely pre-trained on billions of tokens, only to start the process over again once new data becomes available. A much more efficient solution is to continually pre-train these models, saving significant compute compared to re-training.
Adam Ibrahim (@ai_phd) 's Twitter Profile Photo

Here is the full paper of the continual pretraining project I worked on last year. I encourage you to check it out if you pretrain LLMs (in particular, I recommend starting with the takeaways in Section 2 and the Table of Contents at the start of the appendix).

Rylan Schaeffer (@rylanschaeffer) 's Twitter Profile Photo

❤️‍🔥❤️‍🔥Excited to share our new paper ❤️‍🔥❤️‍🔥

**Why Has Predicting Downstream Capabilities of Frontier AI Models with Scale Remained Elusive?**

w/ Hailey Schoelkopf, Brando Miranda, Gabriel Mukobi, Varun Madan, Herbie Bradley, Adam Ibrahim, Stella Biderman, Sanmi Koyejo

arxiv.org/abs/2406.04391

1/N
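
One way to picture the question: downstream multiple-choice metrics are derived from per-token log-likelihoods through a chain of transformations (normalization over answer choices, then an argmax), and each step can weaken the statistical link to scale. The sketch below is an illustrative reconstruction of that chain, not code from the paper.

```python
# Illustrative only: computes a smooth metric (probability mass on the
# correct choice) and a hard metric (accuracy) from the same log-likelihoods.
import numpy as np

def multiple_choice_metrics(logprobs_correct, logprobs_incorrect):
    """logprobs_correct: (n_questions,); logprobs_incorrect: (n_questions, k)."""
    lp_all = np.concatenate([logprobs_correct[:, None], logprobs_incorrect], axis=1)
    probs = np.exp(lp_all - lp_all.max(axis=1, keepdims=True))
    probs /= probs.sum(axis=1, keepdims=True)
    prob_mass_correct = probs[:, 0].mean()          # smooth transformation
    accuracy = (lp_all.argmax(axis=1) == 0).mean()  # hard argmax on top
    return prob_mass_correct, accuracy
```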
Rylan Schaeffer (@rylanschaeffer) 's Twitter Profile Photo

Excited to announce our paper ⬇️ was selected as an **Outstanding** paper at the TiFA Workshop at ICML 🔥🔥🔥 What did the paper show? Let's try to summarize it in a single tweet!! 1/3