Shoaib Ahmed Siddiqui (@shoaibasiddiqui) 's Twitter Profile
Shoaib Ahmed Siddiqui

@shoaibasiddiqui

PhD student @CambridgeMLG | Ex-intern @MSR @NVIDIA @DFKI | Primarily interested in SSL, LLMs, data auditing, and empirical theory of deep learning

ID: 3124646343

Link: http://shoaibahmed.github.io · Joined: 28-03-2015 20:01:10

134 Tweets

690 Followers

4.4K Following

Pavlo Molchanov (@pavlomolchanov) 's Twitter Profile Photo

🚀 Exciting findings from our recent work on depth pruning in LLMs! 1️⃣ MMLU isn't fully indicative; reasoning tasks like GSM8k drop sharply. 2️⃣ Attention layers can be pruned more than MLP layers with less impact. 3️⃣ Output target loss compression (Shapley metric) performs best.
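The layer-importance idea behind depth pruning can be illustrated with a toy sketch: score each layer by how much the model's output shifts when that layer is skipped, then drop the lowest-impact layers. The layer functions and the absolute-difference score below are illustrative stand-ins, not the paper's actual Shapley-based metric.

```python
def run(layers, x, skip=None):
    """Apply each layer in sequence, optionally skipping one index."""
    for i, layer in enumerate(layers):
        if i != skip:
            x = layer(x)
    return x

def layer_importance(layers, x):
    """Score each layer by the output change caused by removing it."""
    baseline = run(layers, x)
    return [abs(run(layers, x, skip=i) - baseline) for i in range(len(layers))]

# Toy "model": three scalar layers; the middle one barely changes the input.
layers = [lambda v: v + 1.0, lambda v: v * 1.001, lambda v: v + 5.0]
scores = layer_importance(layers, x=1.0)

# The lowest-scoring layer is the cheapest candidate to prune.
least_important = min(range(len(scores)), key=scores.__getitem__)
```

In a real LLM the "layers" would be transformer blocks and the score would be measured on a calibration loss over held-out tokens, which is where the tweet's observation comes in: MMLU alone can miss the sharp drops that pruning causes on reasoning tasks like GSM8k.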

Pavlo Molchanov (@pavlomolchanov) 's Twitter Profile Photo

🚀 We've pruned LLaMa3.1 down to 4B parameters, delivering a smaller and more efficient model! Based on our recent paper: arxiv.org/abs/2407.14679

📖 Learn all about it in our blog: developer.nvidia.com/blog/how-to-pr… 
🔗 META's announcement: ai.meta.com/blog/nvidia-ll…
👐 Checkpoints at HF this
Pavlo Molchanov (@pavlomolchanov) 's Twitter Profile Photo

🌟 The best 8B Base model via pruning and distillation!

🚀 Introducing Mistral-NeMo-Minitron-8B-Base model we derived from the recent Mistral-NeMo-12B.
Our recipe: finetune teacher on 100B tokens, prune to 8B params, run teacher-student distillation on <400B tokens.
Result: the
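The distillation step in the recipe above is typically a KL-divergence loss between temperature-softened teacher and student output distributions. Here is a minimal self-contained sketch of that standard loss; the temperature value and scalar logits are illustrative, not the actual Minitron training configuration.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; higher T produces a softer distribution."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) over softened distributions, scaled by T^2
    as in standard knowledge distillation (Hinton et al.)."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return (temperature ** 2) * kl

# A student matching the teacher incurs zero loss; a divergent one does not.
loss_match = distillation_loss([2.0, 1.0, 0.1], [2.0, 1.0, 0.1])
loss_diverge = distillation_loss([2.0, 1.0, 0.1], [0.1, 1.0, 2.0])
```

In practice this loss is computed per token over the vocabulary and often mixed with the ordinary cross-entropy on ground-truth tokens.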
Kamyar Azizzadenesheli (@azizzadenesheli) 's Twitter Profile Photo

AI+Weather/Climate

A thorough study of problem design in AI+Weather/Climate.

As the field is new, there is an urgent need to establish the importance of its design components: what matters, by how much, and at what cost. That is the question our study aims to address.

We study
Cambridge MLG (@cambridgemlg) 's Twitter Profile Photo

✨Applications are now open for PhDs at the Cambridge Machine Learning Group!✨ We're looking for outstanding candidates interested in fundamental ML research and applications to scientific domains! More info: mlg.eng.cam.ac.uk/phd_programme_… 🧵Find more about PIs & focus areas below!

Ali Shahin Shamsabadi (@alishahinshams1) 's Twitter Profile Photo

Intern position at Brave: brave.com/careers/ My team is looking for strong students interested in private, secure, and trustworthy ML. Feel free to email me with the subject line "Brave Internship 2025", highlighting your 3 most significant publications on these topics.

Max Nadeau (@maxnadeau_) 's Twitter Profile Photo

🧵 Announcing Open Philanthropy's Technical AI Safety RFP! We're seeking proposals across 21 research areas to help make AI systems more trustworthy, rule-following, and aligned, even as they become more capable.

David Krueger (@davidskrueger) 's Twitter Profile Photo

POSTER 1) Protecting against simultaneous data poisoning attacks. Neel Alex, Shoaib Ahmed Siddiqui, et al.

We introduce a more realistic setting: training data is poisoned in multiple ways.
Existing methods fail, but our defense based on training dynamics works
arxiv.org/abs/2408.13221