Shruti Joshi (@_shruti_joshi_) 's Twitter Profile
Shruti Joshi

@_shruti_joshi_

phd student in identifiable repl @Mila_Quebec. prev. research programmer @MPI_IS Tübingen, undergrad @IITKanpur '19.

ID: 1026105870818529281

Link: https://shrutij01.github.io/ · Joined: 05-08-2018 14:01:18

176 Tweets

375 Followers

817 Following

Sébastien Lachapelle (@seblachap) 's Twitter Profile Photo

1/ Excited for our oral presentation at #NeurIPS2023 on "Additive Decoders for Latent Variables Identification and Cartesian-Product Extrapolation"! A theoretical paper about object-centric representation learning (OCRL), disentanglement & extrapolation arxiv.org/abs/2307.02598

Arkil Patel (@arkil_patel) 's Twitter Profile Photo

Presenting tomorrow at #EMNLP2023: MAGNIFICo: Evaluating the In-Context Learning Ability of Large Language Models to Generalize to Novel Interpretations, w/ amazing advisors and collaborators 🇺🇦 Dzmitry Bahdanau, Siva Reddy, and Satwik Bhattamishra

Nicholas Meade (@ncmeade) 's Twitter Profile Photo

Adversarial Triggers For LLMs Are NOT Universal! 😲 It is believed that adversarial triggers that jailbreak a model transfer universally to other models. But we show triggers don't reliably transfer, especially to RLHF/DPO models. Paper: arxiv.org/abs/2404.16020

Arkil Patel (@arkil_patel) 's Twitter Profile Photo

📢 Exciting new work on AI safety! Do adversarial triggers transfer universally across models (as has been claimed)? No. Are models aligned by supervised fine-tuning safe against adversarial triggers? No. RLHF and DPO are far better!

Arkil Patel (@arkil_patel) 's Twitter Profile Photo

Presenting tomorrow at #NAACL2024: Can LLMs in-context learn to use new programming libraries and languages? Yes. Kind of. Internship work at Ai2 with Pradeep Dasigi and my advisors 🇺🇦 Dzmitry Bahdanau and Siva Reddy.

Leena C Vankadara (@leenacvankadara) 's Twitter Profile Photo

I am thrilled to announce that I will be joining the Gatsby Computational Neuroscience Unit at UCL as a Lecturer (Assistant Professor) in Feb 2025! Looking forward to working with the exceptional talent at Gatsby Computational Neuroscience Unit on cutting-edge problems in deep learning and causality.

Tom Marty (@tom__marty) 's Twitter Profile Photo

🚨 NEW PAPER OUT 🚨 Excited to share our latest research on in-context learning and meta-learning through the lens of information theory! 🧠 🔗 arxiv.org/abs/2410.14086 Check out our insights and empirical experiments! 🔍

Sahil Verma (@sahil1v) 's Twitter Profile Photo

📣 📣 📣 Our new paper investigates the question of how many images 🖼️ of a concept are required by a diffusion model 🤖 to imitate it. This question is critical for understanding and mitigating the copyright and privacy infringements of these models! arxiv.org/abs/2410.15002

Arkil Patel (@arkil_patel) 's Twitter Profile Photo

Presenting ✨ CHASE: Generating challenging synthetic data for evaluation ✨ Work w/ fantastic advisors 🇺🇦 Dzmitry Bahdanau and Siva Reddy. Thread 🧵:

Arkil Patel (@arkil_patel) 's Twitter Profile Photo

๐“๐ก๐จ๐ฎ๐ ๐ก๐ญ๐จ๐ฅ๐จ๐ ๐ฒ paper is out! ๐Ÿ”ฅ๐Ÿ‹ We study the reasoning chains of DeepSeek-R1 across a variety of tasks and settings and find several surprising and interesting phenomena! Incredible effort by the entire team! ๐ŸŒ: mcgill-nlp.github.io/thoughtology/

๐“๐ก๐จ๐ฎ๐ ๐ก๐ญ๐จ๐ฅ๐จ๐ ๐ฒ paper is out! ๐Ÿ”ฅ๐Ÿ‹

We study the reasoning chains of DeepSeek-R1 across a variety of tasks and settings and find several surprising and interesting phenomena!

Incredible effort by the entire team!

๐ŸŒ: mcgill-nlp.github.io/thoughtology/
Sahil Verma (@sahil1v) 's Twitter Profile Photo

🚨 New Paper! 🚨 Guard models slow, language-specific, and modality-limited? Meet OmniGuard, which detects harmful prompts across multiple languages & modalities with a single approach, achieving SOTA performance in all 3 modalities while being 120X faster 🚀 arxiv.org/abs/2505.23856
