Saachi Jain (@saachi_jain_) 's Twitter Profile
Saachi Jain

@saachi_jain_

Safety @ OpenAI

ID: 2889468091

Website: http://saachij.com/ · Joined: 04-11-2014 06:33:50

34 Tweets

738 Followers

395 Following

Aleksander Madry (@aleks_madry) 's Twitter Profile Photo

What's the right way to remove part of an image? We show that typical strategies distort model predictions and introduce bias when debugging models. Good news: leveraging ViTs enables a way to side-step this bias. Paper: arxiv.org/abs/2204.08945 Blog post: gradientscience.org/missingness
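
A rough sketch of the ViT-based idea (illustrative code, not the paper's implementation): instead of graying out or blurring the removed region, which feeds the model out-of-distribution pixels, the corresponding patch tokens are simply dropped before the transformer layers.

import torch

def drop_patch_tokens(tokens: torch.Tensor, keep_mask: torch.Tensor) -> torch.Tensor:
    # tokens: (num_patches, dim) patch embeddings; keep_mask: (num_patches,) bool.
    # Return only the tokens for patches the model should still see.
    return tokens[keep_mask]

# Toy example: a 14x14 patch grid (196 tokens) with a 4x4 block "removed".
num_patches, dim = 196, 768
tokens = torch.randn(num_patches, dim)
keep = torch.ones(num_patches, dtype=torch.bool)
keep.view(14, 14)[5:9, 5:9] = False       # the ablated image region
kept = drop_patch_tokens(tokens, keep)
print(kept.shape)                          # torch.Size([180, 768]); a ViT can attend over these directly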

Hadi Salman (@hadisalmanx) 's Twitter Profile Photo

If you are attending CVPR and would like to learn about our work on certified patch defenses, pass by our poster (#178) this Thursday 2:30-5pm CDT in Hall B2-C! Saachi Jain Eric Wong and I will be there!

Aleksander Madry (@aleks_madry) 's Twitter Profile Photo

What kinds of fish are hard for a model to classify? Our new method (arxiv.org/abs/2206.14754) automatically identifies + captions model error patterns. The key? Distill failure modes as directions in latent space. Saachi Jain Hannah Lawrence A. Moitra Blog: gradientscience.org/failure-direct…
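
A minimal sketch of the failure-direction idea with synthetic data (not the authors' code): embed examples in a shared image/text latent space, fit a linear SVM separating correctly from incorrectly classified examples, and treat the SVM's normal vector as the failure direction; the lowest-scoring examples along it form a candidate hard subpopulation, and comparing the direction to text embeddings yields a caption.

import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
dim, n = 512, 1000
embeddings = rng.normal(size=(n, dim))      # stand-in for CLIP-style image embeddings
is_correct = rng.random(n) < 0.8            # stand-in for whether the classifier got each example right

svm = LinearSVC(C=0.1).fit(embeddings, is_correct.astype(int))
failure_direction = svm.coef_[0] / np.linalg.norm(svm.coef_[0])

# Examples scoring lowest along the direction are candidate hard subpopulations;
# captioning comes from comparing the direction to text embeddings (not shown).
scores = embeddings @ failure_direction
hardest = np.argsort(scores)[:10]
print(hardest)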

Aleksander Madry (@aleks_madry) 's Twitter Profile Photo

Does transfer learning = free accuracy? W/ Hadi Salman Saachi Jain Andrew Ilyas Logan Engstrom Eric Wong we identify one potential drawback: *bias transfer*, where biases in pre-trained models can persist after fine-tuning arxiv.org/abs/2207.02842 gradientscience.org/bias-transfer
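
An illustrative way to probe for bias transfer (toy stand-ins, not the paper's protocol): plant a spurious cue the pretrained model relied on, then measure how often adding that cue flips the fine-tuned model's downstream predictions.

import numpy as np

def bias_sensitivity(predict, images, add_cue):
    # Fraction of examples whose predicted label flips when the cue is added.
    clean = predict(images)
    cued = predict(add_cue(images))
    return float(np.mean(clean != cued))

# Toy stand-ins: a "fine-tuned model" that still keys on the top-left pixel,
# i.e., the cue it inherited from pretraining.
rng = np.random.default_rng(0)
images = rng.normal(size=(200, 32, 32))
predict = lambda x: (x[:, 0, 0] > 0.5).astype(int)

def add_cue(x):
    x = x.copy()
    x[:, 0, 0] = 3.0              # plant the spurious cue
    return x

print(bias_sensitivity(predict, images, add_cue))  # a large flip rate => the pretraining bias persisted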

Aleksander Madry (@aleks_madry) 's Twitter Profile Photo

Does your pretrained model think planes are flying lawnmowers? W/ Saachi Jain Hadi Salman Alaa Khaddaj Eric Wong Sam Park we build a framework for pinpointing the impact of pretraining data on transfer learning. Paper: arxiv.org/abs/2207.05739 Blog: gradientscience.org/data-transfer/
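
A rough sketch of the counterfactual framing (synthetic numbers, not the paper's estimator): quantify a pretraining class's effect on transfer by comparing downstream accuracy across models pretrained with and without that class.

import numpy as np

rng = np.random.default_rng(0)
num_pretrain_classes, num_models = 1000, 50

# For each pretrained model: which classes its pretraining set included, and
# its downstream accuracy after fine-tuning (both synthetic here).
included = rng.random((num_models, num_pretrain_classes)) < 0.5
transfer_acc = rng.normal(0.8, 0.02, size=num_models)

# Influence of class j: mean accuracy of models that saw j minus those that didn't.
with_j = np.array([transfer_acc[included[:, j]].mean() for j in range(num_pretrain_classes)])
without_j = np.array([transfer_acc[~included[:, j]].mean() for j in range(num_pretrain_classes)])
influence = with_j - without_j
print(influence[:5])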

Saachi Jain (@saachi_jain_) 's Twitter Profile Photo

If you are attending ICML, Dimitris Tsipras and I are presenting our paper Combining Diverse Feature Priors (arxiv.org/abs/2110.08220) on Tuesday. Come say hi! Talk: Tues. 7/19, 5:40 PM EDT Room 327-329 (DL Sequential Models) Poster: Tues. 7/19, 6:30-8:30 PM EDT Hall E, Poster #508

Andrew Ilyas (@andrew_ilyas) 's Twitter Profile Photo

Come hear about work on datamodels (arxiv.org/abs/2202.00622) at ICML *tomorrow* in the Deep Learning/Optimization track (Rm 309)! The presentation is at 4:50 with a poster session at 6:30. Joint work with Sam Park Logan Engstrom Guillaume Leclerc Aleksander Madry
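
For context, a minimal illustration of a datamodel with synthetic data (the paper fits sparse linear models at far larger scale): regress a test example's output, e.g. its correct-class margin, onto the 0/1 indicator of which training examples were in the training subset.

import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
num_train, num_subsets = 500, 2000

masks = (rng.random((num_subsets, num_train)) < 0.5).astype(float)        # which train points each subset used
true_influence = rng.normal(0, 0.05, size=num_train)                      # unknown per-example effect
margins = masks @ true_influence + rng.normal(0, 0.1, size=num_subsets)   # stand-in for measured margins

datamodel = Lasso(alpha=0.01).fit(masks, margins)
most_helpful = np.argsort(datamodel.coef_)[-5:]     # train points that raise this example's margin the most
print(most_helpful)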

Hannah Lawrence (@hlawrencecs) 's Twitter Profile Photo

Group CNNs are used for their explicit inductive bias towards symmetry, but what about the implicit bias from training? With Bobak Kiani Kristian Georgiev A. Dienes, we answer this question for linear group CNNs. Check out today's talk + Poster 520 at ICML! proceedings.mlr.press/v162/lawrence2…

Sarah Cen (@cen_sarah) 's Twitter Profile Photo

Got the chance to talk at the Simons Institute's AI & Humanity Workshop last week! Presented two ongoing works with Andrew Ilyas Aleksander Madry Manish Raghavan on building trust in AI & the governance of data-driven algorithms. Check out the video here youtube.com/watch?v=OpFY9D…

Aleksander Madry (@aleks_madry) 's Twitter Profile Photo

Stable diffusion can visualize + improve model failure modes! Leveraging our method, we can generate examples of hard subpopulations, which can then be used for targeted data augmentation to improve reliability. Blog: gradientscience.org/failure-direct… Saachi Jain Hannah Lawrence A.Moitra
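
A sketch of the augmentation step (assumes the Hugging Face diffusers library and a GPU; the model id and caption below are illustrative, not from the paper): synthesize images matching a failure-mode caption and fold them back into training.

import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

failure_caption = "a photo of a fish held by a person"  # e.g., a caption recovered for a hard subpopulation
augmentation_images = pipe([failure_caption] * 8, num_inference_steps=30).images

# These generated images can be labeled with the original class and appended to
# the training set before re-training, as targeted augmentation for the failure mode.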

Aleksander Madry (@aleks_madry) 's Twitter Profile Photo

Will your model identify a polar bear on the moon? How would you know? Dataset Interfaces let you generate images from your dataset under whatever distribution shift you desire! arxiv.org/abs/2302.07865 gradientscience.org/dataset-interf… W/ Josh Vendrow Saachi Jain Logan Engstrom
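
A rough sketch of querying such an interface (the per-class token learning via textual inversion is omitted; prompts and model id are illustrative): generate class images under a chosen shift and score your classifier on them.

import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

shift = "on the moon"
classes = ["polar bear", "airplane", "goldfish"]
shifted = {c: pipe(f"a photo of a {c} {shift}", num_images_per_prompt=4).images for c in classes}

# Running `shifted[c]` through your trained classifier estimates its accuracy
# under this distribution shift, without collecting real photos of the shift.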

Saachi Jain (@saachi_jain_) 's Twitter Profile Photo

If you are at #ICLR2023, Hannah Lawrence and I are presenting our work on distilling model failures as directions in latent space on *Wednesday*. Come say hi! Talk: 10:30, AD12 Poster: 11:30-1:30 # 59 arxiv.org/abs/2206.14754

Saachi Jain (@saachi_jain_) 's Twitter Profile Photo

Excited to be in Vancouver for #CVPR2023! Hadi Salman and I will be presenting our poster on a data-based perspective on transfer learning on Tuesday (10:30-12). If you're around, drop by and say hi! arxiv.org/abs/2207.05739

Andrew Ilyas (@andrew_ilyas) 's Twitter Profile Photo

Any burning ML questions? The ATTRIB workshop is hosting a panel on "The Future of Attribution in ML" tomorrow at 11AM and is soliciting questions! Submit them by TODAY 11:59PM to hear answers at the panel tomorrow! forms.gle/Yd5N3Ti6kKfqij… More info: attrib-workshop.cc

Lilian Weng (@lilianweng) 's Twitter Profile Photo

🍓 Finally o1 is out - our first model with general reasoning capabilities. Not only does it achieve impressive results on hard scientific tasks, but it is also significantly improved on safety and robustness. openai.com/index/learning… We found reasoning in context about safety…

Lilian Weng (@lilianweng) 's Twitter Profile Photo

📢 We are hiring Research Scientists and Engineers for safety research at OpenAI, spanning safe model behavior training, adversarial robustness, AI in healthcare, frontier risk evaluation, and more. Please fill in this form if you are interested: jobs.ashbyhq.com/openai/form/oa…

Johannes Heidecke (@joheidecke) 's Twitter Profile Photo

Proud to share our work on Deliberative Alignment openai.com/index/delibera… with a special shoutout to Melody Guan ʕᵔᴥᵔʔ who led this work. Deliberative Alignment trains models to reason over relevant safety and alignment policies to formulate their responses.