Elan Rosenfeld (@ElanRosenfeld)'s Twitter Profile
Elan Rosenfeld

@ElanRosenfeld

Final year ML PhD Student at CMU working on principled approaches to robustness, security, generalization, and representation learning.

ID: 1045447599090794497

http://cs.cmu.edu/~elan · Joined 27-09-2018 22:58:25

323 Tweets

1.1K Followers

184 Following

ML@CMU (@mlcmublog)

Imagine you're a data scientist who solves several related linear regression problems from the same application domain.

Can you learn how to best use a combination of L1 and L2 regularization penalties? We show that you can! How much data is needed?

blog.ml.cmu.edu/2024/04/12/how…
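
Not from the post itself, but as a rough sketch of the object being tuned: the L1 + L2 combination is the elastic net, and in scikit-learn the mixture is exposed through the `alpha` and `l1_ratio` knobs. The data below is synthetic and purely illustrative; the blog's question is how to *learn* these knobs from a collection of related tasks rather than re-tune them per task.

```python
# Minimal elastic-net sketch (synthetic data, hypothetical values):
# objective = ||y - Xw||^2 / (2n)
#             + alpha * (l1_ratio * ||w||_1 + 0.5 * (1 - l1_ratio) * ||w||_2^2)
import numpy as np
from sklearn.linear_model import ElasticNet

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))
true_w = np.zeros(20)
true_w[:3] = [2.0, -1.0, 0.5]               # sparse ground truth
y = X @ true_w + 0.1 * rng.normal(size=100)

# alpha and l1_ratio are exactly the penalty weights one would hope to
# learn across related regression problems instead of cross-validating
# them from scratch each time.
model = ElasticNet(alpha=0.1, l1_ratio=0.7)
model.fit(X, y)
print(model.coef_[:5])
```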

Elan Rosenfeld (@ElanRosenfeld)

I can't speak for other schools, but I can tell you this is definitely not the case at CMU. This seems like a surefire way to lose your school's status as a top program.

Samuel Sokota (@ssokota)

SOTA AI for games like poker & Hanabi relies on search methods that don’t scale to games w/ large amounts of hidden information.

In our ICLR paper, we introduce simple search methods that scale to large games & get SOTA for Hanabi w/ 100x less compute. 1/N

arxiv.org/abs/2304.13138

Vaishnavh Nagarajan (@_vaishnavh)

🗣️ “Next-token predictors can’t plan!” ⚔️ “False! Every distribution is expressible as a product of next-token probabilities!” 🗣️

In work w/ Gregor Bachmann, we carefully flesh out this emerging, fragmented debate & articulate a key new failure. 🔴 arxiv.org/abs/2403.06963
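
For context, the "expressible" side of that debate is just the chain rule of probability: any sequence distribution factorizes autoregressively, so the interesting failures have to come from learning and inference, not representational capacity.

```latex
% Chain rule: every distribution over sequences is a product of
% next-token conditionals, so next-token prediction is expressive
% enough in principle.
p(x_1, \dots, x_n) = \prod_{t=1}^{n} p\left(x_t \mid x_1, \dots, x_{t-1}\right)
```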

Francesco Orabona (@bremen79)

Have you ever wondered why the KL divergence appears in all the PAC-Bayes bounds? Are we sure it is the optimal choice?

We now know for sure: KL is *not* the optimal one!

New work with Ilja Kuzborskij, Kwang-Sung Jun, Yulian Wu, and Kyoungseok Jang

twitter.com/StatMLPapers/s…
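
For readers outside the area, here is one common (McAllester-style) form of a PAC-Bayes bound where the KL term appears. This is the standard statement for a loss in [0, 1], not the paper's result; the paper's question is whether this divergence is the best possible choice.

```latex
% With probability >= 1 - delta over an i.i.d. sample of size n,
% simultaneously for all posteriors rho (prior pi fixed in advance):
\mathbb{E}_{h \sim \rho}\left[L(h)\right]
  \;\le\; \mathbb{E}_{h \sim \rho}\left[\hat{L}(h)\right]
  + \sqrt{\frac{\mathrm{KL}(\rho \,\|\, \pi) + \ln\left(2\sqrt{n}/\delta\right)}{2n}}
```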

Elan Rosenfeld (@ElanRosenfeld)

A sequence of videos of Will Smith eating spaghetti, overlaid with the Shutterstock logo. In some clips he uses a fork and in others his hands overflow with spaghetti as he shovels it into his mouth. In each clip he is wearing a different outfit. One clip has two Will Smiths.

Elan Rosenfeld (@ElanRosenfeld)

Wow, the two points in the abstract of number (1) are clever and I'm surprised I haven't seen this interpretation before. Is this widely known?

Elan Rosenfeld (@ElanRosenfeld)

Related, possibly controversial:

As a rule of thumb, if the abstract uses the phrase 'We present/propose', I regard it with greater skepticism. It sounds to me like the authors are marketing a product, and you should always be skeptical of marketing.

(skeptical ≠ disbelieving)

Jason Hartford (@jasonhartford)

CARE talks kick off again this week, with Goutham Rajendran talking about learning disentangled representations (portal.valencelabs.com/events/post/le…). It's a really nice paper showing that with linear + Gaussian latents, we don't need many interventions to disentangle the latents. Thursday, 11am EST

Zico Kolter (@zicokolter)

I feel like a lot of people leverage LLMs suboptimally, especially for long-form interactions that span a whole project. So I wrote a VSCode extension that supports what I think is a better use paradigm. 🧵 1/N

Extension: marketplace.visualstudio.com/items?itemName…
Code: github.com/locuslab/chatl…

Gokul Swamy (@g_k_swamy)

I'm excited to share a preview of what I've spent the last few months working on at Google AI: SPO, a new RLHF algorithm with strikingly simple implementation (no reward models) and shockingly strong guarantees (handles messy, intransitive prefs.): arxiv.org/abs/2401.04056
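
Going only off the abstract (no reward model; intransitive preferences are fine), here is a loose sketch of a self-play idea: score each sampled completion by its win rate, under a preference oracle, against the policy's other samples, and feed that score to a standard RL update. Everything here (`pref`, the function name) is a hypothetical stand-in, not the paper's actual interface or algorithm.

```python
# Loose self-play sketch: reward a completion by its average win rate
# against other on-policy samples, per a pairwise preference oracle
# pref(a, b) ~= P(a preferred over b). No reward model is fit, and
# only pairwise win rates are used, so intransitive preferences pose
# no problem. `pref` is a hypothetical stand-in for a learned or
# human preference function.
from typing import Callable, List

def self_play_rewards(
    completions: List[str],
    pref: Callable[[str, str], float],
) -> List[float]:
    """Reward each completion by its win rate against the others."""
    rewards = []
    for i, a in enumerate(completions):
        others = [b for j, b in enumerate(completions) if j != i]
        rewards.append(sum(pref(a, b) for b in others) / len(others))
    return rewards
```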
