David Marx (@digthatdata.bsky.social) (@digthatdata)'s Twitter Profile
David Marx (@digthatdata.bsky.social)

@digthatdata

Generative AI MLE, FOSS toolmaker, innovation catalyst @CoreWeave + @AiEleuther. bsky.app/profile/digtha…

ID: 2211601081

Link: https://github.com/dmarx
Joined: 24-11-2013 00:56:11

10,10K Tweet

4,4K Followers

1,1K Following

Dylan Foster 🐢 (@canondetortugas)

Is KL-regularization the right tool for language model alignment? The χPO algorithm: We show that a one-line change to DPO—moving from KL to chi-squared regularization—is sufficient to achieve state-of-the-art theoretical guarantees, provably alleviating over-optimization.

Tanishq Kumar (@tanishqkumar07)

[1/7] New paper alert! Heard about the BitNet hype or that Llama-3 is harder to quantize? Our new work studies both! We formulate scaling laws for precision, across both pre and post-training arxiv.org/pdf/2411.04330. TLDR; - Models become harder to post-train quantize as they

Rohan Choudhury (@rchoudhury997)

Excited to finally release our NeurIPS 2024 (spotlight) paper! We introduce Run-Length Tokenization (RLT), a simple way to significantly speed up your vision transformer on video with no loss in performance!

LAION (@laion_ai)

We announce LAION-DISCO-12M - a collection of 12 million links to publicly available YouTube samples paired with metadata to support basic machine learning research in foundation models for generic audio and music. laion.ai/blog/laion-dis…

Jeremy Howard (@jeremyphoward)

You can choose what feeds you have on your homepage -- I personally like the "popular with friends" feed (in the above tweet). The default is "following" - which looks like this for me:

Jeremy Howard (@jeremyphoward)

Because X has tended to censor discussion of social networks I won't link directly, but look for this post to get an instant AI/ML feed thanks to M A Osborne

Fern (@hi_tysam)

New NanoGPT training speed record: 3.28 FineWeb val loss in 4.66 minutes

Previous record: 5.03 minutes
Changelog: 
- FlexAttention blocksize warmup
- hyperparameter tweaks
Haiwen Huang (@haiwenhuang_)

🔥 Are you ever dissatisfied with the imprecise names in vision-language datasets? 🚀 At #NeurIPS2024, we introduce 𝐑𝐄𝐍𝐎𝐕𝐀𝐓𝐄, showing how better segmentation dataset names lead to 𝐛𝐞𝐭𝐭𝐞𝐫 𝐭𝐫𝐚𝐢𝐧𝐢𝐧𝐠 & 𝐞𝐯𝐚𝐥𝐮𝐚𝐭𝐢𝐨𝐧. Let’s dive in! 🧵👇

David Marx (@digthatdata.bsky.social) (@digthatdata)

> Republicans: "We love America! It has the greatest system of governance. Look, I even carry the constitution next to my heart like a little bible :*) "
> Also republicans: "DISMANTLE THE GOVERNMENT! FUCK THE SEPARATION OF POWERS! GOD KING PRESIDENT CULT OF PERSONALITY!"

anton (@atroyn)

'we're in this bizarre world where the best way to learn about llms... is to read papers by chinese companies. i do not think this is a good state of the world' - us labs keeping their architectures and algorithms secret is ultimately hurting ai development in the us.