David Marx (@digthatdata.bsky.social) (@digthatdata)'s Twitter Profile
David Marx (@digthatdata.bsky.social)

@digthatdata

Generative AI MLE, FOSS toolmaker, innovation catalyst @CoreWeave + @AiEleuther. bsky.app/profile/digtha…

ID: 2211601081

Link: https://github.com/dmarx

Joined: 24-11-2013 00:56:11

10,10K Tweet

4,4K Followers

1,1K Following

Dylan Foster 🐢 (@canondetortugas)

Is KL-regularization the right tool for language model alignment? The χPO algorithm: We show that a one-line change to DPO -- moving from KL to chi-squared regularization -- is sufficient to achieve state-of-the-art theoretical guarantees, provably alleviating over-optimization.

Tanishq Kumar (@tanishqkumar07)

[1/7] New paper alert! Heard about the BitNet hype or that Llama-3 is harder to quantize? Our new work studies both! We formulate scaling laws for precision, across both pre and post-training arxiv.org/pdf/2411.04330. TLDR; - Models become harder to post-train quantize as they

Rohan Choudhury (@rchoudhury997)

Excited to finally release our NeurIPS 2024 (spotlight) paper! We introduce Run-Length Tokenization (RLT), a simple way to significantly speed up your vision transformer on video with no loss in performance!

LAION (@laion_ai)

We announce LAION-DISCO-12M - a collection of 12 million links to publicly available YouTube samples paired with metadata to support basic machine learning research in foundation models for generic audio and music. laion.ai/blog/laion-dis…

Jeremy Howard (@jeremyphoward)

You can choose what feeds you have on your homepage -- I personally like the "popular with friends" feed (in the above tweet). The default is "following" - which looks like this for me:

Jeremy Howard (@jeremyphoward)

Because X has tended to censor discussion of social networks I won't link directly, but look for this post to get an instant AI/ML feed thanks to M A Osborne

Fern (@hi_tysam)

New NanoGPT training speed record: 3.28 FineWeb val loss in 4.66 minutes Previous record: 5.03 minutes Changelog: - FlexAttention blocksize warmup - hyperparameter tweaks

Haiwen Huang (@haiwenhuang_)

🔥 Are you ever dissatisfied with the imprecise names in vision-language datasets? 🚀 At #NeurIPS2024, we introduce RENOVATE, showing how better segmentation dataset names lead to better training & evaluation. Let's dive in! 🧵👇

David Marx (@digthatdata.bsky.social) (@digthatdata)

> Republicans: "We love America! It has the greatest system of governance. Look, I even carry the constitution next to my heart like a little bible :*) " > Also republicans: "DISMANTLE THE GOVERNMENT! FUCK THE SEPARATION OF POWERS! GOD KING PRESIDENT CULT OF PERSONALITY!"

anton (@atroyn)

'we're in this bizarre world where the best way to learn about llms... is to read papers by chinese companies. i do not think this is a good state of the world' - us labs keeping their architectures and algorithms secret is ultimately hurting ai development in the us.