Khue Le (@netw0rkf10w) 's Twitter Profile
Khue Le

@netw0rkf10w

Head of R&D at getvocal.ai.
Building conversational AI by day, doing optimization research by night.

ID: 195372160

Link: https://khue.fr · Joined: 26-09-2010 14:52:22

521 Tweets

292 Followers

131 Following

Khue Le (@netw0rkf10w) 's Twitter Profile Photo

TIL GitHub has a great new feature: Discussions. Now any repo is basically an online forum. This is really awesome! Separating Discussions from Issues makes it much easier to manage the content, especially for popular repos.
Daniela Witten (@daniela_witten) 's Twitter Profile Photo

Remember when ML was a hugely important area w/far-reaching implications in literally every field, and then an ML conference ever-so-slightly changed its name to avoid alienating 50% of ppl, which caused the ML community to collapse & the field to die out? Yeah, neither do I.

Julien Mairal (@julienmairal) 's Twitter Profile Photo

Congratulations to Dr. Mathilde Caron, who successfully defended her PhD **in person** after a brilliant presentation. The prestigious committee included Cordelia Schmid, Andrew Zisserman, Alyosha Efros, Diane Larlus, and Alexey Dosovitskiy.
Khue Le (@netw0rkf10w) 's Twitter Profile Photo

In our NeurIPS 2021 paper (with Karteek Alahari) we showed that CCCP is Frank-Wolfe in disguise. Happy to see other people recently rediscovering this fact and presenting it as a striking result. Want to know another equally striking fact? Mean Field is also Frank-Wolfe!👇
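The CCCP iteration the tweet refers to can be illustrated on a toy difference-of-convex problem. This is a minimal sketch of plain CCCP (linearize the concave part, minimize the convex surrogate), not the paper's Frank-Wolfe reduction; the objective x⁴ − x² and all names are illustrative choices of mine.

```python
import math

def cccp(x0, iters=50):
    # CCCP for f(x) = g(x) - h(x) with g(x) = x**4 and h(x) = x**2 (both convex).
    # Each step linearizes h at x_k and minimizes the convex surrogate:
    #   x_{k+1} = argmin_x x**4 - h'(x_k) * x,   where h'(x_k) = 2 * x_k.
    # Stationarity (4 x**3 = 2 x_k) gives the closed-form update below.
    x = x0
    for _ in range(iters):
        x = (x / 2) ** (1 / 3)
    return x

# Converges to 1/sqrt(2), a stationary point (here a global minimizer) of x**4 - x**2.
x_star = cccp(1.0)
```

The paper's observation is that this kind of surrogate-linearization step can be read as a Frank-Wolfe linear-minimization step on a suitably reformulated problem.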

Khue Le (@netw0rkf10w) 's Twitter Profile Photo

An ICLR 2023 submission has been accused of being a rehash of previous work, a claim supported by detailed technical arguments. If true, there must be consequences: intentionally misleading contributions should not be tolerated in academic research. openreview.net/forum?id=CQsmM…

Tri Dao (@tri_dao) 's Twitter Profile Photo

We're releasing an optimized implementation of GPT2/GPT3 with FlashAttention🚀! This trains 3-5x faster than the Huggingface version, reaching up to 189 TFLOPs/sec per A100 (60.6% model FLOPs utilization of the theoretical maximum). 1/6 github.com/HazyResearch/f…
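The 60.6% figure can be checked against the A100's dense BF16 peak, assuming the utilization is measured relative to ~312 TFLOPs/s (my assumption; the tweet does not state the reference peak):

```python
# Back-of-envelope check of the utilization figure in the tweet:
# 189 TFLOPs/s achieved vs. the A100's ~312 TFLOPs/s dense BF16 peak.
achieved_tflops = 189
a100_bf16_peak_tflops = 312
util = achieved_tflops / a100_bf16_peak_tflops
print(f"{util:.1%}")  # → 60.6%
```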

Francesco Orabona (@bremen79) 's Twitter Profile Photo

FYI, the so-called AdaGrad norm stepsize was first proposed in arxiv.org/abs/1002.4862 (see Theorem 2). I have seen several papers and talks at #NeurIPS22 citing the wrong work.
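For context, the AdaGrad-norm stepsize is a single scalar stepsize η / √(Σ‖gₜ‖²) shared across all coordinates, as opposed to coordinate-wise AdaGrad. A sketch under that reading (function names and the toy problem are mine; the cited paper's exact variant may differ in constants):

```python
import numpy as np

def adagrad_norm_gd(grad, x0, eta=1.0, eps=1e-8, iters=500):
    """Gradient descent with the scalar 'AdaGrad norm' stepsize:
    eta / sqrt(sum of squared gradient norms), one scalar for all
    coordinates, rather than a per-coordinate denominator."""
    x = np.asarray(x0, dtype=float)
    g2_sum = 0.0
    for _ in range(iters):
        g = grad(x)
        g2_sum += float(np.dot(g, g))  # accumulate ||g_t||^2
        x = x - eta / (np.sqrt(g2_sum) + eps) * g
    return x

# Toy quadratic f(x) = 0.5 * ||x||^2, whose gradient is x.
x_out = adagrad_norm_gd(lambda x: x, [1.0, -2.0])
```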

Tri Dao (@tri_dao) 's Twitter Profile Photo

I’ve been working with Inge and we’ve made FlashAttention even faster for long sequences! For seqlen 8K, FlashAttention is now up to 2.7x faster than a standard PyTorch implementation even at small batch, making it easier to train better LMs with longer context 1/7
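The long-sequence claim is easy to motivate with back-of-envelope memory arithmetic: standard attention materializes an N×N score matrix per head, which FlashAttention avoids. The batch/head/dtype choices below are assumptions of mine, not figures from the thread:

```python
# Memory for the materialized attention matrix in standard attention
# at seqlen 8K (assumed: batch=1, 16 heads, fp16 = 2 bytes/element).
seqlen, heads, batch, bytes_per_el = 8192, 16, 1, 2
attn_matrix_bytes = batch * heads * seqlen * seqlen * bytes_per_el
print(attn_matrix_bytes / 2**30, "GiB")  # → 2.0 GiB
```

FlashAttention computes the same result in tiles without ever storing this matrix, which is why the gap widens as sequence length grows.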
Centre Inria de l'Université Grenoble Alpes (@inria_grenoble) 's Twitter Profile Photo

[DISTINCTION 🏆] Congratulations to Julien Mairal of the Thoth project-team at the @inria centre of Université Grenoble Alpes, awarded a European Research Council (ERC) Consolidator Grant 👏 Find out more here: inria.fr/fr/julien-mair… #MachineLearning #Algorithm
Guillaume Champeau (@gchampeau) 's Twitter Profile Photo

This Orange ad is brilliant! (Well, with 3 million views I'm probably the last one to discover it.) youtu.be/D_HPiaAx_QA

Francesco Orabona (@bremen79) 's Twitter Profile Photo

New blog post: Yet Another ICML Award Fiasco. The story of the ICML 2023 Outstanding Paper Award to the D-Adaptation paper, with worse results than the ones from 9 years ago. Please share it to start a needed conversation on mistakenly granted awards. parameterfree.com/2023/08/30/yet…

Khue Le (@netw0rkf10w) 's Twitter Profile Photo

Hi Aaron Defazio. Here's the result of my optimizer, compared to yours (still running). Can you beat my blue curve with hyper-parameter tuning? ;) Please give it a try using this code: github.com/facebookresear…
Khue Le (@netw0rkf10w) 's Twitter Profile Photo

While waiting for Aaron Defazio's tuning result, here's my full run of his method (green curve). Interestingly, some modifications inspired by my optimizer seem to boost its performance. Note: MAE's default hyper-params are used for all experiments.
Yann LeCun (@ylecun) 's Twitter Profile Photo

🥁 Llama3 is out 🥁 8B and 70B models available today. 8k context length. Trained with 15 trillion tokens on a custom-built 24k GPU cluster. Great performance on various benchmarks, with Llama3-8B doing better than Llama2-70B in some cases. More versions are coming over the next
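The scale of that pretraining run can be estimated with the common ~6 · params · tokens rule of thumb for dense-transformer training FLOPs (an approximation on my part, not a figure from the announcement):

```python
# Rough training-compute estimate for Llama3-70B using the standard
# ~6 * params * tokens approximation for dense transformers.
params = 70e9   # 70B parameters
tokens = 15e12  # 15 trillion training tokens
train_flops = 6 * params * tokens
print(f"{train_flops:.1e}")  # → 6.3e+24
```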