Jingfeng Wu (@uuujingfeng) 's Twitter Profile
Jingfeng Wu

@uuujingfeng

Bsky: bsky.app/profile/uuujf.…

Postdoc @SimonsInstitute @UCBerkeley; alumnus of @JohnsHopkins @PKU1898; DL theory, opt, and stat learning.

ID: 1933510801

Link: https://uuujf.github.io · Joined: 04-10-2013 07:50:15

98 Tweets

1.1K Followers

1.1K Following

Yen-Huan Li (@yenhuan_li) 's Twitter Profile Photo

==== My recommendations today ==== Large Stepsize Gradient Descent for Logistic Loss: Non-Monotonicity of the Loss Improves Optimization Efficiency arxiv.org/abs/2402.15926 (1/2)

Ruiqi Zhang (@ruiqizhang0614) 's Twitter Profile Photo

What’s the role of the MLP layer in a transformer block? It’s intuitive to think that the MLP component helps reduce the approximation error, and our new paper confirms this theoretically! arxiv.org/abs/2402.14951 Joint work with Jingfeng Wu and Peter L. Bartlett

Bin Yu (@bbiinnyyuu) 's Twitter Profile Photo

My co-author Rebecca Barter and I are thrilled to announce the online release of our MIT Press book "Veridical Data Science: The Practice of Responsible Data Analysis and Decision Making" (vdsbook.com), an essential source for producing trustworthy data-driven results.

Weijie Su (@weijie444) 's Twitter Profile Photo

📢 #ICML2024 authors! Help improve ML peer review! 🔬📝 Check your inbox for an email titled "[ICML 2024] Author Survey" and rank your submissions. 🏆📈 Your confidential input is crucial, and won't affect decisions. 🔒✅ Survey link in email or "Author Tasks" on OpenReview.

Song Mei (@song__mei) 's Twitter Profile Photo

My group at Berkeley Stats and EECS has a postdoc opening in the theoretical (e.g., scaling laws, watermarking) and empirical (e.g., efficiency, safety, alignment) aspects of LLMs or diffusion models. Send me an email with your CV if interested!

Gabriel Peyré (@gabrielpeyre) 's Twitter Profile Photo

Oldies but goldies: H. Robbins, S. Monro, A Stochastic Approximation Method, 1951. Early appearance of the stochastic gradient method, which is the workhorse of many large-scale ML methods. en.wikipedia.org/wiki/Stochasti…
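As a rough illustration, here is a minimal stochastic-gradient sketch in the Robbins-Monro spirit; the toy quadratic objective, the 1/(t+1) step size, and all names below are illustrative assumptions, not details from the 1951 paper.

import numpy as np

def sgd(grad_sample, x0, n_steps=2000, lr0=1.0):
    # grad_sample(x) returns an unbiased stochastic estimate of the gradient at x.
    # The step size eta_t = lr0 / (t + 1) satisfies the classical Robbins-Monro
    # conditions: sum_t eta_t diverges while sum_t eta_t^2 is finite.
    x = float(x0)
    for t in range(n_steps):
        eta = lr0 / (t + 1)
        x = x - eta * grad_sample(x)
    return x

# Toy usage: minimize E[(x - z)^2] / 2 with z ~ N(1, 1); the minimizer is x = 1.
rng = np.random.default_rng(0)
print(sgd(lambda x: x - rng.normal(loc=1.0), x0=0.0))  # prints a value near 1.0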

Woodson Lab (@woodson_lab) 's Twitter Profile Photo

Congratulations to Dr. Yuan Lou for a fantastic thesis defense! 🎉🎉 She is an exceptional scientist and colleague, and we will miss her dearly. Best of luck at Genentech!

Yisong Yue (@yisongyue) 's Twitter Profile Photo

Just updated my Tips for CS Faculty Applications. Best of luck to everyone applying! yisongyue.medium.com/checklist-of-t…

Jingfeng Wu (@uuujingfeng) 's Twitter Profile Photo

Out of the 6 NeurIPS submissions I reviewed this year, 3 were withdrawn and 3 were rejected. Honestly, my experience as a reviewer has worsened more than my experience as an author.

Gabriel Peyré (@gabrielpeyre) 's Twitter Profile Photo

The perspective transform turns a 1D convex function into a 2D positively homogeneous convex function. Fundamental in convex analysis. At the heart of Csiszár divergences. math.univ-toulouse.fr/Archive-MIP/pu…
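Concretely, as a quick sketch of the standard definition (the identity with Csiszár divergences below is the usual textbook one, stated here as an illustration rather than taken from the tweet): for a convex $f:\mathbb{R}\to\mathbb{R}$, its perspective is
\[
  g(x, t) = t\, f\!\Big(\frac{x}{t}\Big), \qquad t > 0,
\]
which is jointly convex in $(x, t)$ and positively homogeneous of degree one, i.e. $g(\lambda x, \lambda t) = \lambda\, g(x, t)$ for all $\lambda > 0$. Csiszár ($f$-)divergences are sums of perspectives:
\[
  D_f(p \,\|\, q) = \sum_i q_i\, f\!\Big(\frac{p_i}{q_i}\Big) = \sum_i g(p_i, q_i).
\]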

Lechao Xiao (@locchiu) 's Twitter Profile Photo

1/5. Excited to share a spicy paper, "Rethinking conventional wisdom in machine learning: from generalization to scaling", arxiv.org/pdf/2409.15156. You might love it or dislike it! NotebookLM: notebooklm.google.com/notebook/43f11… While double-descent (generalization-centric,

Simons Institute for the Theory of Computing (@simonsinstitute) 's Twitter Profile Photo

Deep learning practitioners have focused their attention on an optimization regime that's "unstable and convergent"--something that's not suggested by theory when using gradient methods, says Peter Bartlett, during his Richard M. Karp Distinguished Lecture at the Simons Institute

Sham Kakade (@shamkakade6) 's Twitter Profile Photo

(1/n) 💡How can we speed up the serial runtime of long pre-training runs? Enter Critical Batch Size (CBS): the tipping point where the gains of data parallelism balance with diminishing efficiency. Doubling batch size halves the optimization steps—until we hit CBS, beyond which
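A rough way to see the tipping point, as a sketch: under the empirical large-batch model of McCandlish et al. (2018), the steps needed behave roughly like steps(B) ≈ S_min · (1 + B_crit / B). The functional form and the numbers below are illustrative assumptions about that model, not results from this thread.

# Illustrative sketch of the critical-batch-size trade-off, assuming the
# empirical model steps(B) = S_min * (1 + B_crit / B) (McCandlish et al., 2018).
# Well below B_crit, doubling the batch size roughly halves the number of steps;
# well above it, extra data parallelism buys almost nothing.

def steps_needed(batch_size: int, s_min: float = 10_000, b_crit: float = 1_024) -> float:
    return s_min * (1 + b_crit / batch_size)

for b in [64, 128, 256, 512, 1024, 2048, 4096]:
    print(f"B={b:5d}  steps ~ {steps_needed(b):,.0f}")
# B=64 -> ~170,000; B=128 -> ~90,000 (roughly half); B=2048 -> 15,000; B=4096 -> 12,500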

Francesco Orabona (@bremen79) 's Twitter Profile Photo

Jingfeng Wu I don't have deep enough knowledge of the history of ML to answer this definitively, so I'll just give you my very personal point of view. I entered the ML field at the peak of the SVM era. At that time, people tended to use theory as a way to design algorithms.

Yuhang Cai (@yuhangwillcai) 's Twitter Profile Photo

We show the implicit bias of GD for generic non-homogeneous deep nets (previous results of this kind were limited to homogeneous ones). In particular, our results cover nets with residual connections and non-homogeneous activation functions. It's joint work with Kangjie Zhou,

Association for Computing Machinery (@theofficialacm) 's Twitter Profile Photo

Meet the recipients of the 2024 ACM A.M. Turing Award, Andrew G. Barto and Richard S. Sutton! They are recognized for developing the conceptual and algorithmic foundations of reinforcement learning. Please join us in congratulating the two recipients! bit.ly/4hpdsbD