Ajitesh Shukla (@ajitesh_shukla7) Twitter Tweets • TwiCopy

Sam Power

6 days ago

In any case, please do take a look at the paper (arxiv.org/abs/2503.14347); it's nice and short, and could be a useful tool to keep in your toolbox.

Likes: 11 · Replies: 1 · Reposts: 1

Csaba Szepesvari

Sam Power Zishun Liu Yongxin Chen I looked at the definition of the averaged moment generating function, and it reminds me of the method of mixtures, which I am a big fan of and which goes back to de la Peña et al. See e.g. sites.ualberta.ca/~szepesva/pape… and references. What you did looks cool though, and slightly different.

Likes: 4 · Replies: 0 · Reposts: 1

Mattes Mollenhauer

@gaussianmeasure

5 days ago

Sam Power Zishun Liu Yongxin Chen Nice! Related and potentially of interest: arxiv.org/abs/2306.11404 If you express the subgaussian proxy not uniformly, but dimension-wise in terms of a psd operator, you can obtain a sharp bound that also holds for vectors in infinite-dim Hilbert spaces.


Likes: 2 · Replies: 1 · Reposts: 1

Oliver Maclaren

@omaclaren

3 days ago

Teaching some matrix calculus at the moment, and mostly but not fully satisfied with my current notes... found these, and they look very good! They give me some nice ideas for improving my own notes: Matrix Calculus (for Machine Learning and Beyond) arxiv.org/abs/2501.14787

Likes: 449 · Replies: 3 · Reposts: 50

Dr. Chris Rackauckas

@chrisrackauckas

3 days ago

Hot take: vibe coding isn't for those who don't know how to code, it's for the experts. A perspective on the true role of Generative AI and LLM-Based Vibe Coding for the future of development. stochasticlifestyle.com/a-guide-to-gen… #GenerativeAI #llm Claude ChatGPT #vibecoding

Likes: 39 · Replies: 3 · Reposts: 11

Yao Fu

@francis_yao_

2 days ago

JingyuanLiu jax-ml.github.io/scaling-book/ This book explains it well, especially chapters 2, 5, and 12.

Likes: 38 · Replies: 0 · Reposts: 6

Piotr Pomorski

@ptrpomorski

2 days ago

"Is Gold an Inflation Hedge?" -> it used to be decades ago, but now it's just for large shocks. papers.ssrn.com/sol3/papers.cf…

Likes: 21 · Replies: 3 · Reposts: 3

QuantSeeker

@quantseeker

2 days ago

New paper by Daniel Bloch: Fast trading signals often look like alpha but are mostly small-sample noise. What seems like “speed” is usually just reacting faster to randomness. papers.ssrn.com/sol3/papers.cf…

Likes: 71 · Replies: 2 · Reposts: 12

Piotr Pomorski

@ptrpomorski

2 days ago

"The Science and Practice of Trend-following Systems" is a great one, especially since we are currently updating our TF system. It's 44 pages, so it's probably better to run through it using Benjamin AI papers.ssrn.com/sol3/papers.cf…

Likes: 49 · Replies: 1 · Reposts: 2

Horace He

@chhillee

2 days ago

Seunghyun Seo Daniel Vega-Myhre JingyuanLiu I don't quite think about it as you do. A couple notes: 1. I wouldn't say FSDP doesn't reduce activation memory. When doing these comparisons, it makes sense to keep gbsz fixed, and in that case, swapping TP=>DP lowers batch size per GPU, lowering activation memory. 2. All gather

Likes: 15 · Replies: 1 · Reposts: 2

Quanquan Gu

@quanquangu

a day ago

So many multipliers! Great to see that Grok2 was trained using μP. huggingface.co/xai-org/grok-2

Likes: 172 · Replies: 6 · Reposts: 21

Heming Xia

@hemingkx

a day ago

🎉Excited to share that TokenSkip has been accepted to the main conference of EMNLP 2025! Many thanks to all the coauthors for their hard work! Looking forward to seeing everyone in Suzhou😉. arxiv.org/abs/2502.12067

Likes: 73 · Replies: 0 · Reposts: 11

Chao Huang

@huang_chao4969

a day ago

🔥 DeepCode has been trending on GitHub for 2 consecutive days! 🚀 Almost hitting 2k GitHub Stars! 🌟 Fully Open Source: github.com/HKUDS/DeepCode ✨ All-in-One Agentic Coding Framework ✨ • 📄 Paper2Code - Research to Implementation • 🌐 Text2Web - Natural Language to Frontend

Likes: 867 · Replies: 11 · Reposts: 191

Qingyu

@qingyu_shi_

a day ago

The 3rd Universal Cup Semifinals is coming! Live scoreboard: qoj.ac/results/Semifi… (currently Warm-up Contest) The deadline for registering new teams is Aug 24 at 18:30 (UTC).

Likes: 8 · Replies: 0 · Reposts: 2

elie

@eliebakouch

21 hours ago

I didn't realize it, but it's easy to see whether open-source models are using "original" muP by looking at the config. For instance, in grok1/2 there is this "input/output multiplier_scale", which corresponds to the alpha input/output in the original muP. Looking at the transformers modeling

Likes: 48 · Replies: 6 · Reposts: 3

AIDB

@ai_database

20 hours ago

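Los Alamos National Laboratory, known for its US nuclear weapons research, has developed a "scientist AI." The system reportedly aims to automate scientific discovery by reading papers, planning experiments, and running simulations the way a human would. The AI is reported to have been considerably more efficient than conventional methods at designing nuclear fusion experiments.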

Likes: 228 · Replies: 5 · Reposts: 73

José A. Alonso

@jose_a_alonso

20 hours ago

LeanGeo: Formalizing competitional geometry problems in Lean. ~ Chendong Song et al. arxiv.org/abs/2508.14644 #ITP #LeanProver #Math #LLMs

Likes: 23 · Replies: 0 · Reposts: 1

José A. Alonso

@jose_a_alonso

20 hours ago

To zip through the cost analysis of probabilistic programs. ~ Matthias Hetzenberger, Georg Moser, Florian Zuleger. arxiv.org/abs/2508.14249… #Haskell #FunctionalProgramming

Likes: 5 · Replies: 0 · Reposts: 2

maspy

@maspy_stars

20 hours ago

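The 3rd Universal Cup Semifinals YouTube: youtube.com/@Universal_Cup Bilibili: space.bilibili.com/35466280089706… Feel free to watch it, no matter whether you are a participant or not. So there shouldn't be any problem explanations or the like. The participants' screens are recorded, but they shouldn't be streamed either. I wonder if they'll fill five hours with just the scoreboard?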

Likes: 6 · Replies: 0 · Reposts: 2

Simons Institute for the Theory of Computing

@simonsinstitute

19 hours ago

1/2 "Fundamentally, modern AI is just a mathematical object. Math is transforming the world...especially with respect to modern AI." Mikhail Belkin of UC San Diego at the Simons Institute on the triumph and failure of mathematics as it relates to AI. Video: simons.berkeley.edu/talks/mikhail-…

Likes: 70 · Replies: 2 · Reposts: 12
