Aditya Makkar (@makkaraditya) 's Twitter Profile
Aditya Makkar

@makkaraditya

Interested in mathematics and ML.

ID: 2761059984

Link: https://makkar.github.io/ · Joined: 24-08-2014 01:40:18

175 Tweets

241 Followers

615 Following

Aryeh Kontorovich (@aryehazan) 's Twitter Profile Photo

By popular demand (n=3), a brief 🧵on anti-concentration.

1.  Littlewood-Offord-Erdős

ecroot.math.gatech.edu/8803/littlewoo…

some applications to random matrices are given here:
math.iisc.ac.in/~manju/anti-co…
Jascha Sohl-Dickstein (@jaschasd) 's Twitter Profile Photo

Have you ever done a dense grid search over neural network hyperparameters? Like a *really dense* grid search? It looks like this (!!). Bluish colors correspond to hyperparameters for which training converges, reddish colors to hyperparameters for which training diverges.
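The kind of experiment the tweet describes can be sketched in miniature. This is a toy illustration (an assumption, not Jascha's actual setup): gradient descent on a 1-D quadratic loss over a dense grid of learning rates and initializations, recording which cells converge and which diverge.

```python
import numpy as np

# Toy version of a dense convergence/divergence grid search:
# run gradient descent on f(x) = x^2 for every (learning rate, init)
# pair on a fine grid and record whether training converges.
def converges(lr, x0, steps=500):
    x = x0
    for _ in range(steps):
        x = x - lr * 2.0 * x      # gradient step on f(x) = x^2
        if abs(x) > 1e6:          # blown up: count as divergence
            return False
    return abs(x) < 1e-3

lrs = np.linspace(0.01, 1.2, 50)   # learning-rate axis of the grid
x0s = np.linspace(-2.0, 2.0, 50)   # initialization axis of the grid
grid = np.array([[converges(lr, x0) for lr in lrs] for x0 in x0s])
# Each cell of `grid` is True (would plot bluish) when training
# converges and False (reddish) when it diverges.
```

For this quadratic the boundary is simply the threshold lr = 1; the point of the tweet is that for real neural networks the convergence/divergence boundary is far more intricate than such a clean curve.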

jack morris (@jxmnop) 's Twitter Profile Photo

first i thought scaling laws originated in OpenAI (2020)

then i thought they came from Baidu (2017)

now i am enlightened:
Scaling Laws were first explored at Bell Labs (1993)
Eugene Vinitsky 🍒🦋 (@eugenevinitsky) 's Twitter Profile Photo

We're finally out of stealth: percepta.ai. We're a research / engineering team working together in industries like health and logistics to ship ML tools that drastically improve productivity. If you're interested in ML and RL work that matters, take a look 😀

Sam Power (@sp_monte_carlo) 's Twitter Profile Photo

With Giorgos Vasdekis, we have written a manuscript - arxiv.org/abs/2511.21563 - which surveys the state of affairs within this literature, outlining principles for improving robustness and detailing examples of contemporary methods which confront these issues.

Eugene Vinitsky 🍒🦋 (@eugenevinitsky) 's Twitter Profile Photo

I'll be at NeurIPS this year to talk about self-driving, RL, and all the fun bottlenecks to scaling it up. Come chat with me or my students: Daphne Cornelisse, Aditya Makkar, Riccardo Savorgnan

Gappy (Giuseppe Paleologo) (@__paleologo) 's Twitter Profile Photo

1. Ridge regression is heavily used in systematic investing, both in the p << n and in the p >> n cases (the latter less so). I don't think that the use is very deeply motivated, other than the old standard argument in its favor (see, e.g., El. Stat. Learning).
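The standard argument alluded to above can be seen in a few lines: the ridge estimator (X'X + λI)⁻¹X'y stays well-defined even when p >> n, where ordinary least squares is underdetermined. A minimal sketch (data and names here are illustrative, not from the thread):

```python
import numpy as np

# Ridge estimator: beta_hat = (X'X + lam*I)^{-1} X'y.
# Adding lam*I makes the system invertible even when rank(X'X) = n < p.
def ridge(X, y, lam):
    n, p = X.shape
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

rng = np.random.default_rng(0)
X = rng.standard_normal((20, 100))    # p = 100 >> n = 20
y = rng.standard_normal(20)
beta = ridge(X, y, lam=1.0)           # unique solution despite rank 20
```

With lam = 0 the same solve would fail (X'X is singular here), which is the basic regularization argument for ridge in the p >> n regime.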

Almost Sure (@almost_sure) 's Twitter Profile Photo

The poll is now closed - the correct answer is:

nearly 100%

The players almost always end on the same digit! Just under 1/3 of you got it correct.

The reason is coupling (see included details), giving about a 97.5% chance of them ending on the same digit. Also: see simulation
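The poll's exact game isn't reproduced in this feed, so the following is a hedged illustration of the coupling idea using an assumed Kruskal-count-style game: two players walk along the same random digit stream, each jumping forward by the digit they stand on. Once their paths ever land on the same index they move in lockstep forever, so they almost always finish on the same digit.

```python
import random

# Coupling demo (assumed game, not necessarily the poll's): each player
# repeatedly jumps forward by the digit under them; a 0 counts as a
# jump of 10. After the paths coalesce, both players are identical.
def final_index(digits, start):
    pos = start
    while True:
        step = digits[pos] or 10        # treat digit 0 as a step of 10
        if pos + step >= len(digits):
            return pos                  # last digit the player reaches
        pos += step

def same_final_digit(n_digits=1000, rng=random):
    digits = [rng.randrange(10) for _ in range(n_digits)]
    a = final_index(digits, 0)                       # player 1's start
    b = final_index(digits, rng.randrange(1, 10))    # player 2's start
    return digits[a] == digits[b]

random.seed(0)
rate = sum(same_final_digit() for _ in range(2000)) / 2000
# `rate` comes out close to 1: the two walks coalesce with high
# probability long before the stream runs out.
```

The exact matching probability (the ~97.5% in the tweet) depends on the game's parameters; the qualitative point is that once the two trajectories collide they are coupled for good.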
Yang Liu (@yangpliu) 's Twitter Profile Photo

1/ Technical thread on #1stProof Problem 6: finding “spectrally light” vertex subsets in a graph, and how its solution fits into the landscape of spectral sparsification + restricted invertibility. Original thread: x.com/yangpliu/statu…

Mufan Li (@mufan_li) 's Twitter Profile Photo

Wasserstein geometry = quotient geometry of permutation invariance. In this blog, I explain why this is the natural language for exchangeable particles—and why mean-field neural network training shows up as a W2 gradient flow. mufan-li.github.io/OT2/