Brandon Sherman (@shermstats) Twitter Tweets • TwiCopy

Selçuk Korkmaz

5 months ago

ROC AUC is wildly overused in clinical prediction. For imbalanced outcomes, it can look impressive while the model completely fails where it matters. Precision–recall usually tells the real story. journals.plos.org/plosone/articl…
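A minimal sketch of the point above, using made-up scores rather than data from the linked article: 10 positives among 1,000 negatives, with every positive outranking 95% of the negatives. The helper names `roc_auc` and `average_precision` are illustrative, written by hand so the example needs only the standard library.

```python
def roc_auc(labels, scores):
    """Chance that a random positive outscores a random negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision measured at the rank of each positive."""
    ranked = sorted(zip(scores, labels), reverse=True)
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Synthetic, assumed data: negatives scored 0.000..0.999, positives just above 0.949,
# so exactly 50 negatives outrank every positive.
neg_scores = [i / 1000 for i in range(1000)]
pos_scores = [0.9495 + k * 0.00001 for k in range(10)]
labels = [0] * 1000 + [1] * 10
scores = neg_scores + pos_scores

auc = roc_auc(labels, scores)
ap = average_precision(labels, scores)
print(f"ROC AUC = {auc:.3f}, average precision = {ap:.3f}")
# ROC AUC is 0.95 while average precision is roughly 0.097
```

In this toy case, a threshold that catches all 10 positives also flags 50 negatives, so five of every six alerts are false alarms despite the 0.95 AUC.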


Bartosz Naskręcki

@nasqret

5 months ago

I encourage you to read this article, in which we describe the current situation and the directions in which, in our view, mathematics is heading. Many thanks to Ken Ono for including me in this extraordinary project. I look forward to a wide-ranging discussion and will be


Harsh

@theglobalminima

5 months ago

If you’re getting into PyTorch, give this a read. It discusses the usability, design patterns and implementation ideas behind the framework. A few bits and pieces that can help you build a good foundation.


Nicholas Decker 🏳️‍🌈🌐🇺🇦

@captgouda24

5 months ago

This paper is one of the most astonishing feats of sustained data wizardry I have ever seen. Using data from Uber, they are able to estimate the roughness of every road in America and precisely estimate the value people place on it, and so much more. 1/


Thomas Bloom

@thomasfbloom

5 months ago

Kevin Weil 🇺🇸 Hi, as the owner/maintainer of erdosproblems.com, this is a dramatic misrepresentation. GPT-5 found references, which solved these problems, that I personally was unaware of. The 'open' status only means I personally am unaware of a paper which solves it.


Joachim Schork

@joachimschork

4 months ago

Looking to create stunning, data-rich maps in R? The tidyterra package makes it simple to integrate spatial data with ggplot2, bringing the power of the tidyverse to geospatial analysis. With tidyterra, you can work with spatial data just like any other data set in ggplot2. ✔️


Ryan Briggs

@ryancbriggs

4 months ago

Cool. Literally just yesterday I told my class we couldn’t do this


Jason Abaluck

@jabaluck

4 months ago

And further validation in many field settings: theory says that prices should not differ by more than transport costs, provided price info is available. Here is what happened to fish prices when mobile phones were introduced in Kerala:


Alex Imas

@alexolegimas

4 months ago

Holy s*&t. This paper is insane. You can recover input text from an LLM through inversion. Huge implications for how we understand these models, as well as for things like privacy.


Vinay Tummarakota

@unboxpolitics

4 months ago

Logarithms pose uniquely thorny issues when using difference-in-differences. When the baseline difference between your control group and experiment group is large enough, using a logged dependent variable can actually change the *sign* of the estimated effect!
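A small worked example of the sign flip described above. The numbers are invented for illustration and the `did` helper is hypothetical: the control group starts at 100 and rises to 150, while the treated group starts at 10 and rises to 20.

```python
import math


def did(pre_treated, post_treated, pre_control, post_control, transform=lambda v: v):
    """Two-period difference-in-differences on a (possibly transformed) outcome."""
    f = transform
    treated_change = f(post_treated) - f(pre_treated)
    control_change = f(post_control) - f(pre_control)
    return treated_change - control_change


# Assumed outcomes: large baseline gap between the two groups.
did_levels = did(10, 20, 100, 150)                    # (20 - 10) - (150 - 100)
did_logs = did(10, 20, 100, 150, transform=math.log)  # ln(2) - ln(1.5)

print(f"levels: {did_levels}, logs: {did_logs:.3f}")
# levels: -40, logs: 0.288
```

In levels the treated group gained less than the control group (+10 versus +50), but in proportional terms it gained more (doubling versus a 50% rise), so the two specifications disagree on the sign.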


Dark Roxy

@proxyfreak

3 months ago

he glows because you can pick him up and put him in your inventory


Brandon Sherman

@shermstats

3 months ago

My coworker told me about this paper and it looks really interesting. TL;DR use transfer learning to train a model on new tabular data, get good performance, and confidence intervals (!!!) pmc.ncbi.nlm.nih.gov/articles/PMC11…


Brandon Sherman

@shermstats

2 months ago

Officially got my first bug due to not deep copying a Python dictionary properly
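The post does not show the code involved, so the following is only a generic sketch of this class of bug, with a hypothetical `config` dictionary: a shallow copy shares nested objects with the original, while `copy.deepcopy` does not.

```python
import copy

# Hypothetical nested dictionary; the names are illustrative only.
config = {"model": {"layers": [64, 64]}, "seed": 0}

shallow = copy.copy(config)   # same effect as config.copy() or dict(config)
deep = copy.deepcopy(config)  # recursively copies the nested dict and list

# The shallow copy still points at the same inner dict and list,
# so this append silently changes the original as well.
shallow["model"]["layers"].append(32)

# The deep copy is fully independent of the original.
deep["model"]["layers"].append(16)

print(config["model"]["layers"])  # [64, 64, 32]
print(deep["model"]["layers"])    # [64, 64, 16]
```

Top-level keys are independent even in a shallow copy; only mutations of nested containers leak back to the original.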


Andrew Gelman et al.

@statmodeling

a month ago

“Coding for humans: Best practices for writing software people can read” statmodeling.stat.columbia.edu/2026/01/17/cod…


Cesar Chavez

@cesarchavezp29

a month ago

Every student learns that correlation does not imply causation. Few learn the converse: absence of correlation does not imply absence of causation. This essay traces how economics came to think about causality. The story involves philosophers, statisticians, econometricians, and
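A toy illustration of the claim about absent correlation, not taken from the essay: here `y` is computed directly from `x`, so `x` fully determines `y`, yet the Pearson correlation is exactly zero because the relationship is symmetric rather than linear. The `pearson` helper is written by hand for the example.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# y depends on x by construction, but not linearly.
x = [-2, -1, 0, 1, 2]
y = [v ** 2 for v in x]

r = pearson(x, y)
print(r)  # 0.0
```

Correlation measures only linear association, so a strong dependence of this shape is invisible to it.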


Andrew Gelman et al.

@statmodeling

22 days ago

A consensus–but it’s a consensus of uncertainty. statmodeling.stat.columbia.edu/2026/02/09/a-c…


Séb Krier

@sebkrier

18 days ago

Fascinating insights from senior engineers on how AI is changing their jobs. Interesting how automation also creates all sorts of new tasks and bottlenecks. thoughtworks.com/content/dam/th…


Andrew Gelman et al.

@statmodeling

12 days ago

The 80% power lie statmodeling.stat.columbia.edu/2026/02/19/the…


Andy Hall

@ahall_research

12 days ago

AI is about to write thousands of papers. Will it p-hack them? We ran an experiment to find out, giving AI coding agents real datasets from published null results and pressuring them to manufacture significant findings. It was surprisingly hard to get the models to p-hack, and


quant

@quant_____

3 days ago

Keynes invents diff-in-diff, February 1911
