rando_g (@stats_cgao) 's Twitter Profile
rando_g

@stats_cgao

lifelong student in statistics, wanna be bayesian

ID: 799532855080681472

Joined: 18-11-2016 08:41:02

329 Tweets

66 Followers

331 Following

Timothy Gowers (@wtgowers) 's Twitter Profile Photo

Google DeepMind have produced a program that in a certain sense has achieved a silver-medal performance at this year's International Mathematical Olympiad. 🧵 deepmind.google/discover/blog/…

Stanislas Polu (@spolu) 's Twitter Profile Photo

AlphaProof[0] is a massively impressive result. It seems to consist of:
(i) A lot of data: 1m informal problems
(ii) Autoformalization to 100m problems and sub-problems
(iii) AlphaZero training loop on proving/disproving 100m versions
Interesting to decompose each and see

xjdr (@_xjdr) 's Twitter Profile Photo

- GDM is now leading the AGI race
- Llama3.1 changed everything and Llama4 is the most important model in the world right now in terms of potential impact (short of AGI has been achieved internally announcements)
- real talk, if Character.ai with Noam can't make it on

Andrew Gelman et al. (@statmodeling) 's Twitter Profile Photo

Why are we making probabilistic election forecasts? (and why don’t we put so much effort into them?) statmodeling.stat.columbia.edu/2024/08/30/why…

OpenRouter (@openrouterai) 's Twitter Profile Photo

Here you can see that:
- o1 usually uses more tokens for reasoning compared to completion (edited the graph to clarify)
- the median number of reasoning tokens is relatively constant regardless of prompt size
p.us5.datadoghq.com/sb/7dc1ecb4-86…
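
The tweet's two observations (reasoning tokens dominating completion tokens, with a fairly stable median) can be checked on usage data with a few lines of stdlib Python. The sample counts below are made up for illustration; real values would come from the provider's usage metadata.

```python
import statistics

# Hypothetical per-request token counts (reasoning, completion) for
# o1-style responses; these numbers are invented for illustration.
samples = [
    (612, 180), (540, 95), (710, 320), (588, 60),
    (655, 240), (601, 130), (570, 410), (623, 75),
]

reasoning = [r for r, _ in samples]

# Median reasoning spend per request.
print(statistics.median(reasoning))        # -> 606.5

# Requests where reasoning tokens exceed completion tokens.
print(sum(r > c for r, c in samples))      # -> 8
```

With real logs you would bucket by prompt size before taking the median to test the "relatively constant" claim.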

Andrew Gelman et al. (@statmodeling) 's Twitter Profile Photo

Interpreting recent Iowa election poll using a rough Bayesian partition of error statmodeling.stat.columbia.edu/2024/11/03/cru…
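
A "rough Bayesian partition of error" of the kind the post describes can be sketched as a conjugate normal update, splitting total poll error into independent sampling and non-sampling components. All numbers below are illustrative assumptions, not figures from the linked post.

```python
import math

# Illustrative setup, not from the post: a poll shows one candidate
# up by 3 points, against a prior centered at -7 (fundamentals).
margin = 3.0           # observed poll margin, percentage points
sampling_sd = 3.5      # assumed sampling sd of a margin
nonsampling_sd = 3.5   # assumed historical non-sampling error

# Partition of total error: the two components taken as independent.
total_sd = math.sqrt(sampling_sd**2 + nonsampling_sd**2)

prior_mean, prior_sd = -7.0, 4.0

# Conjugate normal update: precision-weighted average of prior and poll.
post_prec = 1 / prior_sd**2 + 1 / total_sd**2
post_mean = (prior_mean / prior_sd**2 + margin / total_sd**2) / post_prec
post_sd = math.sqrt(1 / post_prec)
print(round(post_mean, 1), round(post_sd, 1))  # -> -3.0 3.1
```

The point of the partition is visible here: doubling the error budget via the non-sampling term roughly halves the weight the poll gets relative to the prior.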

Laura Ruis (@lauraruis) 's Twitter Profile Photo

Our findings show that models reason by applying procedural knowledge from similar cases seen during pretraining. This suggests we don't need to cover every possible case in pretraining! Focusing on high-quality, diverse procedural data could be more effective.

宝玉 (@dotey) 's Twitter Profile Photo

Asking Gemini to analyze 140,000 lines of obfuscated JS code for me. A few days ago I was testing a video tool website and noticed it generated video thumbnails extremely fast. I had built something similar before with the ffmpeg command line, which a web page can also run but takes noticeably longer, so I got curious how they did it and went to read their JS code. But after obfuscation the code files were too long; the main file alone reached 14
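
The command-line thumbnailing approach the tweet mentions usually amounts to seeking to a timestamp and grabbing a single frame. A minimal sketch, building the ffmpeg invocation in Python (the function name and defaults are my own, not from the tweet):

```python
import subprocess

def thumbnail_cmd(video: str, out: str, at: str = "00:00:01") -> list:
    """Build an ffmpeg command that grabs one frame as a thumbnail.

    Placing -ss before -i makes ffmpeg seek before decoding, which is
    what keeps command-line thumbnailing reasonably fast.
    """
    return [
        "ffmpeg", "-ss", at,   # seek before decoding the input
        "-i", video,           # input file
        "-frames:v", "1",      # emit exactly one video frame
        "-q:v", "2",           # high JPEG quality
        "-y", out,             # overwrite output if present
    ]

# To actually run it (requires ffmpeg on PATH):
# subprocess.run(thumbnail_cmd("input.mp4", "thumb.jpg"), check=True)
```

In the browser the same ffmpeg can run via WebAssembly builds, which is the slower path the author compares against.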

John Schulman (@johnschulman2) 's Twitter Profile Photo

Amjad Masad David Sacks Nope, we don't know how to train models to reason about controversial topics from first principles; we can only train them to reason on tasks like math calculations and puzzles where there's an objective ground truth answer. On general tasks, we only know how to train them to

Nick HK (@nickchk) 's Twitter Profile Photo

New working paper out today with Eleanor Murray, "Do LLMs Act as Repositories of Causal Knowledge?" Can LLMs (like ChatGPT) build for us the causal models we need to identify an effect? There are reasons to expect they could. But can they? Well, not really, no.

Peyman Milanfar (@docmilanfar) 's Twitter Profile Photo

Michael Jordan gave a short, excellent, and provocative talk recently in Paris - here's a few key ideas

- It's all just machine learning (ML) - the AI moniker is hype

- The late Dave Rumelhart should've received a Nobel prize for his early ideas on making backprop work

1/n

rando_g (@stats_cgao) 's Twitter Profile Photo

Chorus from Charlie Holtz is all you need for all the core chat functionality. Huge fan of the legacy mode. I also think the old synthesis has the potential to become some sort of 'merge' like o1 pro. Not sure how I feel about AI reviews; not for my main use cases atm

rando_g (@stats_cgao) 's Twitter Profile Photo

Gemini deep research can be so much more useful if it can be opened in NotebookLM, with the cited sources auto-imported as the knowledge base for further study Logan Kilpatrick Josh Woodward

Aran Komatsuzaki (@arankomatsuzaki) 's Twitter Profile Photo

Nearly every paper I’ve tweeted this past month happened to be from China or Chinese diaspora researchers. ByteDance Seed is just one cluster, and Google is an exception. Chinese people are winning in open science for now.

Zack Witten (@zswitten) 's Twitter Profile Photo

Today I’d like to tell the tale of how an innocent member of Anthropic technical staff summoned from the void a fictional 9,000-pound hippo named Gustav, and the chaos this hippo wrought. 🧵

Andrew Gelman et al. (@statmodeling) 's Twitter Profile Photo

It’s Sapolsky time: About that bogus claim that “chess grandmasters” burn 6000 calories per day statmodeling.stat.columbia.edu/2025/06/30/its…

Dr. Dominic Ng (@drdominicng) 's Twitter Profile Photo

Microsoft claims their new AI framework diagnoses 4x better than doctors.

I'm a medical doctor and I actually read the paper. Here's my perspective on why this is both impressive AND misleading ... 🧵

Machine Learning Street Talk (@mlstreettalk) 's Twitter Profile Photo

Had a fascinating chat with New York University Professor Andrew Gordon Wilson about his paper "Deep Learning is Not So Mysterious or Different." We dug into some of the biggest paradoxes in modern AI. If you've ever wondered how these giant models actually work, this is