Sicong (Sheldon) Huang (@sicong_huang)'s Twitter Profile
Sicong (Sheldon) Huang

@sicong_huang

Pretrained by evolution, finetuned by experience, prompted by situations. | AI PhDing @UofT, sharing ideas on AI, forecasting research and the human condition.

ID: 3720756561

Link: https://www.cs.toronto.edu/~huang/ · Joined: 20-09-2015 19:46:52

588 Tweets

2.2K Followers

695 Following

Sicong (Sheldon) Huang (@sicong_huang)

Also a great place to meet friends. I'd guess most students there are conscientious and highly open. Could be your future boss or employee. Or just friends to grow together.

Sicong (Sheldon) Huang (@sicong_huang)

I think inference-time compute could be more efficient for scaling sequential decision making than only using the depth of a NN, so that during train time the model is (in weights) learning to search (in context). But it seems human researchers are still deciding which parts

Sicong (Sheldon) Huang (@sicong_huang)

i was like, this AI timeline seems more based than a lot of the more serious-looking ones i've seen — and then bro shared his chatgpt conversation history...

Sicong (Sheldon) Huang (@sicong_huang)

Are LLMs ready for scientific hypothesis generation and inductive reasoning? We built HypoBench to put them to the test. See what we found. 👇

Ruiqi Zhong (@zhongruiqi)

Last day of PhD! I pioneered using LLMs to explain datasets & models. It's used by interp at OpenAI and societal impact at Anthropic. Tutorial here. It's a great direction & someone should carry the torch :) Thesis available, if you wanna read my acknowledgement section =P

John (Yueh-Han) Chen (@jcyhc_ai)

New paper: We developed an LLM system that predicts which machine learning research idea from a set of candidates will yield superior empirical results. We showed that, in certain domains like NLP, our system significantly outperforms human experts (64.4% vs. 48.9%)! See more

Daniel Wurgaft (@danielwurgaft)

🚨New paper! We know models learn distinct in-context learning strategies, but *why*? Why generalize instead of memorize to lower loss? And why is generalization transient? Our work explains this & *predicts Transformer behavior throughout training* without its weights! 🧵 1/

Sid (@sid_srk)

Announcing The Toronto School Of Foundation Modelling, a Toronto exclusive, in-person only school for learning to build Foundation Models. Coming to New Stadium and Youthful Vengeance in late August 2025.

Ken Liu (@kenziyuliu)

New paper! We explore a radical paradigm for AI evals: assessing LLMs on *unsolved* questions. Instead of contrived exams where progress ≠ value, we eval LLMs on organic, unsolved problems via reference-free LLM validation & community verification. LLMs solved ~10/500 so far:

Hugo Larochelle (@hugo_larochelle)

Excited to share that I begin today as Scientific Director at Mila - Institut québécois d'IA! Truly honored by this opportunity to serve this community of AI leaders and innovators, which I've always cherished and have benefited from myself. mila.quebec/en/news/hugo-l…

François Chollet (@fchollet)

The most important skill for a researcher is not technical ability. It's taste. The ability to identify interesting and tractable problems, and recognize important ideas when they show up. This can't be taught directly. It's cultivated through curiosity and broad reading.

François Chollet (@fchollet)

The idea that we will automate work by building artificial versions of ourselves to do exactly the things we were previously doing, rather than redesigning our old workflows to make the most out of existing automation technology, has a distinct “mechanical horse” flavor

Danijar Hafner (@danijarh)

Excited to introduce Dreamer 4, an agent that learns to solve complex control tasks entirely inside of its scalable world model! 🌎🤖 Dreamer 4 pushes the frontier of world model accuracy, speed, and learning complex tasks from offline datasets. co-led with Wilson Yan

Forecasting Research Institute (@research_fri)

⬆️ LLMs’ forecasting abilities are steadily improving. GPT-4 (released March 2023) achieved a difficulty-adjusted Brier score of 0.131. Nearly two years later, GPT-4.5 (released Feb 2025) scored 0.101—a substantial improvement. A linear extrapolation of state-of-the-art LLM

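As a rough illustration of the numbers in the tweet above (not FRI's actual difficulty-adjusted methodology, which the thread doesn't detail), here is a minimal sketch of the plain Brier score and the linear trend implied by the two reported data points; the release-month-to-decimal-year conversion is an assumption for the example:

```python
def brier_score(probs, outcomes):
    """Mean squared error between forecast probabilities (0..1)
    and binary outcomes (0 or 1); lower is better."""
    assert len(probs) == len(outcomes)
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)

# A perfect forecaster scores 0.0; always answering 0.5 scores 0.25.
print(brier_score([1.0, 0.0, 0.5], [1, 0, 1]))  # (0 + 0 + 0.25) / 3

# Linear trend through the two reported points:
# GPT-4 (Mar 2023): 0.131, GPT-4.5 (Feb 2025): 0.101.
# Treating the gap as roughly 1.9 years gives a slope in Brier points/year.
slope = (0.101 - 0.131) / ((2025 + 2 / 12) - (2023 + 3 / 12))
print(round(slope, 4))  # about -0.0157 per year
```

Note that a plain Brier score is not difficulty-adjusted; FRI's adjustment would rescale scores by question hardness, so the sketch only conveys the direction of the trend.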