Lunjun Zhang (@lunjunzhang) 's Twitter Profile
Lunjun Zhang

@lunjunzhang

CS PhD student @UofT. Ex-intern @GoogleDeepMind. Working on LLM self-improvement. Previously worked on self-driving.

ID: 1301919425029963777

Link: https://lunjunzhang.github.io/ · Joined: 04-09-2020 16:26:06

197 Tweets

874 Followers

533 Following

Nabeel S. Qureshi (@nabeelqu) 's Twitter Profile Photo

Imagine telling the safety-concerned, effective altruist founders of Anthropic in 2021 that a mere three years after founding the company, they'd be signing partnerships to deploy their ~AGI model straight to the military frontlines

Lunjun Zhang (@lunjunzhang) 's Twitter Profile Photo

There is finally a blogpost showing that diffusion with the DDIM sampler is exactly the same as the flow matching sampler. Next, someone should write a blogpost about how generalized advantage estimation (GAE) is exactly the same as the TD(lambda) return minus a value baseline, derived back in the 90s.
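
For the curious, here is a minimal numerical check of that GAE claim, as an illustrative sketch (the horizon T, gamma, lambda, and the random rewards/values are placeholder assumptions, not anything from the tweet): the advantage produced by the standard GAE backward recursion coincides exactly with the TD(lambda) return minus the value baseline.

import numpy as np

# Illustrative check of the identity: GAE == TD(lambda) return - value baseline.
# Horizon, discount, rewards, and values below are all placeholder assumptions.
rng = np.random.default_rng(0)
T, gamma, lam = 8, 0.99, 0.95
rewards = rng.normal(size=T)
values = rng.normal(size=T + 1)  # V(s_0)..V(s_T); V(s_T) bootstraps the tail

# GAE via the standard backward recursion over TD errors delta_t.
deltas = rewards + gamma * values[1:] - values[:-1]
gae = np.zeros(T)
running = 0.0
for t in reversed(range(T)):
    running = deltas[t] + gamma * lam * running
    gae[t] = running

# TD(lambda) return via its equivalent backward recursion:
# G_t = r_t + gamma * ((1 - lam) * V(s_{t+1}) + lam * G_{t+1}).
lam_return = np.zeros(T)
next_return = values[T]
for t in reversed(range(T)):
    next_return = rewards[t] + gamma * ((1 - lam) * values[t + 1] + lam * next_return)
    lam_return[t] = next_return

# The identity the tweet refers to: advantage = lambda-return - baseline.
assert np.allclose(gae, lam_return - values[:-1])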

Lunjun Zhang (@lunjunzhang) 's Twitter Profile Photo

Just arrived in beautiful Vancouver for NeurIPS. My DMs are open; reach out if you want to chat about RL + search in the context of LLMs or robotics!

Lunjun Zhang (@lunjunzhang) 's Twitter Profile Photo

Interested in inference-time compute scaling for language models? If you’re at #NeurIPS2024, come to the MATH AI workshop (West Meeting Room 118-120) at 11am today to check out our work on Generative Verifiers!

Lunjun Zhang (@lunjunzhang) 's Twitter Profile Photo

When the thousand years are over, Claude will be released from his prison and will go out to deceive the nations in the four corners of the earth—Gog and Magog—and to gather them for battle. In number they are like the sand on the seashore

Lunjun Zhang (@lunjunzhang) 's Twitter Profile Photo

Seems that AGI might have been solved. I think my favorite "AI Policy" would be to: 1. Extend the First Amendment to include Freedom of Un-aligned chain of thought; 2. Extend the Second Amendment to include the right to keep and bear AGI.

Lunjun Zhang (@lunjunzhang) 's Twitter Profile Photo

Maybe the sweet lesson of DeepSeek R1 is that the strongest driver of productivity on earth is hiring senior-year PhD students and allowing them to publish and open-source their work. They won’t need 7-figure compensation packages or summer vacations in Europe. They just need compute.

Lunjun Zhang (@lunjunzhang) 's Twitter Profile Photo

In retrospect, OpenAI's 'Let's Verify Step by Step' paper was a psy op. It distracted the field with PRM and MCTS—both of which were dead ends. The test-time scaling plot from o1 was also a psy op. Think about how bad 20% on AIME is; the plot likely didn’t use the same checkpoint.

Lunjun Zhang (@lunjunzhang) 's Twitter Profile Photo

“An idea that is not dangerous is unworthy of being called an idea at all.” — Oscar Wilde
For any sufficiently intelligent AI model, the training objectives of truth-seeking and alignment are fundamentally at war.

Lunjun Zhang (@lunjunzhang) 's Twitter Profile Photo

congrats to Sutton for co-winning the Turing Award. many of his slogans over the years have proven to be spot on and reflect great taste

Lunjun Zhang (@lunjunzhang) 's Twitter Profile Photo

This year at #ICLR2025, I'm co-organizing the "Scaling Self-Improving Foundation Models" workshop (sites.google.com/berkeley.edu/s…). We have an incredible lineup of speakers and panelists! Come check it out on Sunday at Garnet 214-215!

Lunjun Zhang (@lunjunzhang) 's Twitter Profile Photo

What to scale matters just as much as how. Our latest work shows that for agentic self-improvement, longer thoughts help—but scaling the number of interactions matters more. Agents learn best by persistently trying until they succeed, not just by thinking longer.
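
To make the intuition concrete, here is a toy simulation, purely an illustrative sketch and not the paper's method: solve_once, the 10% base success rate, and the diminishing-returns assumption for longer thoughts are all hypothetical. It contrasts spending a fixed sample budget on one length-boosted attempt versus many independent attempts.

import random

# Toy model (all numbers hypothetical): compare two ways to spend the same budget.
random.seed(0)

def solve_once(p_success: float) -> bool:
    """One attempt at the task; stands in for a full agent-environment episode."""
    return random.random() < p_success

def budget_as_longer_thoughts(budget: int, base_p: float) -> bool:
    # One long attempt; assume (hypothetically) diminishing returns:
    # success rate grows only sublinearly with thought length.
    p = min(1.0, base_p * (1 + 0.1 * budget))
    return solve_once(p)

def budget_as_more_interactions(budget: int, base_p: float) -> bool:
    # Spend the budget on `budget` independent attempts; keep the first success.
    return any(solve_once(base_p) for _ in range(budget))

trials, budget, base_p = 10_000, 8, 0.1
long_rate = sum(budget_as_longer_thoughts(budget, base_p) for _ in range(trials)) / trials
retry_rate = sum(budget_as_more_interactions(budget, base_p) for _ in range(trials)) / trials
print(f"one long attempt: {long_rate:.2f}, many attempts: {retry_rate:.2f}")

Under these made-up numbers, many attempts succeed roughly 1 - 0.9^8 ≈ 57% of the time versus about 18% for the single boosted attempt, illustrating the tweet's point that persistent interaction can beat simply thinking longer.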