Shunyu Yao (@shunyuyao12) Twitter Tweets • TwiCopy

Tech is overestimated in the short term (because infra is so much harder than people realize) and underestimated in the long run (because new tech becomes infra for new applications). Applies to computers, chips, the internet, LLMs, RL, etc.

Shunyu Yao

@shunyuyao12

7 months ago

Agent’s calibrated confidence is perhaps the most beautiful thing on earth

Ben Shi

@benshi34

7 months ago

As we optimize model reasoning over verifiable objectives, how does this affect human understanding of said reasoning to achieve superior collaborative outcomes? In our new preprint, we investigate human-centric model reasoning for knowledge transfer 🧵:

Andy Konwinski

@andykonwinski

6 months ago

Today, I’m launching a deeply personal project. I’m betting $100M that we can help computer scientists create more upside impact for humanity. Built for and by researchers, including Jeff Dean & Joelle Pineau on the board, Laude Institute catalyzes research with real-world impact.

Shunyu Yao

@shunyuyao12

6 months ago

An awesome piece by Kevin Lu. I find lots of the points connected to my own post ysymyth.github.io/The-Second-Hal… Pre-training is a genius idea that essentially leveraged billions of people, not just dozens in the lab. How can we leverage more people for RL?

Sam Altman

@sama

6 months ago

watching chatgpt agent use a computer to do complex tasks has been a real "feel the agi" moment for me; something about seeing the computer think, plan, and execute hits different.

Josh Tobin

@josh_tobin_

6 months ago

Introducing our latest agent: ChatGPT agent combines the best of deep research and operator into something that can do so much more for you. Try it out and let us know what you think!

Shunyu Yao

@shunyuyao12

6 months ago

Excited to share what we’ve been working on!

Alexander Wei

@alexwei_

6 months ago

1/N I’m excited to share that our latest OpenAI experimental reasoning LLM has achieved a longstanding grand challenge in AI: gold medal-level performance on the world’s most prestigious math competition—the International Math Olympiad (IMO).

Shunyu Yao

@shunyuyao12

5 months ago

How to evaluate llms when we can’t trust benchmark numbers anymore?

OpenAI

@openai

5 months ago

GPT-5 is here. Rolling out to everyone starting today. openai.com/gpt-5/

Reiichiro Nakano

@reiinakano

5 months ago

capabilities-wise gpt5 seems within expectations with slightly better evals across the board. expect other frontier labs to catch up/jump ahead within the following weeks/month. the real paradigm-shifting 4->5 leap is free users getting access to a frontier model by default.
