Ilya Sutskever (@ilyasut)'s Twitter Profile
Ilya Sutskever

@ilyasut

towards a plurality of humanity loving AGIs @openai

ID:1720046887

Joined: 01-09-2013 19:32:15

1.1K Tweets

369.7K Followers

2 Following

OpenAI (@OpenAI)

We're announcing, together with Eric Schmidt: Superalignment Fast Grants.

$10M in grants for technical research on aligning superhuman AI systems, including weak-to-strong generalization, interpretability, scalable oversight, and more.

Apply by Feb 18! openai.com/blog/superalig…

Leopold Aschenbrenner (@leopoldasch)

RLHF works great for today's models. But aligning future superhuman models will present fundamentally new challenges.

We need new approaches + scientific understanding.

New researchers can make enormous contributions—and we want to fund you!

Apply by Feb 18!

Boaz Barak (@boazbaraktcs)

My view is that what makes super-alignment 'super' is ensuring we can safely scale the capabilities of AIs even though we can't scale their human supervisors. For this, it is imperative to study the 'weak teacher strong student' setting. Paper shows great promise in this area!

Sam Altman (@sama)

i'd particularly like to recognize Collin Burns for today's generalization result, who came to openai excited to pursue this vision and helped get the rest of the team excited about it!

OpenAI (@OpenAI)

Large pretrained models have excellent raw capabilities—but can we elicit these fully with only weak supervision?

GPT-4 supervised by ~GPT-2 recovers performance close to GPT-3.5 supervised by humans—generalizing to solve even hard problems where the weak supervisor failed!

Leo Gao (@nabla_theta)

new paper! one reason aligning superintelligence is hard is because it will be different from current models, so doing useful empirical research today is hard. we fix one major disanalogy of previous empirical setups. I'm excited for future work making it even more analogous.

Greg Brockman (@gdb)

New direction for AI alignment — weak-to-strong generalization.

Promising initial results: we used outputs from a weak model (fine-tuned GPT-2) to communicate a task to a stronger model (GPT-4), resulting in intermediate (GPT-3-level) performance.

Pavel Izmailov (@Pavel_Izmailov)

Extremely excited to have this work out, the first paper from the Superalignment team! We study how large models can generalize from supervision of much weaker models.

twitter.com/OpenAI/status/…

Jan Leike (@janleike)

Kudos especially to Collin Burns for being the visionary behind this work, Pavel Izmailov for all the great scientific inquisition, Ilya Sutskever for stoking the fires, Jan Hendrik Kirchner and Leopold Aschenbrenner for moving things forward every day. Amazing ✨

Collin Burns (@CollinBurns4)

I’m extremely excited to finally share the first paper from the OpenAI Superalignment team :)

In it, we introduce a new research direction for aligning superhuman AI systems. 🧵

twitter.com/OpenAI/status/…

Jan Leike (@janleike)

Super excited about our new research direction for aligning smarter-than-human AI:

We finetune large models to generalize from weak supervision—using small models instead of humans as weak supervisors.

Check out our new paper:
openai.com/research/weak-…

OpenAI (@OpenAI)

In the future, humans will need to supervise AI systems much smarter than them.

We study an analogy: small models supervising large models.

Read the Superalignment team's first paper showing progress on a new approach, weak-to-strong generalization: openai.com/research/weak-…

OpenAI (@OpenAI)

Sam Altman is back as CEO, Mira Murati as CTO and Greg Brockman as President. OpenAI has a new initial board. Messages from Sam Altman and board chair Bret Taylor openai.com/blog/sam-altma…

Sam Altman (@sama)

i love openai, and everything i’ve done over the past few days has been in service of keeping this team and its mission together. when i decided to join msft on sun evening, it was clear that was the best path for me and the team. with the new board and w satya’s support, i’m…

Ilya Sutskever (@ilyasut)

I deeply regret my participation in the board's actions. I never intended to harm OpenAI. I love everything we've built together and I will do everything I can to reunite the company.
