Will Brenton (@wbrenton3)'s Twitter Profile
Will Brenton

@wbrenton3

ID: 3151498970

Joined: 12-04-2015 14:15:25

242 Tweets

283 Followers

1.1K Following

Stephen McAleer (@mcaleerstephen)'s Twitter Profile Photo

"Toward General Virtual Agents" I recently gave a talk at MIT. I argued that we should use tools from reinforcement learning and search to improve the capability and alignment of LLM agents. Slides: drive.google.com/file/d/1kDvmrm… Video:

Petar Veličković (@petarv_93)'s Twitter Profile Photo

AlphaCode (powered by Gemini) is now roughly as capable as I am ("entry-level Division 1") on CodeForces -- a feat I did not expect to see this soon!

Dimitris Papailiopoulos (@dimitrispapail)'s Twitter Profile Photo

Whoever tells you “we understand deep learning”, just show them this: fractals in the loss landscape as a function of hyperparameters, even for small two-layer nets. Incredible.
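As a rough illustration of the claim (not the code behind the figures in the tweet), here is a minimal JAX sketch of the kind of experiment involved: train the same tiny two-layer net under a grid of two hyperparameters, here per-layer learning rates chosen purely for illustration, and record which settings converge versus diverge. Re-plotting that map on finer and finer grids near the boundary is where the fractal-looking structure shows up.

```python
import jax
import jax.numpy as jnp

def loss(params, x, y):
    # Tiny two-layer net: tanh hidden layer, linear readout, squared error.
    h = jnp.tanh(x @ params["w1"])
    return jnp.mean((h @ params["w2"] - y) ** 2)

grad_fn = jax.jit(jax.grad(loss))

def diverged(lr1, lr2, steps=100):
    # Fixed data and init, so the only thing varying across the grid is (lr1, lr2).
    k1, k2, k3 = jax.random.split(jax.random.PRNGKey(0), 3)
    x = jax.random.normal(k1, (32, 2))
    y = jnp.sum(x, axis=1, keepdims=True)
    params = {"w1": 0.5 * jax.random.normal(k2, (2, 8)),
              "w2": 0.5 * jax.random.normal(k3, (8, 1))}
    for _ in range(steps):
        g = grad_fn(params, x, y)
        params = {"w1": params["w1"] - lr1 * g["w1"],
                  "w2": params["w2"] - lr2 * g["w2"]}
    final = loss(params, x, y)
    return bool(jnp.isnan(final) | (final > 1e3))  # True = training blew up

# Sweep a coarse grid of per-layer learning rates; imaging this boolean grid
# (and zooming in with finer grids near the boundary) exposes the fractal-like
# convergence/divergence boundary.
lrs = jnp.linspace(0.05, 2.0, 32)
grid = [[diverged(float(a), float(b)) for b in lrs] for a in lrs]
```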

Patrick Collison (@patrickc)'s Twitter Profile Photo

"We do these things not because they are easy, but because we thought they were going to be easy" is a surprisingly profound quote. When I ask people who've pulled off remarkable things, it's interesting how many confirm that they wouldn't have started if they'd know how long and

AK (@_akhaliq)'s Twitter Profile Photo

Meta announces the Aria Everyday Activities (AEA) Dataset, an egocentric multimodal open dataset recorded using Project Aria glasses. AEA contains 143 daily activity sequences recorded by multiple wearers in five geographically diverse indoor

Samuel Sokota (@ssokota)'s Twitter Profile Photo

SOTA AI for games like poker & Hanabi relies on search methods that don’t scale to games w/ large amounts of hidden information. In our ICLR paper, we introduce simple search methods that scale to large games & get SOTA for Hanabi w/ 100x less compute. 1/N arxiv.org/abs/2304.13138

Costa Huang (@vwxyzjn)'s Twitter Profile Photo

Happy to share our work on reproducing the RLHF scaling behaviors in OpenAI's work on summarizing from feedback. We built an RLHF pipeline from scratch and enumerated over 20+ implementation details 🚀 Fun collab with Michael Noukhovitch @NeurIPS 2024, Arian Hosseini @ NeurIPS, Kashif Rasul, wang, and Lewis Tunstall 📜

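As a rough sketch of one core implementation detail in this kind of PPO-based RLHF pipeline (mine, not code from the paper), the per-token reward is typically a KL penalty against the frozen reference/SFT policy, with the reward-model score added at the final token of the response. The shapes, values, and `beta` coefficient below are illustrative assumptions.

```python
import jax.numpy as jnp

def shaped_rewards(policy_logprobs, ref_logprobs, rm_score, beta=0.05):
    """Per-token rewards for a single sampled response.

    policy_logprobs, ref_logprobs: (T,) log-probs of the response tokens under
    the current policy and the frozen reference (SFT) model.
    rm_score: scalar reward-model score for the full response.
    """
    kl_penalty = -beta * (policy_logprobs - ref_logprobs)  # penalize drift from the reference
    return kl_penalty.at[-1].add(rm_score)                 # RM score lands on the last token

# Toy 4-token response.
policy_lp = jnp.array([-1.2, -0.8, -2.0, -0.5])
ref_lp = jnp.array([-1.0, -1.0, -1.5, -0.7])
print(shaped_rewards(policy_lp, ref_lp, rm_score=1.3))
```
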
Daniel Johnson (@_ddjohnson)'s Twitter Profile Photo

Excited to share Penzai, a JAX research toolkit from Google DeepMind for building, editing, and visualizing neural networks! Penzai makes it easy to see model internals and lets you inject custom logic anywhere. Check it out on GitHub: github.com/google-deepmin…

Rishabh Agarwal (@agarwl_)'s Twitter Profile Photo

I gave my first guest lecture today in a grad course on LLMs as a (soon-to-be) adjunct prof at McGill. Putting the slides here, maybe useful to some folks ;) drive.google.com/file/d/1komQ7s…

Jeff Clune (@jeffclune)'s Twitter Profile Photo

Today feels like the future. "Her" arrives on my phone (from OpenAI), AR glasses prototypes are getting good (Meta), and humanity can produce this with a drone swarm. Let alone CRISPR, commercial space, etc. Not sure if it is good, but it's a wild ride. x.com/ShenzhenPages/…

RL_Conference (@rl_conference)'s Twitter Profile Photo

"In the Beginning, ML was RL". Andrew Barto gave RLC 2024 an amazing overview of the intertwined history of ML and RL (Link below)

"In the Beginning, ML was RL". Andrew Barto gave RLC 2024 an amazing overview of the intertwined history of ML and RL
(Link below)
Andrew Carr (e/🤸) (@andrew_n_carr)'s Twitter Profile Photo

I know it seems like Amazon is eating Anthropic, with the funding and preferred compute partnership. However, I think the sub-title here of "deep technical collaboration" on "directly interfac[ing] with Trainium silicon" to improve the story of AWS chips could actually be immense

Kenneth Stanley (@kenneth0stanley)'s Twitter Profile Photo

How to know when you’re hitting a wall*: watch for when the benchmark rigor police start coming out in force. It happens in every epoch of AI research. Whenever the gains of a paradigm slow, the rigor police reemerge in the vain hope that the objective paradox can be thwarted.

Sherjil Ozair (@sherjilozair)'s Twitter Profile Photo

Very happy to hear that GANs are getting the Test of Time award at NeurIPS 2024. The NeurIPS Test of Time awards are given to papers which have stood the test of time for a decade. I took some time to reminisce about how GANs came about and how AI has evolved in the last decade.

Jeff Clune (@jeffclune)'s Twitter Profile Photo

In my 2019 AI-GA paper I proposed a neural net world model as a "Darwin Complete" environment search space that could produce any possible environment for open-ended learning. It felt like a flight of fancy. I knew rationally it was possible eventually, but emotionally it felt