Maggie Huan (@maggie_h2024) Twitter Tweets • TwiCopy

Mikayel Samvelyan

Introducing 🌈 Rainbow Teaming, a new method for generating diverse adversarial prompts for LLMs via LLMs. It's a versatile tool 🛠️ for diagnosing model vulnerabilities across domains and creating data to enhance robustness & safety 🦺. Co-lead w/ Sharath Raparthy & Andrei Lupu

Likes: 179 · Replies: 5 · Retweets: 44

Aviral Kumar

@aviral_kumar2

a year ago

How can we train LLM agents to learn from their own experience autonomously? Introducing ArCHer, a simple (i.e., a small change on top of standard RLHF) and effective way of doing so with multi-turn RL 🧵⬇️ Paper: arxiv.org/abs/2402.19446 Website: yifeizhou02.github.io/archer.io/

Likes: 193 · Replies: 2 · Retweets: 40

Google DeepMind

@googledeepmind

a year ago

Introducing SIMA: the first generalist AI agent to follow natural-language instructions in a broad range of 3D virtual environments and video games. 🕹️ It can complete tasks similar to a human, and outperforms an agent trained in just one setting. 🧵 dpmd.ai/3TiYV7d

Likes: 3.3K · Replies: 207 · Retweets: 833

Corey Lynch

@coreylynch

a year ago

We are now having full conversations with Figure 01, thanks to our partnership with OpenAI. Our robot can: - describe its visual experience - plan future actions - reflect on its memory - explain its reasoning verbally Technical deep-dive 🧵:

Likes: 2.2K · Replies: 144 · Retweets: 671

Anca Dragan

@ancadianadragan

a year ago

So excited and so very humbled to be stepping in to head AI Safety and Alignment at Google DeepMind. Lots of work ahead, both for present-day issues and for extreme risks in anticipation of capabilities advancing.

Likes: 583 · Replies: 31 · Retweets: 38

Joelle Pineau

@jpineau1

a year ago

I'm strongly supportive of this letter and its core message. We need a nuanced approach to the risks and benefits of AI, and more transparency is key to enable a wide group of stakeholders to join the conversation!

Likes: 33 · Replies: 0 · Retweets: 7

Google DeepMind

@googledeepmind

a year ago

Soccer players have to master a range of dynamic skills, from turning and kicking to chasing a ball. How could robots do the same? ⚽ We trained our AI agents to demonstrate a range of agile behaviors using reinforcement learning. Here’s how. 🧵 dpmd.ai/3vUlgjC

Likes: 2.2K · Replies: 128 · Retweets: 512

Jason Ma

@jasonma2020

a year ago

Introducing DrEureka🎓, our latest effort pushing the frontier of robot learning using LLMs! DrEureka uses LLMs to automatically design reward functions and tune physics parameters to enable sim-to-real robot learning. DrEureka can propose effective sim-to-real configurations

Likes: 594 · Replies: 25 · Retweets: 115

Maggie Huan

@maggie_h2024

a year ago

super exciting work! Congrats Bernal!

Likes: 4 · Replies: 0 · Retweets: 0

Aran Komatsuzaki

@arankomatsuzaki

a year ago

Microsoft presents Self-Exploring Language Models: Active Preference Elicitation for Online Alignment. SELM significantly boosts performance on instruction-following benchmarks such as MT-Bench and AlpacaEval 2.0. repo: github.com/shenao-zhang/S… abs: arxiv.org/abs/2405.19332

Likes: 228 · Replies: 3 · Retweets: 49

Lilian Weng

@lilianweng

a year ago

Rule-based rewards (RBRs) use a model to provide RL signals based on a set of safety rubrics, making it easier to adapt to changing safety policies w/o heavy dependency on human data. It also enables us to look at safety and capability in a more unified lens as a more capable

Likes: 311 · Replies: 13 · Retweets: 44

Peter Stone

@peterstone_tx

a year ago

10 years after DQN, what are deep RL’s impacts on robotics? Which robotic problems have seen the most thrilling real-world successes thanks to DRL? Where do we still need to push the boundaries, and how? Our latest survey explores these questions! Read on for more details. 👇

Likes: 512 · Replies: 2 · Retweets: 99

Furong Huang

@furongh

8 months ago

As I reflect on my journey as a faculty member over the past 7 years, I am overwhelmed with pride and gratitude. What started as a single-student-single-PI lab has blossomed into a vibrant group of almost 20 brilliant PhD students, along with numerous masters and undergraduate

Likes: 128 · Replies: 0 · Retweets: 10

Stanford NLP Group

@stanfordnlp

4 months ago

For this week’s NLP Seminar, we are thrilled to host Nicholas Tomlin to talk about Reasoning with Language Models! When: 4/17 Thurs 11am PT Non-Stanford affiliates registration form: forms.gle/cxRmN3oovz8w7a…

Likes: 125 · Replies: 0 · Retweets: 14

Nicholas Tomlin

@nickatomlin

3 months ago

The long-term goal of AI is to build models that can handle arbitrary tasks, not just ones they’ve been trained on. We hope our new *benchmark generator* can help measure progress toward this vision

Likes: 181 · Replies: 4 · Retweets: 30

Ge Zhang

@gezhang86038849

3 months ago

[1/n] 🚀 Thrilled to unveil our latest breakthrough: AttentionInfluence! A groundbreaking, training-free, zero-supervision approach for selecting reasoning-rich pretraining data—just by masking attention heads! ✨ No labels. No retraining. A mere pretrained 1.3B-parameter model

Likes: 234 · Replies: 8 · Retweets: 48

Maggie Huan

@maggie_h2024

a month ago

Learned a lot while working with Xiang, Yuetai, Tuney, Xiaoyu and all my collaborators. It wasn’t easy for me, but I'm super glad that all my collaborators helped me out throughout. It’s cool to experience the magic of RL tuning on LLMs, and there is so much more to explore!

Likes: 3 · Replies: 0 · Retweets: 0

Maggie Huan

@maggie_h2024

25 days ago

Absolutely one of my favorite posts recently! Highly recommended

Likes: 3 · Replies: 0 · Retweets: 0
