Minae Kwon (@minaekwon) 's Twitter Profile
Minae Kwon

@minaekwon

Working @AnthropicAI | PhD @StanfordAILab

ID: 1075589434031001600

Joined: 20-12-2018 03:11:19

114 Tweets

829 Followers

530 Following

Chris Cundy (@chriscundy) 's Twitter Profile Photo

Introducing *SequenceMatch*, training LLMs with an imitation learning loss Avoids compounding error in generation by: 1. Training against *different divergences* like χ^2 with more support OOD 2. Adding a *backspace* action: model can correct errors! arxiv.org/abs/2306.05426 1/7
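The backspace action described above could look like the following toy decoding loop (my own illustration of the idea, not code from the paper; the policy and token names are hypothetical):

```python
# Illustrative sketch: decoding with a special <bksp> action that deletes
# the previously generated token, so the model can undo a mistake instead
# of compounding it.
BACKSPACE = "<bksp>"

def decode_with_backspace(policy, max_steps=50):
    """policy(seq) returns the next action: a token, BACKSPACE, or <eos>."""
    seq = []
    for _ in range(max_steps):
        action = policy(seq)
        if action == "<eos>":
            break
        if action == BACKSPACE:
            if seq:
                seq.pop()  # undo the last token
        else:
            seq.append(action)
    return seq

# Toy policy: emits "a b c", then "corrects" c -> d via backspace.
script = iter(["a", "b", "c", BACKSPACE, "d", "<eos>"])
result = decode_with_backspace(lambda seq: next(script))
print(result)  # ['a', 'b', 'd']
```

The training objective in the paper (an imitation-learning loss against divergences such as χ²) is what teaches the policy when to emit the backspace; the loop above only shows how the action is interpreted at generation time.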

Tri Dao (@tri_dao) 's Twitter Profile Photo

Announcing FlashAttention-2! We released FlashAttention a year ago, making attn 2-4x faster; it is now widely used in most LLM libraries. Recently I’ve been working on the next version: 2x faster than v1, 5-9x vs standard attn, reaching 225 TFLOPs/s training speed on A100. 1/

Yuchen Cui (@yuchencui1) 's Twitter Profile Photo

We use gestures all the time for specifying targets! How can robots make sense of “gimme that one”? We propose GIRAF, a framework for interpreting human gesture instructions using LLMs. Paper to appear in Conference on Robot Learning: arxiv.org/abs/2309.02721 Website: tinyurl.com/giraf23

Priya Sundaresan (@priyasun_) 's Twitter Profile Photo

Hungry? Let our robot twirl your spaghetti for you! 🍝🤖 Introducing VAPORS: Visual Action Planning OveR Sequences, a framework for long-horizon food acquisition. Project Page: sites.google.com/view/vaporsbot Paper: arxiv.org/abs/2309.05197 To appear at Conference on Robot Learning 1/11🧵

Sang Michael Xie (@sangmichaelxie) 's Twitter Profile Photo

Releasing an open-source PyTorch implementation of DoReMi! github.com/sangmichaelxie… The pretraining data mixture is a secret sauce of LLM training. Optimizing your data mixture for robust learning with DoReMi can reduce training time by 2-3x. Train smarter, not longer!
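The core of DoReMi is a multiplicative-weights update on domain weights driven by excess loss (proxy-model loss minus reference-model loss). The sketch below is my own minimal rendering of that idea, assumed from the paper's description rather than taken from the linked repo; names and the learning rate are illustrative:

```python
# DoReMi-style domain reweighting sketch: domains where the proxy model
# lags the reference model (high excess loss) get upweighted.
import math

def update_domain_weights(weights, proxy_loss, ref_loss, lr=1.0):
    # Clip excess loss at zero, then apply an exponentiated-gradient step
    # and renormalize so the weights remain a distribution.
    excess = {d: max(proxy_loss[d] - ref_loss[d], 0.0) for d in weights}
    unnorm = {d: weights[d] * math.exp(lr * excess[d]) for d in weights}
    total = sum(unnorm.values())
    return {d: w / total for d, w in unnorm.items()}

w = {"web": 1 / 3, "code": 1 / 3, "books": 1 / 3}
proxy = {"web": 2.0, "code": 3.0, "books": 2.5}
ref = {"web": 1.9, "code": 2.2, "books": 2.4}
w = update_domain_weights(w, proxy, ref)
# "code" has the largest excess loss, so it receives the largest weight
```

The resulting weights are then used to resample the pretraining mixture for the full-size model, which is where the reported 2-3x reduction in training time comes from.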

shreya rajpal (@shreyar) 's Twitter Profile Photo

It's an absolute honor to be a guest on The TWIML AI Podcast! Sam Charrington and I cover everything under the sun in LLMOps, from hallucinations and RAG to LLM safety. Check out the podcast at the link below!

Foundation Models, LLMs, and Game Theory Workshop (@fm_llms_gt) 's Twitter Profile Photo

We are excited to announce the first workshop on Foundation Models, Large Language Models (LLMs), and Game Theory! The workshop will take place at the Center for Discrete Mathematics and Theoretical Computer Science (DIMACS) on October 19-20, 2023. dimacs.rutgers.edu/events/details…

Foundation Models, LLMs, and Game Theory Workshop (@fm_llms_gt) 's Twitter Profile Photo

We also call on researchers with ongoing research to submit posters to our workshop. The workshop will provide financial support to a limited number of researchers. Applications for financial support will remain open until September 23, 2023. docs.google.com/forms/d/e/1FAI…

Andy Shih (@andyshih_) 's Twitter Profile Photo

Excited about recent improvements to our NeurIPS Spotlight paper, now even faster with ⚡️⚡️multiprocessing⚡️⚡️! We now get 2x speedup on as low as 50-step DDIM, and 4x speedup on 200-step DDIM! The first version of our paper showed good results, but we wanted even better.

Alex Tamkin (@alextamkin) 's Twitter Profile Photo

Eliciting Human Preferences with Language Models Currently, people write detailed prompts to describe what they want a language model to do We explore *generative elicitation*—where models interactively ask for this information through open-ended conversation 1/
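The interactive loop described above can be sketched as follows (a hypothetical illustration of the setup; the function names and loop structure are my own, not from the paper):

```python
# Generative elicitation sketch: instead of the user writing one detailed
# prompt, the model asks questions and the answers accumulate into a
# task specification.
def elicit(ask_model, answer_fn, n_questions=3):
    """Collect (question, answer) pairs into a task specification."""
    spec = []
    for _ in range(n_questions):
        question = ask_model(spec)    # model generates the next question
        answer = answer_fn(question)  # user replies in free text
        spec.append((question, answer))
    return spec

# Toy stand-ins for the model and the user:
questions = iter(["What domain?", "Any formatting rules?", "Tone?"])
spec = elicit(lambda s: next(questions), lambda q: f"answer to: {q}")
```

In the real setting, `ask_model` would condition on the conversation so far and the final `spec` would be turned into a prompt or preference model for the downstream task.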

Hugh Zhang (@hughbzhang) 's Twitter Profile Photo

I also have no idea what Q* is, but given speculation that it’s a method of self-learning and Monte-Carlo Tree Search (MCTS) in language models, I thought I’d share some recent work on an adjacent idea.

Jesse Mu (@jayelmnop) 's Twitter Profile Photo

We’re hiring for the adversarial robustness team at Anthropic! As an Alignment subteam, we're making a big effort on red-teaming, test-time monitoring, and adversarial training. If you’re interested in these areas, let us know! (emails in 🧵)

Alex Tamkin (@alextamkin) 's Twitter Profile Photo

Made a short video exploring tool use and subagents! (w/ @aaron_begg and everett) Goal: Find the “quickest quicksort” implementation on GitHub by having a larger model orchestrate 100 subagent models Here’s how it works: 1/ x.com/AnthropicAI/st…

Dorsa Sadigh (@dorsasadigh) 's Twitter Profile Photo

At #ICRA24 we have a few papers on 𝗴𝗿𝗼𝘂𝗻𝗱𝗲𝗱 𝗿𝗲𝗮𝘀𝗼𝗻𝗶𝗻𝗴 of LLMs/VLMs. • Grounded common-sense reasoning via active perception - Minae Kwon's 🧵👇 • Physically grounding VLMs - Jensen Gao's 🧵👇 • Learning from online language corrections - @lihanzha's 🧵👇

Anthropic (@anthropicai) 's Twitter Profile Photo

Introducing Claude 3.5 Sonnet—our most intelligent model yet. This is the first release in our 3.5 model family. Sonnet now outperforms competitor models on key evaluations, at twice the speed of Claude 3 Opus and one-fifth the cost. Try it for free: claude.ai

Ethan Perez (@ethanjperez) 's Twitter Profile Photo

I’m taking applications for collaborators via ML Alignment & Theory Scholars! It’s a great way for new or experienced researchers outside AI safety research labs to work with me/others in these groups: Neel Nanda, Evan Hubinger, mrinank 🍂, Nina, Fabien Roger, Rylan Schaeffer, ...🧵

Anthropic (@anthropicai) 's Twitter Profile Photo

Introducing an upgraded Claude 3.5 Sonnet, and a new model, Claude 3.5 Haiku. We’re also introducing a new capability in beta: computer use. Developers can now direct Claude to use computers the way people do—by looking at a screen, moving a cursor, clicking, and typing text.

Anthropic (@anthropicai) 's Twitter Profile Photo

We’re publishing a new constitution for Claude. The constitution is a detailed description of our vision for Claude’s behavior and values. It’s written primarily for Claude, and used directly in our training process. anthropic.com/news/claude-ne…