Richard Pang (@yzpang_) 's Twitter Profile
Richard Pang

@yzpang_

yzpang.me; @AIatMeta Llama research, prev: NYU, Meta FAIR, @uchicago, @googleai; research: llm, text gen, alignment, reasoning, human-lm collab, etc.

ID: 919679161903534081

Link: http://yzpang.me · Joined: 15-10-2017 21:39:33

96 Tweets

477 Followers

390 Following

Sumit (@_reachsumit) 's Twitter Profile Photo

Transformers Struggle to Learn to Search

Demonstrates that transformers can be taught to perform graph search tasks but increasingly struggle on larger graphs, and that this difficulty is not resolved by increased model scale. 📝 arxiv.org/abs/2412.04703

elvis (@omarsar0) 's Twitter Profile Photo

Transformers Struggle to Learn to Search

Finds that transformer-based LLMs struggle to perform search robustly.

Suggests that given the right training distribution, the transformer can learn to search.

Also reports that performing search in-context exploration (i.e.,

Mehran Kazemi (@kazemi_sm) 's Twitter Profile Photo

Search is a core operation for reasoning in large language models. Check out our new work, where we dive deep into the ability of transformer models to learn to search.

Nitish Joshi (@nitishjoshi23) 's Twitter Profile Photo

New work where we show that, with the right training distribution, transformers can learn to search and internally implement an exponential path-merging algorithm. But they struggle to learn to search as the graph size increases, and simple solutions like scaling don't resolve it.
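
The path-merging idea mentioned here can be illustrated with a toy reachability computation: each vertex keeps a set of vertices it can reach, and each merge round unions in the reachable sets of its successors, doubling the covered path length per round (hence "exponential"). This is a minimal sketch of the general idea; the function name, the graph, and the exact formulation are our own illustration, not the paper's implementation:

```python
# Toy sketch of "path merging" for reachability in a directed graph:
# each vertex stores the set of vertices it can currently reach, and
# one merge round unions in the reachable sets of its successors.
# After k rounds, paths of length up to 2**k are covered.

def reachable_sets(edges, num_vertices, rounds):
    reach = {v: set() for v in range(num_vertices)}
    for u, v in edges:
        reach[u].add(v)  # length-1 paths
    for _ in range(rounds):
        new_reach = {v: set(s) for v, s in reach.items()}
        for v in range(num_vertices):
            for w in reach[v]:
                new_reach[v] |= reach[w]  # merge successor's reachable set
        reach = new_reach
    return reach

# Chain 0 -> 1 -> 2 -> 3 -> 4: after 2 merge rounds (covering paths of
# length up to 4), vertex 0 already reaches every other vertex.
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
print(sorted(reachable_sets(edges, 5, 2)[0]))
```

Each round squares the reach of the previous one, which is why only logarithmically many rounds (layers) are needed relative to the longest path.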

AI at Meta (@aiatmeta) 's Twitter Profile Photo

Today is the start of a new era of natively multimodal AI innovation.

Today, we’re introducing the first Llama 4 models: Llama 4 Scout and Llama 4 Maverick — our most advanced models yet and the best in their class for multimodality.

Llama 4 Scout
• 17B-active-parameter model

Ahmad Al-Dahle (@ahmad_al_dahle) 's Twitter Profile Photo

We're glad to start getting Llama 4 in all your hands. We're already hearing lots of great results people are getting with these models. That said, we're also hearing some reports of mixed quality across different services. Since we dropped the models as soon as they were

Archiki Prasad (@archikiprasad) 's Twitter Profile Photo

🎉 Excited to share that my internship work, ScPO, on self-training LLMs to improve reasoning without human labels, has been accepted to #ICML2025! Many thanks to my awesome collaborators at AI at Meta and @uncnlp🌞Looking forward to presenting ScPO in Vancouver 🇨🇦

Jason Weston (@jaseweston) 's Twitter Profile Photo

We worked on a whole line of research on this:
- Self-Rewarding LMs (use self as a Judge in semi-online DPO): arxiv.org/abs/2401.10020
- Thinking LLMs (learn CoTs with a Judge with semi-online DPO): arxiv.org/abs/2410.10630 *poster at ICML this week!!*
- Mix verifiable &

Archiki Prasad (@archikiprasad) 's Twitter Profile Photo

I’ll be at #ICML2025 this week to present ScPO:
📌 Wednesday, July 16th, 11:00 AM-1:30 PM
📍 East Exhibition Hall A-B, E-2404
Stop by or reach out to chat about improving reasoning in LLMs, self-training, or just tips about being on the job market next cycle! 😃

Richard Pang (@yzpang_) 's Twitter Profile Photo

🚨Prompt Curriculum Learning (PCL)
- Efficient LLM RL training algorithm!
- We investigate factors that affect convergence: batch size, number of prompts, number of generations, prompt selection
- We propose PCL: a lightweight algorithm that *dynamically selects intermediate-difficulty prompts* using a learned value model

Joongwon Kim (@danieljwkim) 's Twitter Profile Photo

Excited to share Prompt Curriculum Learning (PCL) from AI at Meta - we improve performance-efficiency tradeoffs for reasoning RL by predicting prompt difficulty with a value model updated on-policy, and selecting intermediate-difficulty prompts that yield high effective ratios.
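
The selection rule described in these posts can be sketched in a few lines: a value model predicts the policy's success probability on each candidate prompt, and training keeps the prompts whose predicted pass rate is closest to 0.5, i.e., intermediate difficulty. The function name and the scoring rule below are illustrative assumptions about the general idea, not the paper's actual implementation:

```python
# Illustrative sketch of intermediate-difficulty prompt selection:
# rank prompts by how close their predicted success probability is
# to 0.5, and keep the closest ones for the next RL training batch.

def select_intermediate(prompts, predicted_success, batch_size):
    ranked = sorted(zip(prompts, predicted_success),
                    key=lambda pair: abs(pair[1] - 0.5))
    return [prompt for prompt, _ in ranked[:batch_size]]

prompts = ["easy", "medium", "hard", "impossible"]
predicted = [0.95, 0.55, 0.40, 0.02]  # hypothetical value-model outputs
print(select_intermediate(prompts, predicted, 2))
```

The intuition is that prompts the policy always solves or never solves produce near-zero-variance reward signals, so intermediate-difficulty prompts yield the most useful gradient per generation.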

Nicholas Lourie (@nicklourie) 's Twitter Profile Photo

LLMs are expensive—experiments cost a lot, mistakes even more. How do you make experiments cheap and reliable? By using hyperparameters' empirical structure. Kyunghyun Cho, He He, and I show you how in Hyperparameter Loss Surfaces Are Simple Near their Optima at #COLM2025! 🧵1/9
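
The "simple near their optima" claim suggests that, close to the best setting, loss as a function of a hyperparameter is well approximated by a simple low-order surface such as a quadratic. A toy illustration of that kind of fit, using synthetic data rather than anything from the paper:

```python
import numpy as np

# Synthetic loss measurements near an optimal learning rate (log scale);
# the true curve is a parabola with its minimum at log_lr = -3.0.
log_lr = np.linspace(-4.0, -2.0, 9)
loss = 0.8 * (log_lr + 3.0) ** 2 + 1.5

coeffs = np.polyfit(log_lr, loss, 2)    # fit a quadratic a*x^2 + b*x + c
est_opt = -coeffs[1] / (2 * coeffs[0])  # vertex of the fitted parabola
print(round(est_opt, 3))
```

With a fitted local surface like this, a handful of cheap measurements can locate the optimum instead of a dense (and expensive) grid search.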