Grad (@grad62304977) Twitter Tweets • TwiCopy

Grad

@grad62304977


ID: 1313209072460627976

05-10-2020 20:07:03

2.2K Tweets

3.3K Followers

1.1K Following

wh

@nrehiew_

24 days ago

Let's talk about the GLM 4.5 models. The latest frontier open weights model out of China (and possibly the best at the moment?) with quite a bit of detail in the paper.

770 likes · 9 replies · 72 reposts

⚡FP8 makes RL faster, but at the cost of performance. We present FlashRL, the first open-source & working RL recipe that applies INT8/FP8 for rollout without losing performance compared to BF16! 📝 Blog:

548 likes · 11 replies · 84 reposts

Mika Senghaas

@mikasenghaas

24 days ago

moving from vllm v0 to v1 made our async rl training crash! read how we fixed it. we recently migrated from v0 to v1 as part of a larger refactor of prime-rl to make it easier to use, more performant and naturally async. we confirmed correct training dynamics on many

268 likes · 7 replies · 33 reposts

Prime Intellect

@primeintellect

8 days ago

Introducing the Environments Hub. RL environments are the key bottleneck to the next wave of AI progress, but big labs are locking them down. We built a community platform for crowdsourcing open environments, so anyone can contribute to open-source AGI.