Trelis Research (@trelisresearch)'s Twitter Profile
Trelis Research

@trelisresearch

👷Work for Trelis: trelis.com/developer-coll…
🎥 Watch on YouTube: youtube.com/@trelisresearch
💡 Book a Consultation: forms.gle/2VXzrB

ID: 1667096163902685187

Link: https://trelis.com
Joined: 09-06-2023 09:08:58

1.1K Tweets

1.1K Followers

433 Following

+ GRPO is Poor and for the GPU-Rich +
-------------------------------

*A specific GRPO vs SFT video will be out next week, but I'm putting initial results here*

I trained Llama 3.2 1B on GSM8K with:
1. SFT
2. ORPO
3. GRPO

For SFT and ORPO, I generated training data using Llama