
Ξloise
@eloisechou
Formerly @d1ventures & @ChainNewscom (LeftOfCenter) | Ex @splitworks
ID: 313717396
09-06-2011 03:11:47
1.1K Tweets
1.1K Followers
5.5K Following



Train your own reasoning LLM using DeepSeek's GRPO algorithm with our free notebook! You'll transform Llama 3.1 (8B) to have chain-of-thought. Unsloth makes GRPO use 80% less VRAM.
Guide: docs.unsloth.ai/basics/reasoni…
GitHub: github.com/unslothai/unsl…
Colab: colab.research.google.com/github/unsloth…






