DCSC91
@dcsc_91
reward function author
ID: 117537381
25-02-2010 21:33:42
3.3K Tweets
1.1K Followers
1.1K Following
You can now train Vision LLMs with Reinforcement Learning in our free notebook!
Unsloth VLM RL via GRPO: 1.5× faster, 90% less VRAM, 15× longer context & no accuracy loss.
Guide: docs.unsloth.ai/new/vision-rei…
GitHub: github.com/unslothai/unsl…
Qwen2.5-VL Colab: colab.research.google.com/github/unsloth…