Jakub Macina (@dmacjam)'s Twitter Profile
Jakub Macina

@dmacjam

AI/ML Scientist, mountain biker

ID: 1557692280

Link: http://macina.sk
Joined: 30-06-2013 10:06:20

101 Tweets

204 Followers

497 Following

AI alignment for tutoring 🎓 We use full online RL with conversation-level rewards, not just single-turn signals as in DPO. Did the student actually learn by the end? Using GRPO, the model learns real teaching strategies, such as when to hint and when to correct. Explore the models below ⤵️
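
The tweet does not show any training code. As a rough illustration of the idea it describes, here is a minimal sketch of the group-relative advantage step usually associated with GRPO, applied to one reward per whole conversation. The function name, the binary "student solved it by the end" reward, and the use of the population standard deviation are assumptions made for this example, not details from the authors' work.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize conversation-level rewards within one sampled group.

    Each entry in `rewards` scores an entire tutoring conversation
    sampled for the same starting problem (hypothetical setup).
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumption: population std; implementations vary
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards: 1.0 if the (simulated) student solved the problem
# by the end of the conversation, 0.0 otherwise.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
```

Conversations that ended with the student succeeding get a positive advantage and the others a negative one, so the credit is assigned to the whole dialogue and not to any single turn.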