Joan Cabezas
@josancamon19
Co-founder cifrato.ai (YC W25) | prev built @omedotme, tripplanner.ai
ID: 354608817
14-08-2011 00:52:33
179 Tweets
203 Followers
240 Following
next steps to get this right: 1. explore more complex RL behaviors (e.g. tool calling), ditch gsm8k. 2. qwen has contamination issues, use Gemma 1B, 4B, 12B, 27B -pt instead. 3. use David Hall's marin checkpoints at 8B to fit task X% = a*C-pt + b*C-RL (a, b being the optimal ratios to %).
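A minimal sketch of how the linear form in step 3 could be fit, assuming C-pt and C-RL stand for pretraining and RL compute (my reading, not stated in the tweet). The function name `fit_compute_ratios` is hypothetical, and the numbers are synthetic placeholders generated from known coefficients, not measurements from any checkpoint:

```python
import numpy as np

def fit_compute_ratios(c_pt, c_rl, task_score):
    """Least-squares fit of task_score ~= a * c_pt + b * c_rl (no intercept)."""
    # One column per compute term, one row per checkpoint.
    X = np.column_stack([c_pt, c_rl])
    coeffs, *_ = np.linalg.lstsq(X, task_score, rcond=None)
    a, b = coeffs
    return float(a), float(b)

# Synthetic placeholder data: scores are built from chosen a=3, b=5,
# so the fit should simply recover those values.
c_pt = np.array([1.0, 2.0, 4.0, 8.0])   # hypothetical pretraining compute
c_rl = np.array([0.5, 0.5, 1.0, 2.0])   # hypothetical RL compute
task_score = 3.0 * c_pt + 5.0 * c_rl

a, b = fit_compute_ratios(c_pt, c_rl, task_score)
print(a, b)
```

With real checkpoints the relationship may well not be linear, so the residuals from `lstsq` would be worth checking before trusting a and b.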
interpretability folks should spend some time hiring cracked frontend engineers. the tools available look like software from the 90s. what if it just looked like an interactive fMRI? so many ways to make this cool, cc Neel Nanda
KGGen now has a way to visually navigate generated knowledge graphs, with Stanford Trustworthy AI Research (STAIR) Lab