@kernelcdub: Here's how we're achieving R1-like reasoning with small models leveraging probabilistic inference-time scaling w/out using DeepSeek or derivatives. Results are impressive! MATH w/ Llama 8B approaches GPT-4o, and w/ Qwen2.5 Math 7B Instruct hits o1 level. red-hat-ai-innovation-team.github.io/posts/r1-like-…

Chris Wright

@kernelcdub


Red Hat CTO. Tezos Foundation council member. Passion for open source SW innovation. Father and husband. Cyclist. Human.

ID: 583450546

18-05-2012 04:09:51

1.1K Tweets

6.6K Followers

265 Following

Chris Wright

@kernelcdub

9 months ago

Here's how we're achieving R1-like reasoning with small models leveraging probabilistic inference-time scaling w/out using DeepSeek or derivatives. Results are impressive! MATH w/ Llama 8B approaches GPT-4o, and w/ Qwen2.5 Math 7B Instruct hits o1 level. red-hat-ai-innovation-team.github.io/posts/r1-like-…

28 Likes

1 Reply

9 Retweets
