Ilia Karmanov (@ikdeepl)'s Twitter Profile
Ilia Karmanov

@ikdeepl

Be nice to animals.
Research Scientist @nvidia

ID: 933009891387637760

Website: https://ilkarman.github.io/ · Joined: 21-11-2017 16:31:07

1.1K Tweets

305 Followers

808 Following

Philipp Schmid (@_philschmid)

SFT Memorizes, RL Generalizes. New Paper from Google DeepMind shows that Reinforcement Learning generalizes to cross-domain, rule-based tasks, while SFT primarily memorizes the training rule. 👀

Experiments
1️⃣ Model & Tasks: Llama-3.2-Vision-11B;
Tanishq Mathew Abraham, Ph.D. (@iscienceluvr)

Language Models Use Trigonometry to Do Addition

"We first discover that numbers are represented in these LLMs as a generalized helix, which is strongly causally implicated for the tasks of addition and subtraction, and is also causally relevant for integer division,
Jeff Dean (@jeffdean)

Delighted to be a minor co-author on this work, led by Pranav Nair: Combining losses for different Matryoshka-nested groups of bits in each weight within a neural network leads to an accuracy improvement for models, especially for low-bit-precision levels (e.g. 2-bit
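A rough sketch of what "combining losses for nested groups of bits" could look like (my own illustration, not the paper's method): the 8-bit code of each weight also serves, via its top bits, as the 4-bit and 2-bit codes, and one loss term per precision is summed. A simple weight-reconstruction MSE stands in here for the real end-to-end task loss.

```python
import torch

def quantize_u8(w: torch.Tensor, scale: float, zero: float) -> torch.Tensor:
    """Map weights to 8-bit codes in [0, 255]."""
    return torch.clamp(((w - zero) / scale).round(), 0, 255)

def keep_top_bits(q8: torch.Tensor, bits: int) -> torch.Tensor:
    """Keep only the top `bits` bits of each 8-bit code (drop the low bits)."""
    step = 2 ** (8 - bits)
    return torch.div(q8, step, rounding_mode="floor") * step

def nested_bit_loss(w: torch.Tensor, scale: float, zero: float) -> torch.Tensor:
    """Sum one reconstruction term per nested precision (8-, 4-, 2-bit)."""
    q8 = quantize_u8(w, scale, zero)
    loss = torch.zeros(())
    for bits in (8, 4, 2):  # each lower precision is a prefix of the same code
        w_hat = keep_top_bits(q8, bits) * scale + zero
        loss = loss + torch.mean((w - w_hat) ** 2)
    return loss

w = torch.randn(1024)
print(nested_bit_loss(w, scale=6.0 / 255, zero=-3.0))
```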

Yash Bhalgat (@ysbhalgat)

After the FineWeb blog post, Hugging Face 🤗 has dropped another must-read: The Ultra-Scale Playbook – Training LLMs on GPU Clusters.

They ran 4000+ experiments across 512 GPUs to break down the real challenges of scaling LLM training -- memory bottlenecks, compute efficiency,
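For context on the memory-bottleneck part, a back-of-the-envelope sketch (my own numbers, not taken from the playbook): naive mixed-precision Adam training keeps roughly 16 bytes per parameter before activations, which is why sharding, tensor/pipeline parallelism, and activation recomputation matter.

```python
# Naive per-GPU memory for mixed-precision Adam training, before any
# sharding (ZeRO/FSDP) or activation checkpointing.

def training_memory_gb(n_params_b: float) -> dict:
    """n_params_b: parameter count in billions."""
    n = n_params_b * 1e9
    bytes_per_param = {
        "bf16 weights": 2,
        "bf16 gradients": 2,
        "fp32 master weights": 4,
        "fp32 Adam momentum": 4,
        "fp32 Adam variance": 4,
    }
    return {k: n * b / 2**30 for k, b in bytes_per_param.items()}

breakdown = training_memory_gb(8)  # e.g. an 8B-parameter model
print(breakdown)
print("total GB (excluding activations):", round(sum(breakdown.values()), 1))
# ~16 bytes/param -> ~119 GB for 8B params, already more than one 80 GB GPU.
```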
Angry Tom (@angrytomtweets)

RIP Sora. Alibaba just dropped Wan 2.1, and it's absolutely insane. This is the next evolution of AI video generation. Here are 10 mind-blowing features and examples: 1. A ferret entering water 🔊

swissinfo.ch (@swissinfo_en)

#Switzerland might be one of the worst places to be a working #woman. On #InternationalWomensDay, we look at a report by The Economist comparing working conditions for women across OECD countries. Read more about the #genderpaygap in Switzerland here 👉 buff.ly/L1X5G3k

Kevin Meng (@mengk20)

AI models are *not* solving problems the way we think

using Docent, we find that Claude solves *broken* eval tasks - memorizing answers & hallucinating them!

details in 🧵

we really need to look at our data harder, and it's time to rethink how we do evals...
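One way to picture the kind of check involved (a hedged illustration, not Docent itself): strip the context an eval item supposedly requires and see whether the model still answers correctly; `ask_model` below is a hypothetical placeholder for the actual inference call.

```python
# Flag eval items a model may be answering from memory rather than from the task.

def ask_model(prompt: str) -> str:
    raise NotImplementedError("plug in your model/API call here")

def flag_possible_memorization(tasks: list[dict]) -> list[dict]:
    """tasks: dicts with 'question', 'context' (needed to solve it), and 'answer'."""
    flagged = []
    for t in tasks:
        # Ask the question WITHOUT the context that should be required to solve it.
        answer_without_context = ask_model(t["question"])
        if answer_without_context.strip() == t["answer"].strip():
            # Correct despite missing information: likely a broken task or a memorized answer.
            flagged.append(t)
    return flagged
```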
Drew Pavlou 🇦🇺🇺🇦🇹🇼 (@drewpavlou)

The French overseas territory St. Pierre et Miquelon (population 5,800) now has the highest tariff rates in the world at 99%

Their exports are valued at just $3.5 million dollars a year. My guess as to what happened here is that they likely export a tiny amount (like $100 k
National Wildlife Federation (@nwf)

A Sunset to Remember ☀️🌊

“The paddleboarder portrays the peaceful coexistence of people and wildlife,” as captured at sunset in August 2020.

🥇 People in Nature | 📷 Renee Capozzola

2024 National Wildlife Photo Contest Winners 📲: ow.ly/k1FK50UAyEr
Demis Hassabis (@demishassabis)

Official results are in - Gemini achieved gold-medal level in the International Mathematical Olympiad! 🏆 An advanced version was able to solve 5 out of 6 problems. Incredible progress - huge congrats to Thang Luong and the team! deepmind.google/discover/blog/…

Grant Sanderson (@3blue1brown)

New video on the details of diffusion models: youtu.be/iv-5mZ_9CPY Produced by Welch Labs, this is the first in a small series on 3b1b this summer. I enjoyed providing editorial feedback throughout the last several months, and couldn't be happier with the result.

Syeda Nahida Akter (@snat02792153)

Most LLMs learn to think only after pretraining—via SFT or RL. But what if they could learn to think during it? 🤔

Introducing RLP: Reinforcement Learning Pre-training—a verifier-free objective that teaches models to “think before predicting.”

🔥 Result: Massive reasoning
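A hedged sketch of how a verifier-free "think before predicting" objective could be scored (my reading of the tweet, not the paper's implementation): sample a thought, reward it by how much it improves the log-likelihood of the actual next tokens over a no-thought baseline, and update the thought tokens with a REINFORCE-style term.

```python
import torch

def thought_reward(logp_next_given_thought: torch.Tensor,
                   logp_next_baseline: torch.Tensor) -> torch.Tensor:
    """Per-position reward = log p(next | ctx, thought) - log p(next | ctx)."""
    return logp_next_given_thought - logp_next_baseline

def reinforce_loss(logp_thought_tokens: torch.Tensor, reward: torch.Tensor) -> torch.Tensor:
    """REINFORCE-style surrogate: increase the probability of thoughts with positive reward."""
    return -(reward.detach().mean() * logp_thought_tokens.sum())

# Toy numbers: the sampled thought made the true continuation more likely.
logp_with_thought = torch.tensor([-1.2, -0.8, -0.5])
logp_without = torch.tensor([-2.0, -1.5, -1.1])
logp_thought = torch.tensor([-0.7, -0.9, -1.1], requires_grad=True)

loss = reinforce_loss(logp_thought, thought_reward(logp_with_thought, logp_without))
loss.backward()
print(loss.item(), logp_thought.grad)
```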