George Grigorev (@iamgrigorev)'s Twitter Profile
George Grigorev

@iamgrigorev

fine-tuning, evals at @togethercompute, rare specialty coffee lover

ID: 610717197

Joined: 17-06-2012 08:40:34

7.7K Tweets

1.1K Followers

854 Following

George Grigorev (@iamgrigorev)

it's interesting that 80% of people are kind of locked into 5 apps on their phone: maps, X, youtube, messenger, mail. I wonder if it would be possible to optimize for this form factor, the way we see the transformer architecture embedded onto a chip (Groq, SambaNova, Etched)

Andre Saraiva (@andresnds)

1/N Yesterday in Tokyo we @OpenAI ran a 10‑hour live Humans vs AI exhibition at the AtCoder World Tour Finals Heuristic. We pointed an OpenAI reasoning model at the same brutal problem the finalists tackled—no human help, same rules, same clock. Buckle up. 👇

George Grigorev (@iamgrigorev)

I'm sure there has been some significant progress in humanoid robots already (in China and in the US), but there's no leader in the field, and there is probably a lot of secrecy right now. When one such successful product appears on the market (aka a chatgpt moment), even if fully

Tilde (@tilderesearch)

Mixture‑of‑Experts (MoE) powers many frontier models like R1, K2, & Qwen3

⚡️ To make frontier-scale MoE models accessible to train, we open-source MoMoE, a hyper-performant MoE implementation built for training and inference, outpacing the fastest existing ones by up to:

- 70%
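
For readers new to MoE, a minimal top-k routed MoE layer looks roughly like the sketch below. This is a generic PyTorch illustration of the idea only, not MoMoE's actual kernels; all names and sizes are made up for the example.

import torch
import torch.nn as nn

class TinyMoE(nn.Module):
    """Toy top-k routed Mixture-of-Experts layer (illustration, not MoMoE)."""

    def __init__(self, d_model: int, num_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, num_experts)  # token -> expert scores
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(num_experts)
        ])

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: [tokens, d_model]
        # Pick the top_k experts per token and renormalize their weights.
        weights, idx = self.router(x).softmax(-1).topk(self.top_k, dim=-1)
        weights = weights / weights.sum(-1, keepdim=True)
        out = torch.zeros_like(x)
        # Each token is processed by only its top_k experts (sparse compute).
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e
                if mask.any():
                    out[mask] += weights[mask, k, None] * expert(x[mask])
        return out

moe = TinyMoE(d_model=64)
print(moe(torch.randn(10, 64)).shape)  # torch.Size([10, 64])
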
George Grigorev (@iamgrigorev)

Some interesting insights about BPE tokenization during inference – especially if you're trying to reuse training-time logic.
1. We have a pre-defined set of merges and we just want to apply them, in order, to a set of pre-tokens.
2. The sequence of pre-tokens is no longer represented
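
A minimal sketch of what inference-time BPE with a fixed merge table can look like. This is my own illustration of the general idea, not the specific implementation the thread discusses; the merge table is a toy example.

from typing import Dict, List, Tuple

# Hypothetical merge table: pair -> rank (lower rank = learned earlier = applied first).
MERGE_RANKS: Dict[Tuple[str, str], int] = {
    ("l", "o"): 0,
    ("lo", "w"): 1,
    ("e", "r"): 2,
}

def bpe_encode(pre_token: str, merge_ranks: Dict[Tuple[str, str], int]) -> List[str]:
    """Apply the pre-defined merges to one pre-token, best-ranked pair first."""
    symbols = list(pre_token)  # start from individual characters
    while len(symbols) > 1:
        # Find the adjacent pair with the lowest (highest-priority) rank.
        pairs = [(merge_ranks.get((a, b), float("inf")), i)
                 for i, (a, b) in enumerate(zip(symbols, symbols[1:]))]
        best_rank, best_i = min(pairs)
        if best_rank == float("inf"):  # no mergeable pair left
            break
        symbols[best_i:best_i + 2] = [symbols[best_i] + symbols[best_i + 1]]
    return symbols

print(bpe_encode("lower", MERGE_RANKS))  # ['low', 'er']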

Jiayi Weng (@trinkle23897)

Harmony format is finally open-sourced. I still remember 3 years ago (before the ChatGPT release) Shengjia Zhao, Daniel, and I were brainstorming about the right abstraction for RL training, and that was the starting point of the entire harmony library. github.com/openai/harmony
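
For context, a rough, hand-written illustration of what a harmony-formatted conversation looks like. This is reconstructed from memory of the public gpt-oss docs, not produced by the openai/harmony renderer itself; the exact special tokens and channel names may not match the library's output precisely.

# Hand-built example string (assumed token names; not the official renderer's output).
conversation = (
    "<|start|>system<|message|>You are a helpful assistant.<|end|>"
    "<|start|>user<|message|>What is 2 + 2?<|end|>"
    "<|start|>assistant<|channel|>analysis<|message|>Simple arithmetic.<|end|>"
    "<|start|>assistant<|channel|>final<|message|>4<|return|>"
)
print(conversation)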

George Grigorev (@iamgrigorev)

interesting way to make an open-source model safe - nerf the pre-training data, then check during RL training whether it can be pushed to answer unsafe prompts with unsafe data. If quality is still lower than the already-released o3 - you're good to go.

It's also a cool marketing trick -- they say that even after
George Grigorev (@iamgrigorev)

OpenAI doesn't disclose the amount of data used or the number of GPUs, but we can estimate! We know that they used 2.1M H100 hours.
Considering that sama said they re-trained gpt-oss at least once since it didn't meet their needs, I would expect the training run took 15-30
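
A quick back-of-the-envelope check on that estimate, assuming the truncated "15-30" refers to days of wall-clock time. The cluster sizes below are my own guesses; only the 2.1M H100-hours figure comes from the tweet.

# Back-of-the-envelope: wall-clock days for 2.1M H100-hours at assumed cluster sizes.
TOTAL_H100_HOURS = 2.1e6

for num_gpus in (3_000, 4_000, 6_000):  # hypothetical cluster sizes
    wall_clock_days = TOTAL_H100_HOURS / num_gpus / 24
    print(f"{num_gpus:>5} H100s -> ~{wall_clock_days:.0f} days of wall-clock training")

Under those assumptions, a 15-30 day run corresponds to a cluster of very roughly 3,000-6,000 H100s.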