echo.hive (@hive_echo)'s Twitter Profile
echo.hive

@hive_echo

🟣1000x Cursor course: tinyurl.com/LearnCursor 🟢 I learn and share my knowledge: echohive.live 🔴 Open Source: github.com/echohive42

ID: 1553828550909648896

Link: https://www.echohive.live · Joined: 31-07-2022 19:43:25

9.9K Tweets

10.1K Followers

722 Following

Rohan Paul (@rohanpaul_ai):

New NVIDIA paper makes models think before predicting, training this behavior during pretraining for stronger reasoning.

The novelty is that it makes base models practice reasoning during pretraining, not just after.

The reward needs no verifier and appears at every token, so
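
For intuition, here is a minimal sketch of one way a verifier-free, per-token reward like the one described above could be computed; the specifics (a hidden "thought" and a log-probability gain on the true next token) are my assumptions for illustration, not necessarily the paper's exact formulation.

```python
# Hypothetical sketch (not the paper's exact algorithm): a verifier-free,
# per-token reward defined as the gain in next-token log-probability when
# the model is allowed to emit a hidden "thought" before predicting.
import math

def per_token_reward(logp_with_thought: float, logp_without_thought: float) -> float:
    """Reward at one position: how much the generated thought helped the
    model assign probability to the true next token. No external verifier
    is needed -- the ground-truth next token from the pretraining corpus
    is the only supervision, so the reward exists at every position."""
    return logp_with_thought - logp_without_thought

# Toy example: the thought raises the true token's probability from 0.20 to 0.35.
r = per_token_reward(math.log(0.35), math.log(0.20))
print(f"reward = {r:.3f}")  # positive => reinforce this thought
```
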
Rohan Paul (@rohanpaul_ai):

A beautiful paper from MIT + Harvard + Google DeepMind 👏

Explains why Transformers miss multi-digit multiplication and shows a simple bias that fixes it.

The researchers trained two small Transformer models on 4-digit-by-4-digit multiplication.

One used a special training method
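
For context, a minimal sketch of what a 4-digit-by-4-digit multiplication training set can look like; the exact prompt/answer text format is an assumption for illustration, not taken from the paper.

```python
# Minimal sketch (format is an assumption, not necessarily the paper's):
# generate 4-digit-by-4-digit multiplication examples as plain text pairs
# for training a small decoder-only Transformer.
import random

def make_example(rng: random.Random) -> tuple[str, str]:
    a = rng.randint(1000, 9999)
    b = rng.randint(1000, 9999)
    prompt = f"{a}*{b}="
    answer = str(a * b)
    return prompt, answer

rng = random.Random(0)
dataset = [make_example(rng) for _ in range(100_000)]
print(dataset[0])  # (prompt, answer) text pair
```
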
Bartosz Naskręcki (@nasqret):

GPT-5-Pro solved, in just 15 minutes (without any internet search), the presentation problem known as “Yu Tsumura’s 554th Problem.”

arxiv.org/pdf/2508.03685

This is the first model to solve this task completely. I expect more such results soon — the model demonstrates a strong
echo.hive (@hive_echo):

I started using gpt-5 “non thinking” more and more when studying. Non thinking is quite pleasant when explaining things, unlike the thinking version. Grok 4 fast is still amazing, but gpt-5 non thinking is way faster, executes code faster, and does better visualizations. It

echo.hive (@hive_echo):

This new cheetah model is very fast and very good too! And it is $10 per million output tokens, so it can't exactly be a super small model, right? Inference-time active-parameter optimizations must have reached wild levels, no?
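
A quick back-of-the-envelope helper for the quoted price: only the $10 per 1M output tokens figure comes from the post above; the per-request token counts below are illustrative assumptions.

```python
# Back-of-the-envelope cost at the quoted $10 per 1M output tokens.
# Token counts below are illustrative assumptions, not measurements.
PRICE_PER_MILLION_OUTPUT = 10.00  # USD, as quoted in the post

def output_cost(output_tokens: int) -> float:
    """Cost of a response that produces the given number of output tokens."""
    return output_tokens / 1_000_000 * PRICE_PER_MILLION_OUTPUT

for tokens in (2_000, 20_000, 200_000):
    print(f"{tokens:>7} output tokens -> ${output_cost(tokens):.2f}")
```
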

echo.hive (@hive_echo):

Really loving these fast coding models. They totally put you in flow. But then you run into their limits very quickly. If these models get 10x better and even faster, coding will be an absolutely incredible experience!

echo.hive (@hive_echo):

Which is the better coding model all things considered (speed, quality, etc.), not just quality? Please answer only if you have tried them all and are taking a cumulative point of view, not just quality.

François Chollet (@fchollet):

You can teach a Transformer to execute a simple algorithm if you provide the exact step by step algorithm during training via CoT tokens. This is interesting, but the point of machine learning should be to *find* the algorithm during training, from input/output pairs only -- not
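
To make the contrast concrete, here is a toy illustration (my own format, using addition with carrying as a stand-in algorithm, not an example from the post) of the two training setups: step-by-step CoT supervision versus input/output pairs only.

```python
# Illustrative contrast (toy format): the same task, addition with carrying,
# presented two ways to a Transformer during training.
a, b = 47, 85

# 1) Step-by-step CoT supervision: the exact algorithm is spelled out,
#    so the model mainly learns to reproduce the given procedure.
cot_example = (
    f"{a}+{b}=",
    "ones: 7+5=12, write 2 carry 1; tens: 4+8+1=13, write 13; answer 132",
)

# 2) Input/output pairs only: the model must discover the carrying
#    algorithm itself from many (input, output) examples.
io_example = (f"{a}+{b}=", str(a + b))

print(cot_example)
print(io_example)
```
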

Rohan Paul (@rohanpaul_ai):

🫡 GPT-5-Pro just solved the math problem that no other LLM could solve. Took 14 minutes without any internet search.

An Oxford and Cambridge paper claimed that no LLM could solve ‘Yu Tsumura’s 554th Problem’.

OpenAI's GPT‑5 Pro produced a full proof in about 14 minutes.
echo.hive (@hive_echo):

I am fairly certain that math takes that sharp turn into the land of deep abstractions starting with first-order linear differential equations. This was the first time that I had to rewatch an entire chapter. Hey Grok, would you agree?
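
For reference, the standard integrating-factor solution of a first-order linear ODE, the topic mentioned above (textbook material, not from the post):

```latex
% General first-order linear ODE and its integrating-factor solution.
\[
  y' + p(x)\,y = q(x), \qquad \mu(x) = e^{\int p(x)\,dx}
\]
\[
  \frac{d}{dx}\bigl(\mu(x)\,y\bigr) = \mu(x)\,q(x)
  \;\Longrightarrow\;
  y(x) = \frac{1}{\mu(x)}\left(\int \mu(x)\,q(x)\,dx + C\right)
\]
% Concrete instance: y' + y = x gives mu(x) = e^x and y = x - 1 + C e^{-x}.
```
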

Gal Zajc (@zajcgal):

wow! quite impressive. a small channel uploading some insane builds for 15 years already youtube.com/watch?v=ubq5yV…

Haider. (@slow_developer):

now this is big...

GPT-5-based agentic frameworks have reached 70% on the OSWorld benchmark — a real-computer, cross-OS environment for multimodal agents

and that score is close to the 72% human mark.

this can only suggest that human-level computer use is now within reach
echo.hive (@hive_echo):

Yesterday was the first time I felt I had forgotten everything I learned the day before 🤔 I assume this is related to the size of the jump in abstraction? And my mind not being used to it? Or that I have overloaded my brain with 6-7 hours of math every day, non-stop 😆 But it all

Rohan Paul (@rohanpaul_ai):

New NVIDIA paper shows how to make text-to-image models render high-resolution images far faster without losing quality.

53x faster 4K on H100, 3.5 seconds on a 5090 with quantization for 138x total speedup.

It speeds up by moving generation into a smaller hidden image space.
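
A rough sketch of why a smaller hidden (latent) image space helps: the number of tokens the backbone attends over shrinks with the downsampling factor, and attention cost grows roughly quadratically in that count. The downsample and patch numbers below are illustrative assumptions, not values from the paper.

```python
# Rough illustration (downsample/patch numbers are assumptions): why
# generating in a smaller latent space cuts compute so sharply.
def num_tokens(height: int, width: int, downsample: int, patch: int = 2) -> int:
    """Tokens the diffusion backbone attends over for one image."""
    lh, lw = height // downsample, width // downsample   # latent grid size
    return (lh // patch) * (lw // patch)                 # patchified tokens

H, W = 3840, 2160  # 4K image
for ds in (8, 32):  # e.g. a standard VAE vs. a more aggressive latent space
    t = num_tokens(H, W, ds)
    print(f"downsample {ds:>2}: {t:>6} tokens, attention cost ~ {t**2:,}")
```
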
echo.hive (@hive_echo):

How many moments have we had with AI already?
- chatgpt moment
- 4o moment
- realtime voice moment
- sora moment
- deepseek moment
- o1 moment
- image-gen-1 moment
- veo3 moment
- nanobanana moment
- sora moment again
…
tomorrow we will probably have another moment maybe