yxpx (@yxpx_space) Twitter Tweets • TwiCopy

henry

@arithmoquine

a year ago

there's a lot of alpha in eliminating all logits for a specific word and then asking a model to say it
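
A minimal sketch of what "eliminating all logits for a specific word" could look like, assuming a toy four-word vocabulary and made-up logit values (the names `ban_tokens`, `vocab`, and `logits` are hypothetical, not from the tweet). In a real tokenizer a single word usually maps to several token ids and spellings, so a full ban would have to cover all of them.

```python
import math

def ban_tokens(logits, banned_ids):
    """Return a copy of logits with the banned token ids set to -inf."""
    banned = set(banned_ids)
    return [float("-inf") if i in banned else x for i, x in enumerate(logits)]

def softmax(logits):
    """Numerically stable softmax; -inf entries get probability 0."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Toy vocabulary and logits (assumed values, for illustration only).
vocab = ["the", "cat", "dog", "sat"]
logits = [1.0, 3.0, 2.0, 0.5]

# Ban "cat", which would otherwise be the most likely next token.
masked = ban_tokens(logits, banned_ids=[vocab.index("cat")])
probs = softmax(masked)

# Greedy decoding now has to pick the best remaining token.
next_token = vocab[probs.index(max(probs))]
```

With the banned entry at negative infinity, the model cannot emit that token no matter how strongly the prompt asks for it, which is the setup the tweet describes.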

Dwarkesh Patel

New blog post where I explain why I disagree with this, and why I have slightly longer timelines to AGI than many of my guests. I think continual learning is a huge bottleneck to the usefulness of these models, and extended computer use may take years to sort out. Link below.

owl

@owl_posting

9 months ago

for the last 60~ years, a team of Russian scientists located at the Institute of Cytology and Genetics in Novosibirsk, Siberia have been running an experiment to domesticate foxes, specifically the wild red fox. at each generation of foxes, they picked the top 10% calmest, most

Chubby♨️

@kimmonismus

9 months ago

The race for AGI is borderless. AI spending has surpassed consumer spending for contributing to US GDP growth in H1 2025. Datacenter capex is exploding.

Jürgen Schmidhuber

@schmidhuberai

9 months ago

Who invented convolutional neural networks (CNNs)? 1969: Fukushima had CNN-relevant ReLUs [2]. 1979: Fukushima had the basic CNN architecture with convolution layers and downsampling layers [1]. Compute was 100 x more costly than in 1989, and a billion x more costly than

λux

@novasarc01

9 months ago

perlin noise algorithm felt like magic in the teens

OpenAI

@openai

9 months ago

Our open models are here. Both of them. openai.com/open-models

Casper Hansen

@casper_hansen_

9 months ago

this new chat template "harmony" may be the single best thing to happen for agents

Sebastian Raschka

@rasbt

9 months ago

Next to Qwen3 of comparable size: Looks like gpt-oss is a wide (vs deep) model

Eliezer Yudkowsky ⏹️

@esyudkowsky

9 months ago

Also according to the NYT, Claude and Gemini had similar response patterns if you simulated dropping them into the middle of that conversational context. That said, it was ultimately Gemini that called bullshit on the whole thing and snapped the guy out of it.

TuringPost

@theturingpost

9 months ago

GRPO vs GSPO, or DeepSeek vs Qwen - a workflow breakdown of the main Chinese reinforcement learning algorithms ➡️ Group Relative Policy Optimization (GRPO): Learning by comparison GRPO is tailored for reasoning-heavy tasks where relative quality matters more than absolute
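
A rough sketch of the "learning by comparison" idea, assuming the commonly described GRPO recipe in which each sampled response is scored relative to the other responses drawn for the same prompt. The function name, the example rewards, and the choice of population standard deviation are assumptions made for illustration; this is not a full training loop.

```python
from statistics import mean, pstdev

def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against its own group's mean and spread.

    A response gets a positive advantage only if it beat the group
    average, so relative quality matters rather than the absolute score.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption of this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]

# Hypothetical rewards for four responses sampled for one prompt.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
```

Because the baseline is the group's own average, this scheme needs no separate learned value model to decide which responses were better than expected.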

kalomaze

@kalomaze

9 months ago

vik "apparently heroin is bad so now its time to try crack"

Daniel Kang

@daniel_d_kang

9 months ago

The prevailing wisdom is that compute is the most important factor for frontier AI training. We think this is wrong: data is the most costly and important component of AI training. We collected estimates of revenue for major data labeling companies and compared them with the

Greg Kamradt

@gregkamradt

9 months ago

What makes the HRM model work so well for its size on ARC Prize? We ran ablation experiments to find out what made it work. Our findings show that you could replace the "hierarchical" architecture with a normal transformer with only a small performance drop. We found that an

Zephyr

@zephyr_z9

9 months ago

Huawei’s datacenter and chip strategy is centered on achieving AI hardware self-sufficiency through a vertically integrated, full-stack ecosystem. -It involves in-house development of processors (Ascend NPUs and Kunpeng CPUs), a proprietary interconnect protocol (Unified Bus),

Eugene Yan

@eugeneyan

9 months ago

after leading a few projects, i've found that once you've set up the evals + experiment harness and make it easy to tweak config and prompts with 1-click run + eval, teams enjoy running experiments and hill climbing those numbers, and progress comes quickly. but setting up that

Guive Assadi

@guiveassadi

8 months ago

Why does Claude sometimes claim to have lived in San Francisco and married a Japanese woman? Why did Grok briefly love Hitler? Models infer their personas from cultural cues in their fine-tuning data. Article linked in the replies.

vx-underground

@vxunderground

8 months ago

Big drama today in the Tor community. Conrad Rockenhaus, a Tor operator based out of Michigan, United States, was arrested in 2020 after refusing to cooperate with the United States Federal Bureau of Investigation. Rockenhaus, a disabled United States military veteran, ran the

Dylan Patel ✈️ ICLR

@dylan522p

5 months ago

OpenAI hasn’t even deployed TPUs yet and they’ve already saved ~30% on their entire lab wide NVIDIA fleet. This demonstrates how the perf per TCO advantage of TPUs is so strong that you already get the gains from adopting TPUs even before turning one on. The piece covers a lot

Acer

@acerfur

4 months ago

As a bonus, Erdős problem #729 has also been fully autonomously resolved by GPT-5.2 Pro with Aristotle. Note that literature review is still ongoing.
