yxpx (@yxpx_space)'s Twitter Profile

ID: 1855629911412686848

Joined: 10-11-2024 15:14:06

103 Tweets

137 Followers

627 Following

Dwarkesh Patel (@dwarkesh_sp)

New blog post where I explain why I disagree with this, and why I have slightly longer timelines to AGI than many of my guests.

I think continual learning is a huge bottleneck to the usefulness of these models, and extended computer use may take years to sort out.

L-nk below.
owl (@owl_posting)

for the last 60~ years, a team of Russian scientists located at the Institute of Cytology and Genetics in Novosibirsk, Siberia have been running an experiment to domesticate foxes, specifically the wild red fox. at each generation of foxes, they picked the top 10% calmest, most

Chubby♨️ (@kimmonismus)

The race for AGI is borderless.

AI spending has surpassed consumer spending for contributing to US GDP growth in H1 2025

Datacenter capex is exploding
Jürgen Schmidhuber (@schmidhuberai)

Who invented convolutional neural networks (CNNs)? 

1969: Fukushima had CNN-relevant ReLUs [2].

1979: Fukushima had the basic CNN architecture with convolution layers and downsampling layers [1]. Compute was 100 x more costly than in 1989, and a billion x more costly than
Eliezer Yudkowsky ⏹️ (@esyudkowsky)

Also according to the NYT, Claude and Gemini had similar response patterns if you simulated dropping them into the middle of that conversational context. That said, it was ultimately Gemini that called bullshit on the whole thing and snapped the guy out of it.

TuringPost (@theturingpost)

GRPO vs GSPO, or DeepSeek vs Qwen - a workflow breakdown of the main Chinese reinforcement learning algorithms

➡️ Group Relative Policy Optimization (GRPO): Learning by comparison

GRPO is tailored for reasoning-heavy tasks where relative quality matters more than absolute

Daniel Kang (@daniel_d_kang)

The prevailing wisdom is that compute is the most important factor for frontier AI training. We think this is wrong: data is the most costly and important component of AI training.

We collected estimates of revenue for major data labeling companies and compared them with the
Greg Kamradt (@gregkamradt)

What makes the HRM model work so well for its size on ARC Prize?

We ran ablation experiments to find out what made it work

Our findings show that you could replace the "hierarchical" architecture with a normal transformer with only a small performance drop

We found that an
Zephyr (@zephyr_z9)

Huawei’s datacenter and chip strategy is centered on achieving AI hardware self-sufficiency through a vertically integrated, full-stack ecosystem.

- It involves in-house development of processors (Ascend NPUs and Kunpeng CPUs), a proprietary interconnect protocol (Unified Bus),

Eugene Yan (@eugeneyan)

after leading a few projects, i've found that once you've set up the evals + experiment harness and make it easy to tweak config and prompts with 1-click run + eval, teams enjoy running experiments and hill climbing those numbers, and progress comes quickly. but setting up that

Guive Assadi (@guiveassadi)

Why does Claude sometimes claim to have lived in San Francisco and married a Japanese woman? Why did Grok briefly love Hitler? Models infer their personas from cultural cues in their fine-tuning data. Article linked in the replies.
vx-underground (@vxunderground)

Big drama today in the Tor community.

Conrad Rockenhaus, a Tor operator based out of Michigan, United States, was arrested in 2020 after refusing to cooperate with the United States Federal Bureau of Investigation

Rockenhaus, a disabled United States military veteran, ran the

Dylan Patel ✈️ ICLR (@dylan522p)

OpenAI hasn’t even deployed TPUs yet and they’ve already saved ~30% on their entire lab wide NVIDIA fleet. This demonstrates how the perf per TCO advantage of TPUs is so strong that you already get the gains from adopting TPUs even before turning one on.
The piece covers a lot
Acer (@acerfur)

As a bonus, Erdős problem #729 has also been fully autonomously resolved by GPT-5.2 Pro with Aristotle.

Note that literature review is still ongoing.