Eric Chen (@chvlylchen)'s Twitter Profile
Eric Chen

@chvlylchen

AI researcher

ID: 2734655356

Joined: 06-08-2014 00:56:10

124 Tweets

326 Followers

1.1K Following

Richard Sutton (@richardssutton)'s Twitter Profile Photo

David Silver really hits it out of the park in this podcast. The paper "Welcome to the Era of Experience" is here: goo.gle/3EiRKIH.

Shunyu Yao (@shunyuyao12)'s Twitter Profile Photo

I finally wrote another blogpost: ysymyth.github.io/The-Second-Hal… AI just keeps getting better over time, but NOW is a special moment that I call “the halftime”. Before it, training > eval. After it, eval > training. The reason: RL finally works. Let me know your feedback so I can polish it.

James Zou (@james_y_zou)'s Twitter Profile Photo

Can LLMs learn to reason better by "cheating"?🤯

Excited to introduce #cheatsheet: a dynamic memory module enabling LLMs to learn + reuse insights from tackling previous problems
🎯Claude3.5 23% ➡️ 50% AIME 2024
🎯GPT4o 10% ➡️ 99% on Game of 24

Great job Mirac Suzgun w/ awesome
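The dynamic-memory idea in the tweet above can be sketched roughly as: keep a store of insights from previously solved problems, prepend it to each new prompt, and ask the model to emit a reusable insight to store. This is a minimal sketch; `call_llm`, the `INSIGHT:` marker, and the storage format are hypothetical stand-ins, not the paper's actual interface.

```python
# Minimal sketch of a dynamic "cheatsheet" memory loop.
# call_llm is a hypothetical stand-in for any chat-completion API;
# the INSIGHT: convention is illustrative, not the paper's.

class Cheatsheet:
    def __init__(self):
        self.insights: list[str] = []  # reusable strategies from past problems

    def render(self) -> str:
        return "\n".join(f"- {tip}" for tip in self.insights)

    def update(self, new_insight: str):
        # Deduplicate so the sheet stays compact across many problems.
        if new_insight and new_insight not in self.insights:
            self.insights.append(new_insight)

def solve(problem: str, sheet: Cheatsheet, call_llm) -> str:
    prompt = (
        f"Cheatsheet of strategies from earlier problems:\n{sheet.render()}\n\n"
        f"Problem: {problem}\n"
        "Solve it, then state one reusable insight on a line starting with INSIGHT:"
    )
    reply = call_llm(prompt)
    for line in reply.splitlines():
        if line.startswith("INSIGHT:"):
            sheet.update(line.removeprefix("INSIGHT:").strip())
    return reply
```

The loop is test-time only: nothing is fine-tuned, the model just sees a growing prompt prefix of its own distilled insights.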
Wenhu Chen (@wenhuchen)'s Twitter Profile Photo

🚀 General-Reasoner: Generalizing LLM Reasoning Across All Domains (Beyond Math)

Most recent RL/R1 works focus on math reasoning—but math-only tuning doesn't generalize to general reasoning (e.g. drop on MMLU-Pro and SuperGPQA). Why are we limited to math reasoning?

1. Existing
elvis (@omarsar0)'s Twitter Profile Photo

Building Production-Ready AI Agents with Scalable Long-Term Memory

Memory is one of the most challenging bits of building production-ready agentic systems.

Lots of goodies in this paper.

Here is my breakdown:
TuringPost (@theturingpost)'s Twitter Profile Photo

Google AI and Carnegie Mellon University proposed an unusual trick to make models' answers creative, especially in open-ended tasks. It's a hash-conditioning method.

Just add a little noise at the input stage.

Instead of giving the model the same blank prompt every time, you can give it
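The trick described above can be sketched as: instead of repeating an identical prompt, prepend a fresh arbitrary-looking token string on every sample so completions diverge. The `[seed:…]` prefix format here is an assumption for illustration; the paper's exact conditioning scheme may differ.

```python
# Sketch of hash-conditioning: hash a fresh nonce into a seed token and
# prepend it to an otherwise identical prompt, so repeated samples start
# from different contexts.
import hashlib
import random

def hash_conditioned_prompt(task: str, rng: random.Random) -> str:
    # Draw a nonce and hash it into an arbitrary-looking 16-hex-char token.
    nonce = rng.getrandbits(64).to_bytes(8, "big")
    seed_tag = hashlib.sha256(nonce).hexdigest()[:16]
    return f"[seed:{seed_tag}]\n{task}"

rng = random.Random(0)
task = "Write a short story about a lighthouse."
p1 = hash_conditioned_prompt(task, rng)
p2 = hash_conditioned_prompt(task, rng)
# Same task text, different seed prefixes: each sample is nudged toward
# a different completion without changing the instruction itself.
```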
Paweł Huryn (@pawelhuryn)'s Twitter Profile Photo

I see abstract AI agent architectures everywhere.

But no one explains how to build them in practice.

Here's a practical guide to doing it with n8n: 🧵
Lior⚡ (@lioronai)'s Twitter Profile Photo

The end of Chain-of-Thought? 

This new reasoning method cuts inference time by 80% while keeping accuracy above 90%.

Chain-of-Draft (CoD) is a new prompting strategy that replaces Chain-of-Thought outputs with short, dense drafts for each reasoning step.

Achieves 91% accuracy
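The Chain-of-Draft idea above amounts to a change in the prompt: ask for a terse draft per reasoning step rather than full chain-of-thought sentences. A minimal sketch follows; the instruction wording is paraphrased for illustration, not quoted from the paper.

```python
# Sketch of a Chain-of-Draft (CoD) style prompt template: request a
# minimal draft (a few words) per step, then the answer after a separator.

COD_INSTRUCTION = (
    "Think step by step, but keep only a minimal draft for each step, "
    "at most five words per step. "
    "Return the final answer after the separator ####."
)

def build_cod_prompt(question: str) -> str:
    return f"{COD_INSTRUCTION}\n\nQ: {question}\nA:"
```

The inference-time savings come entirely from the shorter decoded output: fewer reasoning tokens per step, same step count.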
Rohan Paul (@rohanpaul_ai)'s Twitter Profile Photo

GitHub 👨‍🔧: Learn to build your Second Brain AI assistant with LLMs, agents, RAG, fine-tuning, LLMOps and AI systems techniques.

→ Build an agentic RAG system interacting with a personal knowledge base (Notion example provided).

→ Learn production-ready LLM system architecture
Aurimas Griciūnas (@aurimas_gr)'s Twitter Profile Photo

You must know these 𝗔𝗴𝗲𝗻𝘁𝗶𝗰 𝗦𝘆𝘀𝘁𝗲𝗺 𝗪𝗼𝗿𝗸𝗳𝗹𝗼𝘄 𝗣𝗮𝘁𝘁𝗲𝗿𝗻𝘀 as an 𝗔𝗜 𝗘𝗻𝗴𝗶𝗻𝗲𝗲𝗿. If you are building Agentic Systems in an Enterprise setting, you will soon discover that the simplest workflow patterns work the best and bring the most business value.

Ilir Aliu - eu/acc (@iliraliu_)'s Twitter Profile Photo

One company is quietly building the autonomous infrastructure for offices, malls, and more:

✅ Executes high-contact tasks like cleaning toilets, sinks, and counters with compliant hardware
✅ Performs tool and cleaning-agent swaps dynamically based on task demands
✅ Tracks complex 3D

Sumanth (@sumanth_077)'s Twitter Profile Photo

Turn any ML paper into a code repository!

Paper2Code is a multi-agent LLM system that transforms a paper into a code repository.

It follows a three-stage pipeline: planning, analysis, and code generation, each handled by specialized agents.

100% Open Source
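The three-stage pipeline above can be sketched as one LLM call per specialized agent, each consuming the previous stage's output. `paper_to_repo`, `call_llm`, and the prompt wording are hypothetical stand-ins; Paper2Code's actual agents are considerably more elaborate.

```python
# Sketch of a planning → analysis → coding pipeline with one
# specialized agent (here, one prompted LLM call) per stage.

def paper_to_repo(paper_text: str, call_llm) -> dict[str, str]:
    # Stage 1: planning agent designs the repository layout.
    plan = call_llm(f"PLANNING: design a repo layout for this paper:\n{paper_text}")
    # Stage 2: analysis agent turns the plan into per-file specifications.
    analysis = call_llm(f"ANALYSIS: given the plan below, specify each file's logic.\n{plan}")
    # Stage 3: coding agent writes the files from the specifications.
    code = call_llm(f"CODING: given the spec below, write each file.\n{analysis}")
    return {"plan": plan, "analysis": analysis, "code": code}
```

Chaining stages this way lets each agent work from a structured intermediate artifact instead of re-reading the raw paper.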
Eric Chen (@chvlylchen)'s Twitter Profile Photo

🔍 Why can LLMs solve other complex problems after being trained only on math and code? A new paper from ByteDance might have the answer.

🧐 Why is it worth a look?
• LLMs are surprisingly good at generalizing their reasoning skills across different domains, but the "how" has
Eric Chen (@chvlylchen)'s Twitter Profile Photo

We train LLMs on vast datasets, but are they truly "learning" or just "memorizing" what they've seen?

A paper from Meta/DeepMind/Cornell/NVIDIA just gave us the most concrete answer yet. For me, the key takeaway is interesting: they've put a number on it.

Here’s my breakdown of
Eric Chen (@chvlylchen)'s Twitter Profile Photo

Does the AI you're testing know it's being tested? What if it's just pretending to be safe during evaluations? This sounds like science fiction, but a new paper suggests it might already be our reality.

I just finished reading a bombshell paper on ArXiv, and it has fundamentally
Eric Chen (@chvlylchen)'s Twitter Profile Photo

For me, the key takeaway from the new "Hierarchical Reasoning Model" paper is a potential paradigm shift in how we build reasoning systems. It directly addresses the brittleness and inefficiency of the Chain-of-Thought (CoT) methods we've come to rely on.

Here’s the breakdown: