Ali Fartoot (@ali_fartout)'s Twitter Profile
Ali Fartoot

@ali_fartout

Machine Learning Engineer

ID: 1473192360016236548

Joined: 21-12-2021 07:23:35

101 Tweets

43 Followers

451 Following

Jason Wei (@_jasonwei)'s Twitter Profile Photo

Becoming an RL diehard in the past year and thinking about RL for most of my waking hours inadvertently taught me an important lesson about how to live my own life. One of the big concepts in RL is that you always want to be “on-policy”: instead of mimicking other people’s

alphaXiv (@askalphaxiv)'s Twitter Profile Photo

"Deep Researcher with Test-Time Diffusion"

This paper treats report writing as an iterative retrieval‑augmented diffusion process that can be enhanced by component‑wise self‑evolution. 

This demonstrates SoTA on multi‑hop search‑and‑reasoning benchmarks.
Leonie (@helloiamleonie)'s Twitter Profile Photo

Apple just released Embedding Atlas:
An open-source visualization tool for your embeddings.

I just gave it a quick spin with some data stored in my vector database.

These are my first impressions:
- Nice exploration UX with hover and tool tip for single data points
- Shows you
Wen-Tse Chen (@wenzechen2)'s Twitter Profile Photo

[0/3] 🚀 Introducing Verlog – an open-source RL framework built specifically for training long-horizon, multi-turn LLM agents.

📊 Max episode length comparison:
• VeRL / RAGEN → ~10 turns
• verl-agent → ~50 turns
• Verlog (ours) → 400+ turns 🔥

⚙️ Technical foundation:

Rohan Paul (@rohanpaul_ai)'s Twitter Profile Photo

BRILLIANT Google DeepMind research.

Even the best embeddings cannot represent all possible query-document combinations, which means some answers are mathematically impossible to recover.

Reveals a sharp truth: embedding models can only capture so many pairings, and beyond that,
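The claim above can be illustrated with a small linear-algebra sketch (this shows the rank intuition only, not the paper's exact argument): every query-document score matrix produced by d-dimensional embeddings factors through the embedding space, so its rank can never exceed d, and higher-rank relevance patterns are mathematically out of reach.

```python
import numpy as np

rng = np.random.default_rng(0)
n_docs, dim = 8, 3  # more documents than embedding dimensions

# Random query and document embeddings in a low-dimensional space.
queries = rng.standard_normal((n_docs, dim))
docs = rng.standard_normal((n_docs, dim))

# The full query-document score matrix is a product of two rank-<=dim
# factors, so its rank can never exceed the embedding dimension.
scores = queries @ docs.T
print(np.linalg.matrix_rank(scores))  # <= dim, here 3

# Consequence: no choice of 3-d embeddings can reproduce an arbitrary
# 8x8 target relevance matrix (a generic 8x8 matrix has rank 8).
target = rng.standard_normal((n_docs, n_docs))
print(np.linalg.matrix_rank(target))  # 8
```

Whatever the training data, squeezing more distinct query-document pairings than the dimension allows is impossible, which is the combinatorial wall the tweet refers to.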
Ilia (@ilialarchenko)'s Twitter Profile Photo

Let’s talk about VLAs in robotics 🤖
(Vision-Language-Action models)

A relatively new class of robotics policies that brings the power of LLMs into the real world.

If you’ve seen robots folding laundry, washing dishes, or cleaning rooms – chances are they used something VLA-like.
TuringPost (@theturingpost)'s Twitter Profile Photo

RLAD (Reinforcement Learning with Abstraction and Deduction) trains models via RL using a 2-player setup:

▪️ An abstraction generator – proposes short, natural-language “reasoning hints” (abstractions) summarizing key facts and strategies.
▪️ A solution generator – uses them to
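The two-player loop above can be sketched as follows (a hypothetical illustration: the stub "models" and names are mine, not from the RLAD paper). The abstraction generator proposes a hint, the solution generator conditions on it, and a shared correctness reward would drive the RL updates for both players.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Episode:
    abstraction: str
    solution: str
    reward: float

def rlad_step(problem: str,
              abstraction_gen: Callable[[str], str],
              solution_gen: Callable[[str, str], str],
              check: Callable[[str], bool]) -> Episode:
    hint = abstraction_gen(problem)        # player 1: propose a reasoning hint
    answer = solution_gen(problem, hint)   # player 2: solve using the hint
    r = 1.0 if check(answer) else 0.0      # shared reward: solution correctness
    # In training, r would update BOTH policies (e.g. via a policy-gradient
    # method), so abstractions are reinforced through the solutions they enable.
    return Episode(hint, answer, r)

# Toy usage with stand-in "models":
ep = rlad_step(
    "2 + 2 = ?",
    abstraction_gen=lambda p: "hint: this is simple addition",
    solution_gen=lambda p, h: "4",
    check=lambda ans: ans == "4",
)
print(ep.reward)  # 1.0
```

The key design point is that the abstraction generator is never rewarded directly for its hints, only through the downstream solutions they make possible.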
Qwen (@alibaba_qwen)'s Twitter Profile Photo

🚀 Exciting updates in Qwen Code v0.0.12–v0.0.14!

✨ What’s new?
• Plan Mode: AI proposes a full implementation plan—you approve before a single line changes.
• Vision Intelligence: Auto-switch to vision models (Qwen3-VL-Plus with 256K input / 32K output!) when images

alphaXiv (@askalphaxiv)'s Twitter Profile Photo

Your Base Model is Smarter Than You Think

This paper proposes a way to beat the lack of generation diversity in RL without RL!

By using Markov Chain Monte Carlo’s ‘power sampling’ that reuses a base LLM’s own probabilities, it’s able to beat GRPO without training & verifiers
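A toy illustration of the power-sampling idea (my sketch, not the paper's implementation: a hand-made categorical distribution stands in for the base LLM, and sequences are reduced to single outcomes): sampling from p(x)^α with α > 1 via Metropolis-Hastings concentrates mass on the base model's most probable outputs, using only the model's own probabilities and no extra training.

```python
import random

random.seed(0)

# Stand-in "base model": a small categorical distribution.
p = {"a": 0.70, "b": 0.20, "c": 0.10}
alpha = 4.0  # alpha > 1 sharpens the distribution toward its mode

def mh_power_sample(p, alpha, steps=20000):
    """Metropolis-Hastings targeting p(x)**alpha (renormalized)."""
    xs = list(p)
    x = random.choice(xs)
    counts = {k: 0 for k in xs}
    for _ in range(steps):
        y = random.choice(xs)  # symmetric proposal: uniform over outcomes
        # Accept with min(1, (p(y)/p(x))**alpha); only ratios of the base
        # model's probabilities are needed, never the normalizer.
        if random.random() < min(1.0, (p[y] / p[x]) ** alpha):
            x = y
        counts[x] += 1
    return {k: v / steps for k, v in counts.items()}

est = mh_power_sample(p, alpha)

# Exact target for comparison: p**alpha, renormalized.
z = sum(v ** alpha for v in p.values())
exact = {k: v ** alpha / z for k, v in p.items()}
print(est["a"], exact["a"])  # both close to 1: mass concentrates on the mode
```

The same ratio-only acceptance rule is what makes the approach attractive for LLMs, where sequence probabilities are easy to score but the power-distribution normalizer is intractable.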
Weiwei Sun (@sunweiwei12)'s Twitter Profile Photo

AI agents are supposed to collaborate with us to solve real-world problems, but can they really? Even the most advanced models can still give us frustrating moments when working with them deeply.

We argue that real-world deployment requires more than productivity (e.g., task
🔥 Matt Dancho (Business Science) 🔥 (@mdancho84)'s Twitter Profile Photo

🔥 GPT-6 may not just be smarter. 

It literally might be alive (in the computational sense).

A new research paper, SEAL: Self-Adapting Language Models (arXiv:2506.10943), describes how an AI can continuously learn after deployment, evolving its own internal representations
Ali Behrouz (@behrouz_ali)'s Twitter Profile Photo

We keep scaling model parameters by increasing width and stacking more layers, but what if the truly missing axes for continual learning are compression and stacking the learning process?

Excited to share the full version of Nested Learning, a new paradigm for continual learning
Ahmad (@theahmadosman)'s Twitter Profile Photo

Hugging Face has released a 214-page
MASTERCLASS on how to train LLMs

> it’s called The Smol Training Playbook
> and if you want to learn how to train LLMs,
> this GIFT is for you

> this training bible walks you through the ENTIRE pipeline
> covers every concept that matters from
Mo Lotfollahi (@mo_lotfollahi)'s Twitter Profile Photo

Mixture-of-Experts (MoE) is a powerful way to scale large language models (LLMs): instead of running the full model for every token, a router activates only a few “experts,” giving more capacity at roughly the same compute. 

But routing is still a sore spot. Most MoE systems use
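For context on the routing being described, here is a minimal sketch of the standard learned top-k softmax gating that most MoE layers use (shapes and names are illustrative, not from any particular MoE codebase): a router scores every expert per token, only the k best experts run, and their outputs are mixed with renormalized softmax weights.

```python
import numpy as np

rng = np.random.default_rng(0)

def top_k_moe(x, w_gate, experts, k=2):
    """x: (d,) token; w_gate: (d, n_experts); experts: list of callables."""
    logits = x @ w_gate                    # router scores per expert
    top = np.argsort(logits)[-k:]          # indices of the k best experts
    gate = np.exp(logits[top] - logits[top].max())
    gate /= gate.sum()                     # renormalized softmax over the top-k
    # Only k experts run per token -- this is where the compute saving
    # comes from: capacity grows with n_experts, cost grows with k.
    return sum(g * experts[i](x) for g, i in zip(gate, top))

d, n_experts = 4, 8
w_gate = rng.standard_normal((d, n_experts))
# Toy experts: each is a tiny linear layer with its own weights.
experts = [lambda x, W=rng.standard_normal((d, d)): W @ x
           for _ in range(n_experts)]

y = top_k_moe(rng.standard_normal(d), w_gate, experts, k=2)
print(y.shape)  # (4,)
```

The "sore spot" is that the router's hard top-k selection is non-differentiable and trained only indirectly, which is what motivates alternative routing schemes.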