Muthu Kumar Chandrasekaran, PhD (@muthukumarc87)'s Twitter Profile
Muthu Kumar Chandrasekaran, PhD

@muthukumarc87

Scientist #NLProc #PhD #AI #ML | Social Injustice in Education | Tweets don't represent my employer

ID: 53433085

Link: http://linkedin.com/in/muthukumarc87 · Joined: 03-07-2009 16:45:10

1.1K Tweets

264 Followers

506 Following

Rohan Paul (@rohanpaul_ai)

The paper shows a few reinforcement learning tweaks let small LLM agents use tools better and beat larger ones. 

A 4B model matches or exceeds 32B agents on hard math, science, and code tasks.

Old training stitched fake tool traces together, which taught clumsy timing for tool
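
As a rough illustration of the kind of reward shaping such RL tweaks could involve (an assumption on my part, not the paper's actual objective), a policy can be rewarded for the final answer while being lightly penalized for tool calls that did not help:

```python
# Generic reward-shaping sketch for tool-using agents (an assumption, not the
# paper's objective): reward the final answer, lightly penalize tool calls that
# were not needed, so the policy learns *when* to call a tool.
def shaped_reward(correct: bool, tool_calls: int, useful_tool_calls: int,
                  call_penalty: float = 0.05) -> float:
    answer_reward = 1.0 if correct else 0.0
    wasted_calls = max(tool_calls - useful_tool_calls, 0)
    return answer_reward - call_penalty * wasted_calls

# e.g. a correct answer with 3 calls, only 1 of them useful:
print(shaped_reward(correct=True, tool_calls=3, useful_tool_calls=1))  # 0.9
```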
Rohan Paul (@rohanpaul_ai)

This paper speeds up diffusion LLM decoding by updating the stored key/value (KV) cache only when and where needed.

Reported gains reach up to 45.1x in the longest cases.

Most existing decoders recompute queries, keys, and values for every token and layer at every step, which
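
A minimal sketch of the core idea, recomputing cache entries only at positions whose token changed between denoising steps; the shapes, single-head layout, and change test are simplifying assumptions, not the paper's implementation:

```python
# Sketch: recompute K/V projections only at positions whose token changed
# since the previous denoising step; reuse the cached rest.
import torch

class SelectiveKVCache:
    def __init__(self, w_k: torch.Tensor, w_v: torch.Tensor, seq_len: int, d: int):
        self.w_k, self.w_v = w_k, w_v                  # (d, d) projection weights
        self.k = torch.zeros(seq_len, d)               # cached keys
        self.v = torch.zeros(seq_len, d)               # cached values
        self.prev_tokens = None                        # token ids from the last step

    def update(self, tokens: torch.Tensor, hidden: torch.Tensor):
        # tokens: (seq_len,) current token ids; hidden: (seq_len, d) their hidden states
        if self.prev_tokens is None:
            changed = torch.ones_like(tokens, dtype=torch.bool)   # first step: fill everything
        else:
            changed = tokens != self.prev_tokens                  # only re-denoised positions
        if changed.any():
            self.k[changed] = hidden[changed] @ self.w_k          # partial recompute
            self.v[changed] = hidden[changed] @ self.w_v
        self.prev_tokens = tokens.clone()
        return self.k, self.v
```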
Rohan Paul (@rohanpaul_ai)

👨‍🔧 Github: RAG-Anything: All-in-One RAG Framework

7.6k Stars ⭐️

All-in-One Multimodal Document Processing RAG system built on LightRAG.

You can query documents containing interleaved text, visual diagrams, structured tables, and mathematical formulations through one interface.
Data Science Dojo (@datasciencedojo)

🚨 Finally, A Scientific Definition of AGI (and It’s Not What You Think) 🚨
For years, “Artificial General Intelligence” has been the most misused and mystified term in AI.

Now, a team of leading researchers, including Dan Hendrycks, Yoshua Bengio, Dawn Song, Gary Marcus, and
Rohan Paul (@rohanpaul_ai)

Great paper on AI's recursive self-improvement.

It builds a single loop that lets a search agent teach itself.

One part writes new tasks, one part tries to solve them, and one part judges the answers.

A 3-role loop can keep improving a search agent without human labels.

The
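
A schematic sketch of that 3-role loop, with a placeholder `llm` callable and made-up prompts; the actual proposer/solver/judge prompts and the solver-update step are not specified here:

```python
# Schematic 3-role self-improvement loop (proposer / solver / judge).
# `llm` is any text-generation callable; prompts are illustrative only.
from typing import Callable

def self_improvement_loop(llm: Callable[[str], str], n_rounds: int = 3) -> list[dict]:
    experience = []
    for _ in range(n_rounds):
        # Role 1: proposer writes a new search task.
        task = llm("Propose one new question that requires web search to answer.")
        # Role 2: solver attempts the task.
        answer = llm(f"Answer this question, citing what you would search for: {task}")
        # Role 3: judge scores the attempt without any human label.
        verdict = llm(f"Question: {task}\nAnswer: {answer}\nIs the answer well supported? Reply yes or no.")
        experience.append({"task": task, "answer": answer, "keep": "yes" in verdict.lower()})
        # Kept examples would then update the solver (e.g. RL or SFT),
        # which is the part abstracted away here.
    return [e for e in experience if e["keep"]]
```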
Adina Yakup (@adinayakup)

DeepSeek-OCR is out 🔥 huggingface.co/deepseek-ai/De…

✨High-accuracy OCR - MIT license
✨Fast GPU inference (FlashAttention 2, BF16)
✨Docs > Markdown
✨Works with transformers

Rohan Paul (@rohanpaul_ai)

The paper shows small models can reason better by retrieving step-by-step instructions during inference.

It gives a simple recipe that turns reasoning into reusable text the model can fetch and follow.

They build a library of 2 part guides by clustering training questions.

Each
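
A toy sketch of the retrieval recipe, using TF-IDF and k-means as stand-ins for whatever embedding and clustering the paper actually uses, and placeholder guide text:

```python
# Toy sketch: cluster training questions, keep one short guide per cluster,
# and at inference prepend the guide of the nearest cluster to the prompt.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

train_questions = [
    "What is 17 * 24?", "Compute 132 / 6.",
    "Why does ice float on water?", "Why is the sky blue?",
]
vec = TfidfVectorizer().fit(train_questions)
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(vec.transform(train_questions))
# Placeholder guide text; a real library would store a written how-to-reason
# guide per cluster.
guides = {i: f"(reasoning guide written for cluster {i} questions)" for i in range(2)}

def build_prompt(question: str) -> str:
    cluster = int(km.predict(vec.transform([question]))[0])
    return f"{guides[cluster]}\n\nQuestion: {question}\nAnswer:"

print(build_prompt("What is 45 * 12?"))
```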
NVIDIA AI Developer (@nvidiaaidev)

✨ From prototype to production, the new NVIDIA DGX Spark puts GB10 Superchip performance in your hands.

Compact. 4 TB ready. Game-changing speed to run your LLMs locally.

Available here ➡️ marketplace.nvidia.com/en-us/develope… #SparkSomethingBig ✨
Emily Xiao (@xiaoemily41333)

Can we train LLMs to be good prompt engineers?

🚀We propose Prompt-MII: Meta-Learning Instruction Induction for LLMs

Our models outperform strong baselines like ICL and GEPA with 13x fewer tokens. 🧵
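
For intuition, here is a generic instruction-induction sketch, not Prompt-MII itself: induce one compact instruction from a few examples once, then reuse it per query instead of re-sending the examples as ICL would (`llm` is a placeholder callable):

```python
# Generic instruction induction vs. in-context learning (ICL):
# induce a compact instruction from a handful of examples, then reuse it.
from typing import Callable

def induce_instruction(llm: Callable[[str], str], examples: list[tuple[str, str]]) -> str:
    shots = "\n".join(f"Input: {x}\nOutput: {y}" for x, y in examples)
    return llm(f"Write one short instruction that maps these inputs to outputs:\n{shots}")

def apply_instruction(llm: Callable[[str], str], instruction: str, x: str) -> str:
    # The per-query prompt carries only the induced instruction, not the
    # examples, which is where the token savings over ICL come from.
    return llm(f"{instruction}\nInput: {x}\nOutput:")
```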
Jessy Lin (@realjessylin)

🧠 How can we equip LLMs with memory that allows them to continually learn new things?

In our new paper with AI at Meta, we show how sparsely finetuning memory layers enables targeted updates for continual learning, w/ minimal interference with existing knowledge.

While full
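
A hedged sketch of the general mechanism, not Meta's implementation: a key-value memory layer where, after each backward pass, gradients are kept only for the slots that were actually retrieved, so updates stay targeted:

```python
# Sketch of sparse finetuning of a memory layer: only the retrieved slots
# receive gradient updates; everything else is left untouched.
import torch
import torch.nn as nn

class SparseMemory(nn.Module):
    def __init__(self, n_slots: int = 1024, d: int = 64, topk: int = 4):
        super().__init__()
        self.keys = nn.Parameter(torch.randn(n_slots, d) * 0.02)
        self.values = nn.Parameter(torch.randn(n_slots, d) * 0.02)
        self.topk = topk

    def forward(self, query: torch.Tensor) -> torch.Tensor:
        # query: (batch, d). Retrieve the top-k best-matching slots.
        scores = query @ self.keys.T                      # (batch, n_slots)
        top, idx = scores.topk(self.topk, dim=-1)         # (batch, topk)
        weights = top.softmax(dim=-1)
        out = (weights.unsqueeze(-1) * self.values[idx]).sum(dim=1)
        self._touched = idx.unique()                      # remember which slots were used
        return out

    def mask_untouched_grads(self):
        # After backward(): zero gradients for slots that were not retrieved,
        # so only the touched slots get updated (sparse, targeted finetuning).
        mask = torch.zeros(self.values.size(0), 1)
        mask[self._touched] = 1.0
        if self.values.grad is not None:
            self.values.grad *= mask
        if self.keys.grad is not None:
            self.keys.grad *= mask
```

Typical use in a training step would be `loss.backward(); mem.mask_untouched_grads(); optimizer.step()`.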
Guilherme Penedo (@gui_penedo)

New dataset release: 🌐FineWiki

This is an updated, better-extracted version of Wikipedia, covering 325+ languages.

Unlike the old dataset from 2023, we kept all the math content, tables, properly rendered templates, and extracted key facts.

Examples and highlights below.
alphaXiv (@askalphaxiv)

We used DeepSeek OCR to extract every dataset from tables/charts across 500k+ AI arXiv papers for $1000 🚀

See which benchmarks are trending and discover datasets you didn't know existed

Doing the same task with Mistral OCR would've cost $7500 👀
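
Taking both quoted totals at face value (the same 500k+ papers for both runs), the implied saving is

$$\frac{\$7500}{\$1000} = 7.5\times$$

i.e. roughly 7.5x cheaper with DeepSeek OCR.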

elvis (@omarsar0)

Fundamentals of Building Autonomous LLM Agents

Great overview of LLM-based agents.

Great if you are just getting started with AI agents.

It covers the basics well.
Sharon Y. Li (@sharonyixuanli)

Deception is one of the most concerning behaviors that advanced AI systems can display. If you are not concerned yet, this paper might change your view.

We built a multi-agent framework to study:
👉 How deceptive behaviors can emerge and evolve in LLM agents during realistic
Rohan Paul (@rohanpaul_ai)

This paper explains how to build LLM agents that can perceive, reason, remember, and act autonomously.

Humans finish 72.36% of OSWorld tasks while top agents reach 42.9%, so there is a big gap.

The paper gives one clear recipe and maps common failure points.

Workflows run a fixed
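
For reference, a schematic perceive / reason / remember / act loop with placeholder environment and model callables; this illustrates the general shape, not the paper's recipe:

```python
# Schematic perceive -> reason -> remember -> act loop for an LLM agent.
# `llm`, `observe`, and `act` are placeholders for a model and environment.
from typing import Callable

def agent_loop(llm: Callable[[str], str],
               observe: Callable[[], str],
               act: Callable[[str], None],
               max_steps: int = 10) -> list[str]:
    memory: list[str] = []
    for _ in range(max_steps):
        obs = observe()                                   # perceive the environment
        context = "\n".join(memory[-5:])                  # remember recent steps
        action = llm(f"Memory:\n{context}\nObservation: {obs}\nNext action:")  # reason
        if action.strip().lower() == "done":
            break
        act(action)                                       # act on the environment
        memory.append(f"obs={obs} -> action={action}")
    return memory
```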
Rohan Paul (@rohanpaul_ai)

New AMD paper shows a simple way to add reasoning to vision language models cheaply. 

It reaches results close to heavy methods using 4K examples and about 2 hours of fine-tuning.

Most multimodal models read images, yet they struggle to connect steps and get the final answer.
Rohan Paul (@rohanpaul_ai)

The paper shows most LLM agents break from early mistakes and offers a way to catch and fix them.

Targeted debugging raises task success by up to 26%.

Agents run through memory, reflection, planning, and action, and a wrong move early tends to cascade.

The authors define a
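
A generic sketch of the cascade-prevention idea: checkpoint state before each stage, validate its output, and roll back and retry instead of carrying the error forward. This is illustrative only; the paper's actual detection and repair method will differ:

```python
# Checkpoint-and-rollback around each agent stage so an early mistake is
# caught and retried instead of cascading through later stages.
import copy
from typing import Callable

def run_with_rollback(steps: list[Callable[[dict], dict]],
                      check: Callable[[dict], bool],
                      state: dict,
                      max_retries: int = 2) -> dict:
    for step in steps:                       # e.g. memory -> reflection -> planning -> action
        for _ in range(max_retries + 1):
            checkpoint = copy.deepcopy(state)    # snapshot before the step
            state = step(state)
            if check(state):                     # step-level validation
                break
            state = checkpoint                   # roll back the bad step and retry
        else:
            raise RuntimeError("step kept failing validation; stopping before the error cascades")
    return state
```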
DailyPapers (@huggingpapers)

NVIDIA just released Audio Flamingo 3 on Hugging Face! This fully open, state-of-the-art Large Audio-Language Model excels at understanding & reasoning across speech, sounds, and music, setting new benchmarks on 20+ tasks. huggingface.co/nvidia/audio-f…

Yueqi Song (@yueqi_song)

We just built and released the largest dataset for supervised fine-tuning of agentic LMs: 1.27M trajectories (~36B tokens)!

Until now, large-scale SFT for agents has been rare - not for lack of data, but because of fragmentation across heterogeneous formats, tools, and interfaces.
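
To make the fragmentation point concrete, here is a hypothetical unified trajectory schema and one converter; the field names are invented for this sketch and are not the released dataset's actual format:

```python
# Hypothetical unified schema for agent trajectories, illustrating how
# heterogeneous tool/interface formats could be normalized for SFT.
from dataclasses import dataclass, field

@dataclass
class Step:
    role: str                    # "user", "assistant", or "tool"
    content: str                 # text, tool-call arguments, or tool output
    tool_name: str | None = None

@dataclass
class Trajectory:
    task: str
    steps: list[Step] = field(default_factory=list)

def from_chat_messages(messages: list[dict], task: str) -> Trajectory:
    """Convert one common chat-message format into the unified schema."""
    steps = [Step(role=m["role"], content=m.get("content", ""),
                  tool_name=m.get("name")) for m in messages]
    return Trajectory(task=task, steps=steps)
```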