Pengfei Liu (@stefan_fee)'s Twitter Profile
Pengfei Liu

@stefan_fee

Associate Prof. at SJTU, leading GAIR Lab (plms.ai), Co-founder of Inspired Cognition, Postdoc at @LTIatCMU, Previously FNLP, @MILAMontreal,

ID: 2818867628

Link: http://pfliu.com/ | Joined: 19-09-2014 02:34:24

450 Tweets

3.3K Followers

751 Following

Pengfei Liu (@stefan_fee)

📣 New Discovery on Computer Use Agent With just 312 high-quality trajectories + open-source model, we've surpassed Claude 3.7 Sonnet (thinking) in computer use capabilities 🚀 ⚡️ In the new era of AI Agent training, many key questions remain: • Can open-source models + small

Pengfei Liu (@stefan_fee)

312 quality trajectories + open-source model beats Claude 3.7 Sonnet (thinking) in computer use 🚀 We answer the following important questions in our recent tech report: github.com/GAIR-NLP/PC-Ag… 1. Can open-source models + small high-quality datasets outperform top closed-source

Pengfei Liu (@stefan_fee)

The real breakthrough isn't better AI—it's breaking free from nature's constraints We're witnessing a paradigm shift from "passive adaptation" to "active construction" in AI training. 🌊 The old way: AI learns from whatever data naturally exists • Constrained by existing

Pengfei Liu (@stefan_fee)

What foundation models do we REALLY need for the RL era? And what pre-training data? Excited to share our work: OctoThinker: Mid-training Incentivizes Reinforcement Learning Scaling arxiv.org/pdf/2506.20512 ✨ Key breakthroughs: - First RL-focused mid-training approach - Llama

Pengfei Liu (@stefan_fee)

Tech history: Every time humanity hits a tech wall, we just wait for someone named Ilya to show up and save the world :) - Neural nets stuck? - Language models plateau? - ... (skip tons of stuff) - ... - Superintelligence coming?

Ethan Chern (@ethanchern)

FacTool has been accepted to COLM 2025 - two years after its arXiv debut! While the landscape of LLMs has changed a lot since then, tool-augmented LLMs and RAG are still among the most effective and practical approaches for detecting / mitigating hallucinations (ref:

Yujia Qin@ICLR2025 (@tsingyoga)

We can finally share UI-TARS-2🥳🥳 — a native GUI agent trained with multi-turn agent RL ⚡️⚡️Key highlights (all-in-one model!): 💻Computer Use: 47.5 OSWorld · 50.6 WindowsAgentArena 📱Phone Use: 73.3 AndroidWorld 🛜Browser Use: 88.2% Online-Mind2Web 🎮Gameplay: ~60% human
