Tianqing Fang @ ACL24 (@tfang229)'s Twitter Profile
Tianqing Fang @ ACL24

@tfang229

Incoming Research Scientist @TencentGlobal AI Lab. PhD @hkust @HKUSTKnowComp #NLProc. Ex-visiting PhD @epfl_en. Ex-visiting scholar at LUKA Lab @CSatUSC.

ID: 1641361658269540352

Link: http://fangtq.com · Joined: 30-03-2023 08:48:23

34 Tweets

226 Followers

272 Following

Jianshu Zhang ✈️ICLR2025🇸🇬 (@sterzhang)

My earlier work #VLM2Bench called for clearer principles on *when* language aids vision. Our new work **MindCube**: *First Map then Reasoning* puts this into practice, tackling limited-view spatial reasoning with appropriate language-side mapping. Welcome to check it out! 🚀
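In spirit, the "first map, then reason" recipe is easy to sketch: serialize the partial views into an explicit textual map, then reason over the map instead of the raw views. Everything below (the `views` structure, `build_map`, the object names) is a hypothetical illustration of mine, not MindCube's actual interface:

```python
# Hypothetical sketch of "first map, then reason" for limited-view spatial
# reasoning: aggregate partial observations into one explicit textual map.

# Each view sees only some objects and their egocentric placements.
views = [
    {"view": "A", "facing": "north", "objects": {"sofa": "left", "lamp": "right"}},
    {"view": "B", "facing": "east",  "objects": {"lamp": "left", "door": "right"}},
]

def build_map(views):
    """Serialize per-view observations into one allocentric summary."""
    lines = []
    for v in views:
        for obj, side in v["objects"].items():
            lines.append(f"From view {v['view']} (facing {v['facing']}), {obj} is on the {side}.")
    return "\n".join(lines)

cognitive_map = build_map(views)
# A VLM/LLM would now be prompted with `cognitive_map` plus the question,
# reasoning over the shared map rather than over each limited view alone.
print(cognitive_map)
```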

Wenhao Yu (@wyu_nd)

📢 New paper alert 📢

We introduce MobileGUI-RL, an RL framework advancing mobile GUI agents through trajectory-based rollouts and rewards in 𝗼𝗻𝗹𝗶𝗻𝗲 environments.

With RL, Qwen2.5-VL achieves a 44.8% success rate on AndroidWorld! ✨

Check out the paper at: arxiv.org/abs/2507.05720
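For intuition, here is a toy sketch of trajectory-based rollouts with trajectory-level rewards in an online environment; `MobileEnv`, `rollout`, and the random policy are stand-ins of my own, not MobileGUI-RL's actual code:

```python
import random

class MobileEnv:
    """Toy stand-in for an online mobile GUI environment."""
    def reset(self):
        self.hits, self.steps = 0, 0
        return "screenshot_0"
    def step(self, action):
        self.steps += 1
        self.hits += (action == 0)          # pretend action 0 is the right tap
        done = self.steps >= 5 or self.hits >= 2
        return f"screenshot_{self.steps}", done, self.hits >= 2

def rollout(env, policy):
    """Collect one full trajectory; reward is assigned per trajectory."""
    obs, traj, done, success = env.reset(), [], False, False
    while not done:
        action = policy(obs)
        obs, done, success = env.step(action)
        traj.append(action)
    return traj, (1.0 if success else 0.0)  # trajectory-level outcome reward

policy = lambda obs: random.choice([0, 1, 2])
batch = [rollout(MobileEnv(), policy) for _ in range(8)]
# An RL update (e.g., PPO/GRPO) would then weight each trajectory's actions
# by its trajectory-level reward rather than by per-step signals.
print([reward for _, reward in batch])
```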
Zhenwen Liang (@liangzhenwen)

🚀 Thrilled to share our new paper at Tencent AI Lab!

We introduce DRP-IMO, a framework that proves 5 post-2000 IMO problems, a set where no prior open-source prover has solved a single one. 🤯

Link: arxiv.org/pdf/2507.06804

Details in thread👇
Dong Yu (@dong_yu_ai)

We have some interesting findings in our recent work "One Token to Fool LLM-as-a-Judge" (arxiv.org/abs/2507.08794) that will affect RLVR with generative reward models.

Wenhao Yu (@wyu_nd)

🗒️Have been exploring Agent-RL training over the past few months, particularly in GUI scenarios.

Here’s a summary of practical insights and lessons learned 🤔 from the perspective of an industry researcher, along with some reference papers.
DailyPapers (@huggingpapers)

Tencent AI Lab just released Cognitive Kernel-Pro on Hugging Face!

A fully open-source & free multi-module agent framework designed for deep research & agent foundation model training.

It achieves state-of-the-art performance among open-source agents on GAIA.
Tianqing Fang @ ACL24 (@tfang229)

Thank you for sharing our work! The code, data, and models have been open-sourced for the research community’s benefit: github.com/Tencent/Cognit… huggingface.co/CognitiveKerne… huggingface.co/datasets/Cogni…

DailyPapers (@huggingpapers)

Tencent AI Lab introduces R-Zero!

A groundbreaking framework enabling LLMs to self-evolve their reasoning capabilities from zero human-curated data, through an autonomous Challenger-Solver loop.
DailyPapers (@huggingpapers)

Microsoft just released Agent Lightning on Hugging Face.

Train ANY AI agent with reinforcement learning, with almost ZERO code changes!

A flexible and extensible framework that fully decouples agents from RL training.
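As a generic illustration of what decoupling an agent from RL training can mean (this is emphatically not Agent Lightning's actual API; every name below is hypothetical): the agent only logs its LLM calls as traces, and a separate trainer consumes those traces without ever touching agent code.

```python
from dataclasses import dataclass, field

@dataclass
class Trace:
    prompt: str
    response: str
    reward: float = 0.0

@dataclass
class TraceStore:
    traces: list = field(default_factory=list)
    def log(self, prompt, response):
        self.traces.append(Trace(prompt, response))

def run_agent(task, store):
    """Agent logic stays unmodified except that each LLM call is logged."""
    response = f"answer to {task}"          # placeholder for a real LLM call
    store.log(task, response)
    return response

def train_step(store, reward_fn):
    """The trainer sees only (prompt, response, reward) tuples."""
    for t in store.traces:
        t.reward = reward_fn(t.response)
    # ...hand the scored traces to an RL optimizer (PPO/GRPO, etc.) here.
    return sum(t.reward for t in store.traces) / max(len(store.traces), 1)

store = TraceStore()
run_agent("book a flight", store)
print(train_step(store, reward_fn=lambda r: float("answer" in r)))
```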
Wenhao Yu (@wyu_nd)

𝑳𝑳𝑴𝒔 can really 𝑺𝒆𝒍𝒇-𝑬𝒗𝒐𝒍𝒗𝒆, 𝒘𝒊𝒕𝒉𝒐𝒖𝒕 𝑯𝒖𝒎𝒂𝒏 𝑫𝒂𝒕𝒂!

-- One LLM, two roles: Challenger creates tasks, Solver answers them.
-- No data, no labels, just a base model that learns and improves itself!

We name it 𝑹-𝒛𝒆𝒓𝒐: arxiv.org/abs/2508.05004
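One plausible minimal reading of that loop, sketched below; the prompts, the majority-vote pseudo-labeling, and the `llm` stub are my assumptions for illustration, not the paper's exact reward design:

```python
from collections import Counter

def llm(prompt, temperature=1.0):
    """Stub for a single base model that serves both roles."""
    return "42"                              # a real call would sample the model

def challenger():
    """Role 1: the same model, prompted to invent a new task."""
    return llm("Propose one challenging math question.", temperature=1.0)

def solver(question, n_samples=8):
    """Role 2: the same model answers; majority vote gives a pseudo-label."""
    answers = [llm(f"Solve: {question}", temperature=0.7) for _ in range(n_samples)]
    label, votes = Counter(answers).most_common(1)[0]
    return label, votes / n_samples          # pseudo-label + self-consistency

for step in range(3):
    q = challenger()
    label, consistency = solver(q)
    # Training signal without human data: e.g., reward the Challenger for
    # questions of intermediate consistency (neither trivial nor hopeless)
    # and fine-tune the Solver on (q, label) pairs it is confident about.
    print(step, q[:40], label, round(consistency, 2))
```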
Shizhe Diao (@shizhediao)

This week, we open-sourced NVIDIA-Nemotron-Nano-v2-9B: our next-generation efficient hybrid model.

- 6× faster than Qwen3-8B on reasoning tasks.
- Retained long-context capability (8k → 262k trained, usable at 128k)
First true demonstration that reasoning models can be
Tianqing Fang @ ACL24 (@tfang229)

WebEvolver has been accepted to the #EMNLP2025 main conference. See you in Suzhou, China! Code: github.com/Tencent/SelfEv… Paper: arxiv.org/abs/2504.21024

Wenhao Yu (@wyu_nd)

New paper: VLMs can self-reward during RL training — no visual annotations needed!

-- Decompose VLM reasoning into visual vs. language parts
-- Prompt the same VLM without visual input for visual reward

We call it 𝐕𝐢𝐬𝐢𝐨𝐧-𝐒(𝐞𝐥𝐟)𝐑𝟏: arxiv.org/abs/2508.19652
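Roughly, that self-reward loop could look like the sketch below; `vlm`, the prompts, and the agreement-based reward are hypothetical stand-ins, not the paper's exact implementation:

```python
def vlm(prompt, image=None):
    """Stub for one VLM used in both passes (with and without the image)."""
    if image is not None:
        return {"caption": "a red cube left of a blue ball", "answer": "red"}
    return {"answer": "red"}                 # language-only pass over the caption

def self_reward(image, question):
    # Pass 1 (visual): answer and verbalize the perceived visual content.
    first = vlm(f"Describe the relevant visual facts, then answer: {question}",
                image=image)
    # Pass 2 (language-only): same model, no image, only the verbalized facts.
    second = vlm(f"Facts: {first['caption']}\nAnswer: {question}")
    # Reward: does the language-only pass recover the same answer?
    return 1.0 if second["answer"] == first["answer"] else 0.0

print(self_reward(image="img.png", question="What color is the cube?"))
```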