Zhiheng LYU (@zhihenglyu) 's Twitter Profile
Zhiheng LYU

@zhihenglyu

MMath Student @UWaterloo TIGER-Lab
Prev @HKUniversity @ETH_en @UCBerkeley

ID: 1529087177845616640

Link: http://cogito233.github.io | Joined: 24-05-2022 13:09:45

23 Tweets

117 Followers

337 Following

Cong Wei (@congwei1230) 's Twitter Profile Photo

🚀Thrilled to introduce ☕️MoCha: Towards Movie-Grade Talking Character Synthesis Please unmute to hear the demo audio. ✨We defined a novel task: Talking Characters, which aims to generate character animations directly from Natural Language and Speech input. ✨We propose

Kevin Yang (@kevinyang41) 's Twitter Profile Photo

Will be at NAACL next week, excited to share two of our papers: FACTTRACK: Time-Aware World State Tracking in Story Outlines arxiv.org/abs/2407.16347 THOUGHTSCULPT: Reasoning with Intermediate Revision and Search arxiv.org/abs/2404.05966 Shoutout to first authors Zhiheng LYU and

Dongfu Jiang (@dongfujiang) 's Twitter Profile Photo

Introducing VerlTool - a unified and easy-to-extend tool agent training framework based on verl.

Recently, there's been a growing trend toward training tool agents with reinforcement learning algorithms like GRPO and PPO. Representative works include SearchR1, ToRL, ReTool, and
Yuansheng Ni (@yuanshengni) 's Twitter Profile Photo

📢 Introducing VisCoder – fine-tuned language models for Python-based visualization code generation and feedback-driven self-debugging. Existing LLMs struggle to generate reliable plotting code: outputs often raise exceptions, produce blank visuals, or fail to reflect the

MiniMax (official) (@minimax__ai) 's Twitter Profile Photo

Day 1/5 of #MiniMaxWeek: We’re open-sourcing MiniMax-M1, our latest LLM — setting new standards in long-context reasoning.

- World’s longest context window: 1M-token input, 80k-token output
- State-of-the-art agentic use among open-source models
- RL at unmatched efficiency:
Dongfu Jiang (@dongfujiang) 's Twitter Profile Photo

🚀 Excited to finally share our paper on VerlTool, released today after months of work since the initial release in late May!

VerlTool is a high-efficiency, easy-to-use framework for Agentic RL with Tool use (ARLT), built on top of VeRL. It currently supports a wide range of
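The agentic tool-use setting the VerlTool tweets describe interleaves model actions with tool executions to collect trajectories for RL training. A rough sketch of one such rollout follows; `rollout`, `toy_policy`, and the dict-based action format are illustrative assumptions, not VerlTool's interface.

```python
def rollout(policy, tools, prompt, max_turns=4):
    """One agentic tool-use trajectory (schematic sketch of the ARLT
    setting): the policy alternates between tool calls and a final
    answer, with each tool's output fed back as the next observation."""
    trajectory, observation = [], prompt
    for _ in range(max_turns):
        action = policy(observation)
        trajectory.append(action)
        if action["type"] == "answer":
            break                    # terminal action ends the episode
        # dispatch the named tool; its output becomes the next observation
        observation = tools[action["tool"]](action["args"])
    return trajectory                # to be scored/advantaged by the RL loop

# Toy policy/tool pair: call a calculator once, then answer with its result.
def toy_policy(obs):
    if obs == "What is 2+3?":
        return {"type": "tool", "tool": "calc", "args": "2+3"}
    return {"type": "answer", "text": obs}

traj = rollout(toy_policy, {"calc": lambda e: str(eval(e))}, "What is 2+3?")
```

In an RL framework, the returned trajectory would be assigned a reward and used to update the policy with an algorithm such as GRPO or PPO.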
Wenhu Chen (@wenhuchen) 's Twitter Profile Photo

Totally agree. We experimented with image-only input for every task. The results are quite good. Check out our early paper PixelWorld: arxiv.org/abs/2501.19339

Wenhu Chen (@wenhuchen) 's Twitter Profile Photo

# NewDataset for VLMs After the release of VisualWebInstruct, we kept pushing its quality and adopting different strategies to make it as accurate as possible. Today, we are releasing a verified version of VisualWebInstruct under huggingface.co/datasets/TIGER…. It has around 100K

MiniMax (official) (@minimax__ai) 's Twitter Profile Photo

We’re open-sourcing MiniMax M2 — Agent & Code Native, at 8% Claude Sonnet price, ~2x faster
⚡ Global FREE for a limited time via MiniMax Agent & API
- Advanced Coding Capability: Engineered for end-to-end developer workflows. Strong capability on a wide range of applications
Yuansheng Ni (@yuanshengni) 's Twitter Profile Photo

📢 Introducing VisCoder2: Building Multi-Language Visualization Coding Agents!

Existing LLMs often fail in practical workflows due to limited language coverage, unreliable execution, and a lack of iterative correction mechanisms.

We introduce 3 resources to address this:
Yuntian Deng (@yuntiandeng) 's Twitter Profile Photo

My student Wentao reproduced Self-Adapting LMs and wrote a blog on lessons learned. Highly recommended for anyone adapting LMs! He's also looking for a summer internship. He has 2 first-author EMNLP papers after just one year! 🔗aggregativeqa.com/dataview 🔗interactivetraining.ai

Hanqi Yan (@yan_hanqi) 's Twitter Profile Photo

🚀 Thrilled to announce that I’ll be attending EMNLP 2025 (4Nov-9Nov) in Suzhou, China! 🇨🇳✨ I’ll be showcasing our latest research from #KCLNLP on implicit Chain-of-Thoughts (CoTs) and an AI Scientist demo system 🤖🧠 📘 CODI: Compressing Chain-of-Thought into Continuous Space

Jiarui Liu (@jiarui_liu_) 's Twitter Profile Photo

Our EMNLP 2025 paper "Synthetic Socratic Debates" is presenting today in Suzhou! 📍 Poster Session 1 🕚 Nov 5, 11:00 AM (Beijing) Come chat about how LLM personas shape moral reasoning & persuasion! 🔗 arxiv.org/abs/2506.12657

Lingming Zhang (@lingmingzhang) 's Twitter Profile Photo

🤯🤯🤯 Gemini 3 Pro + Live-SWE-agent hits 77.4% on SWE-bench Verified, beating ALL existing models, including Claude 4.5!!

🤖 Live-SWE-agent is the first live software agent that autonomously self-evolves on the fly — and it even outperforms the manually engineered scaffold