Lei Cui (@wolfshowme)'s Twitter Profile
Lei Cui

@wolfshowme

NLP Researcher at Microsoft Research, retrogaming and console fan

ID: 87653950

Link: https://www.microsoft.com/en-us/research/people/lecu/
Joined: 05-11-2009 09:52:49

317 Tweets

305 Followers

349 Following

elvis (@omarsar0)

Visualization-of-Thought Elicits Spatial Reasoning in LLMs Inspired by a human cognitive capacity to imagine unseen worlds, this new work proposes Visualization-of-Thought (VoT) prompting to elicit spatial reasoning in LLMs. VoT enables LLMs to "visualize" their reasoning

AGI (@agi2025)

Visualization-of-Thought Elicits Spatial Reasoning in Large Language Models VoT prompting enhances LLMs' spatial reasoning, enabling them to outperform MLLMs in spatial tasks. arxiv.org/abs/2404.03622

Mohamed (@mekkcyber)

🚀 Exciting news! We’ve finally cracked the code for BitNet @huggingface! No pre-training needed! By just fine-tuning a Llama 3 8B, we've achieved great results, reaching performance close to Llama 1 & 2 7B models on key downstream tasks! Want to learn more? Check out the

𝚐𝔪𝟾𝚡𝚡𝟾 (@gm8xx8)

RedStone: Curating General, Code, Math, and QA Data for Large Language Models 🔗: github.com/microsoft/RedS… paper: arxiv.org/abs/2412.03398

Sachin Kumar (@sachinkr_ai)

RedStone: a data pipeline designed to create specialized large-scale datasets by leveraging the vast and diverse data from Common Crawl. This paper from Microsoft introduces REDSTONE, an innovative and scalable pipeline engineered to extract and process data from Common Crawl,

Microsoft Research (@msftresearch)

In this issue of Research Focus, we examine a new conversation segmentation method that delivers more coherent and personalized agent conversation, and we review efforts to improve MLLMs’ understanding of geologic maps. Check out the latest research: msft.it/6019q9k33

Lei Cui (@wolfshowme)

Boosting consistency in generative games! Our "Model as a Game" paper introduces novel LogicNet & external map modules for numerical/spatial coherence w/ low overhead. See results in Traveler, Pong, Pac-Man. #GenAI #GameDev #MaaG Preprint: arxiv.org/abs/2503.21172

FW (@thegenerality)

This is the first (small) large-scale training of native 1-bit LLMs / BitNet b1.58. More are coming soon including BitNet v2.

Rosinality (@rosinality)

Geometric-Mean Policy Optimization Using geometric mean for the importance ratio, similar to GSPO (arxiv.org/abs/2507.18071).

Lei Cui (@wolfshowme)

New paper: #GMPO beats GRPO by simply switching from arithmetic → geometric mean for token rewards! ✅ More stable training (no extreme importance sampling ratios) ✅ Better exploration (higher entropy throughout training) huggingface.co/papers/2507.20…
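
To make the arithmetic-versus-geometric-mean point concrete, here is a minimal sketch in plain Python. It is not the paper's implementation: the token-level importance ratios below are invented numbers and both helper functions are hypothetical names chosen for illustration. It only shows why a geometric mean is pulled less by a single extreme ratio than an arithmetic mean is.

```python
import math


def arithmetic_mean(values):
    """Plain average; one very large value moves it a lot."""
    return sum(values) / len(values)


def geometric_mean(values):
    """exp(mean(log v)); requires strictly positive values."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical token-level importance ratios for one sampled response.
# The last token has an extreme ratio (illustrative data, not from the paper).
ratios = [1.0, 0.9, 1.1, 8.0]

am = arithmetic_mean(ratios)
gm = geometric_mean(ratios)

print(f"arithmetic mean: {am:.3f}")
print(f"geometric mean:  {gm:.3f}")
```

For positive inputs the geometric mean never exceeds the arithmetic mean, so an outlier token moves the aggregated quantity less. How GMPO combines this with clipping and advantages is described in the paper and its code, not in this sketch.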

Remek Kinas (@kinasremek)

RL(LLM) - I recently wrote about GSPO. And today there are publications on -> GMPO - Geometric-Mean Policy Optimization, ARPO - Agentic Reinforced Policy Optimization, IRL - Inverse RL … Probably the most flourishing area of LLM training. Our Bielik-v3 is also already being trained with RL (GRPO,

fly51fly (@fly51fly)

[CL] Geometric-Mean Policy Optimization Y Zhao, Y Liu, J Liu, J Chen... [Microsoft Research] (2025) arxiv.org/abs/2507.20673

AI Native Foundation (@ainativef)

8. Geometric-Mean Policy Optimization 🔑 Keywords: Geometric-Mean Policy Optimization, Policy Updates, Token-Level Rewards, Multimodal Reasoning, AI Native 💡 Category: Natural Language Processing 🌟 Research Objective: - The research aims to stabilize policy updates in

DailyPapers (@huggingpapers)

Microsoft Research introduces Geometric-Mean Policy Optimization (GMPO)! A new RL method that stabilizes LLM reasoning by maximizing the geometric mean of token-level rewards. No more unstable updates!

DailyPapers (@huggingpapers)

GMPO outperforms GRPO by 4.1% on math & 1.4% on multimodal reasoning benchmarks. It achieves better stability and performance, moving us closer to reliable AI. Learn more & get the code: Paper: huggingface.co/papers/2507.20… Code: github.com/callsys/GMPO

DAIR.AI (@dair_ai)

Top AI Papers of The Week (July 28 - August 3):
- GEPA
- Graph-R1
- AlphaEarth
- Self-Evolving Agents
- Hierarchical Reasoning Model
- Efficient Attention Mechanisms
- Geometric-Mean Policy Optimization
Read on for more: