Mojan Javaheripi (@mojan_jp)'s Twitter Profile
Mojan Javaheripi

@mojan_jp

Senior Researcher @MSFTResearch working on physics of LLMs. Phi pretraining. CE PhD from @UCSanDiego

ID: 1199219580998045696

Link: http://acsweb.ucsd.edu/~mojavahe · Joined: 26-11-2019 06:53:34

28 Tweets

279 Followers

125 Following

Sebastien Bubeck (@sebastienbubeck):

Enjoy everyone! (And remember it's a base model so you might have to play around with your prompts; if you want it to follow instructions you can try the format "Instruct:... Output:") huggingface.co/microsoft/phi-2
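
A minimal sketch of that "Instruct: ... Output:" prompt format, assuming the Hugging Face transformers library and the model id from the linked page; the example prompt and generation settings are illustrative, not official.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/phi-2"  # repo id taken from the huggingface.co link above
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float32)

# phi-2 is a base model with no chat template, so the instruction is framed as plain text.
prompt = "Instruct: Explain in one sentence why data quality matters for small language models.\nOutput:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))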

Sebastien Bubeck (@sebastienbubeck):

phi-3 is here, and it's ... good :-). I made a quick short demo to give you a feel of what phi-3-mini (3.8B) can do. Stay tuned for the open weights release and more announcements tomorrow morning! (And ofc this wouldn't be complete without the usual table of benchmarks!)

Peter Lee (@peteratmsr):

🚀 Phi-4 is here! A small language model that performs as well as (and often better than) large models on certain types of complex reasoning tasks such as math. Useful for us in Microsoft Research, and available now for all researchers on Azure AI Foundry! aka.ms/phi4blog

Sebastien Bubeck (@sebastienbubeck):

Surprise #NeurIPS2024 drop for y'all: phi-4 available open weights and with amazing results!!! Tl;dr: phi-4 is in the Llama 3.3-70B category (win some, lose some) with 5x fewer parameters, and notably outperforms on pure reasoning like GPQA (56%) and MATH (80%).

Shital Shah (@sytelus):

Are you ready for an early Christmas present from our team at Microsoft Research? Introducing the most powerful smol model ever built in the world! Welcome to Phi-4! 👇

Mojan Javaheripi (@mojan_jp):

Excited to see our SLM work, Phi, mentioned in MIT Technology Review as one of the top 10 breakthrough technologies! 😊 technologyreview.com/2025/01/03/110…

Ahmed Awadallah (@ahmedhawadallah):

Introducing Phi-4-reasoning, adding reasoning models to the Phi family of SLMs.

The model is trained with both supervised finetuning (using a carefully curated dataset of reasoning demonstrations) and Reinforcement Learning.

📌 Competitive results on reasoning benchmarks with…
Suriya Gunasekar (@suriyagnskr):

In all, we SFT’ed on ~1.4M reasoning traces on select prompts and further RL'd on a small ~6k-sample set. Despite the relatively long SFT on select domains, we see broad generalization across domains and no degradation in general-purpose performance. On the contrary....🔁📚
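
A rough sketch of the supervised-finetuning stage described above, assuming a Hugging Face causal LM; the checkpoint id, dataset fields, and hyperparameters are placeholder assumptions rather than the actual Phi-4-reasoning recipe, and the RL stage is only noted in a comment.

import torch
from torch.utils.data import DataLoader
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/phi-4"  # assumed base checkpoint, for illustration only
tokenizer = AutoTokenizer.from_pretrained(model_id)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_id)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

# Each record pairs a prompt with a curated reasoning demonstration (hypothetical fields).
traces = [
    {"prompt": "Q: What is 12 * 13?", "trace": "Step 1: 12 * 13 = 156.\nAnswer: 156"},
]

def collate(batch):
    texts = [ex["prompt"] + "\n" + ex["trace"] + tokenizer.eos_token for ex in batch]
    enc = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    enc["labels"] = enc["input_ids"].clone()  # standard causal-LM loss over the full sequence
    return enc

loader = DataLoader(traces, batch_size=1, collate_fn=collate)
model.train()
for batch in loader:
    loss = model(**batch).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

# A second, reinforcement-learning stage on a small prompt set (~6k samples per the tweet
# above) would follow; it is omitted here.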

Sebastien Bubeck (@sebastienbubeck):

wow phi-4-reasoning with its mere 14B parameters beats deepseek-R1 and its 671B parameters (on AIME25). So data quality matters you tell me? 😁

Ece Kamar (@ecekamar):

Excited to share our latest Phi model, Phi-4-reasoning, a small but powerful model that matches the performance of much larger reasoning models, up to DeepSeek R1. Here is the report for new insights into training reasoning models and evaluating them: lnkd.in/g_Pz5JQA

Mojan Javaheripi (@mojan_jp):

Great to see the additive dataset methodology we proposed in Phi-4-reasoning adopted in open-r1. Tl;dr: optimize the data mixture per reasoning domain, then combine the mixtures in the final run for generalized performance. This is a game changer for reducing data ablation costs.
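
A small sketch of that additive methodology as I read it: pick the best data mixture per reasoning domain in isolation, then combine the per-domain winners for the final run. All dataset names and the evaluate() function below are hypothetical placeholders.

from itertools import chain

# Candidate data mixtures to ablate for each reasoning domain (hypothetical names).
candidate_mixtures = {
    "math": [["math_core"], ["math_core", "math_competition"]],
    "code": [["code_core"], ["code_core", "code_contests"]],
    "science": [["science_qa"], ["science_qa", "gpqa_style"]],
}

def evaluate(domain, mixture):
    """Hypothetical small-scale ablation: train briefly on `mixture` and return
    a validation score for `domain`. Here it is just a placeholder."""
    return len(mixture)

# Stage 1: optimize the mixture for each domain independently.
best_per_domain = {
    domain: max(options, key=lambda mix: evaluate(domain, mix))
    for domain, options in candidate_mixtures.items()
}

# Stage 2: additively combine the per-domain winners for the final training run,
# relying on the combined mixture to generalize across domains.
final_mixture = sorted(set(chain.from_iterable(best_per_domain.values())))
print(final_mixture)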