Phi-4 Reasoning Models


Microsoft has unveiled its newest models: Phi-4-reasoning, Phi-4-reasoning-plus, and Phi-4-mini-reasoning.

Figure: Accuracy of models across general-purpose benchmarks for long input context QA (FlenQA), instruction following (IFEval), coding (HumanEvalPlus), knowledge and language understanding (MMLUPro), safety detection (ToxiGen), and other general skills (ArenaHard and PhiBench).

It is used in core experiences like Click to Do, providing useful text intelligence tools for any content on your screen, and is available as developer APIs to be readily integrated into applications. It is already being used in several productivity applications such as Outlook, which offers its Copilot summary features offline.

The Phi family of models has adopted a robust safety post-training approach, leveraging a combination of Supervised Fine-Tuning (SFT), Direct Preference Optimization (DPO), and Reinforcement Learning from Human Feedback (RLHF) techniques.

