Microsoft's New MAI Models Cut the Cost of Building AI Automations

June 7, 2026

At Build 2026 in early June, Microsoft introduced its own family of frontier models. The move matters less for the leaderboard than for what it does to the price of building AI into a business. Two releases stand out for operators: MAI-Thinking-1 for reasoning and MAI-Code-1-Flash for everyday coding.

What Microsoft launched

  1. MAI-Thinking-1, described as Microsoft's first in-house reasoning model: a mixture-of-experts model with 35B active parameters and a 256K context window. Microsoft reports 97 percent on AIME 25 and 53 percent on SWE-Bench Pro, a hard coding benchmark on which that score places it near Opus 4.6, and says human raters preferred it over Claude Sonnet 4.6 in blind comparisons. It is in private preview through Microsoft Foundry.
  2. MAI-Code-1-Flash, a fast coding model that Microsoft says solves harder problems with up to 60 percent fewer tokens and beats Claude Haiku 4.5 on price-to-performance. It is rolling out to about 10 percent of users initially and can be routed automatically when you pick Auto in the VS Code model picker.
  3. A broader set of seven new models across image, voice, transcription, thinking, and coding, with availability extending beyond Microsoft to Fireworks AI, Baseten, and OpenRouter.

What it means for operators

You do not need to care which lab wins to benefit from this. More capable models competing on token cost is good news for anyone running automations, because the same workflow gets cheaper or the budget buys more volume. The practical lesson is not to pick a single model and marry it. The durable pattern is multi-model routing: send simple, high-volume steps to a cheap fast model, reserve a top reasoning model for the hard calls, and measure cost and quality per task rather than per vendor. That keeps you flexible when the next release shifts the price curve again, which on the current cadence will be within weeks.
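
The routing pattern described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production router: the tier names are placeholders rather than real model IDs, the per-million-token prices are invented for the example, the `difficulty` label is assumed to come from your own upstream logic, and `call_model` is a stub standing in for whatever API client you actually use.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class ModelTier:
    name: str                      # placeholder label, not a real model ID
    usd_per_million_tokens: float  # hypothetical price, for illustration only


# Assumed two-tier setup: one cheap fast model, one expensive reasoning model.
FAST = ModelTier("fast-cheap-model", 0.50)
REASONING = ModelTier("top-reasoning-model", 10.00)


def route(difficulty: str) -> ModelTier:
    """Send hard calls to the reasoning tier, everything else to the fast tier."""
    return REASONING if difficulty == "hard" else FAST


def call_model(model: ModelTier, prompt: str) -> tuple[str, int]:
    """Stub for a real API call. Returns (output, tokens_used)."""
    tokens_used = len(prompt.split()) * 4  # crude stand-in for real token counts
    return f"[{model.name}] response", tokens_used


class TaskLedger:
    """Tracks cost and quality per task type rather than per vendor."""

    def __init__(self) -> None:
        self.cost = defaultdict(float)
        self.runs = defaultdict(int)
        self.passes = defaultdict(int)

    def record(self, task: str, model: ModelTier, tokens: int, passed: bool) -> None:
        self.cost[task] += tokens / 1_000_000 * model.usd_per_million_tokens
        self.runs[task] += 1
        self.passes[task] += int(passed)

    def summary(self, task: str) -> dict:
        runs = self.runs[task]
        return {
            "runs": runs,
            "pass_rate": self.passes[task] / runs if runs else 0.0,
            "usd_per_run": self.cost[task] / runs if runs else 0.0,
        }


def run_step(ledger: TaskLedger, task: str, difficulty: str, prompt: str,
             check: Callable[[str], bool]) -> str:
    """Route one workflow step, call the chosen model, and log cost and quality."""
    model = route(difficulty)
    output, tokens = call_model(model, prompt)
    ledger.record(task, model, tokens, check(output))
    return output


if __name__ == "__main__":
    ledger = TaskLedger()
    run_step(ledger, "tag_ticket", "easy", "Classify this support ticket", lambda out: True)
    run_step(ledger, "review_contract", "hard", "Flag risky clauses in this contract", lambda out: True)
    for task in ("tag_ticket", "review_contract"):
        print(task, ledger.summary(task))
```

Because the ledger is keyed by task rather than by model, swapping in a newly released model only means changing a `ModelTier` entry; the per-task cost and pass-rate history stays comparable across the change.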

Choosing and wiring the right model per task is exactly the engineering we do. If you want automations built to be cost-efficient and model-agnostic, see our AI automation service or hire an AI engineer from our team.

Primary source: Microsoft AI. Independent coverage: Neowin.

Want automations built on the most cost-efficient AI?

We design, build, and run it for you, integrated with the tools you already use. Free audit in 24 hours.

Get Your Free Audit

Frequently Asked Questions

What are Microsoft's MAI models?

They are Microsoft's own AI models announced at Build 2026. MAI-Thinking-1 is a reasoning model (a mixture-of-experts model with 35B active parameters and a 256K context window) in private preview through Microsoft Foundry. MAI-Code-1-Flash is a fast coding model aimed at everyday developer workflows.

How can I access the MAI models?

MAI-Thinking-1 is in private preview via Microsoft Foundry, and MAI-Code-1-Flash is rolling out gradually, including through the Auto option in the VS Code model picker. Microsoft says the models are also available through Fireworks AI, Baseten, and OpenRouter.

Why does this lower the cost of AI automations?

More strong models competing on token cost tends to push prices down. If a workflow can run on a cheaper, faster model for routine steps, the same automation costs less or handles more volume for the same budget.
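
To make that arithmetic concrete, here is a small sketch. Every number in it is hypothetical and chosen only to illustrate the relationship; none is a published price for any model. `Decimal` is used so that the budget division does not pick up floating-point rounding.

```python
from decimal import Decimal

MILLION = Decimal(1_000_000)


def cost_per_run(tokens_per_run: int, usd_per_million_tokens: Decimal) -> Decimal:
    """Cost of a single workflow run at a given token price."""
    return Decimal(tokens_per_run) * usd_per_million_tokens / MILLION


def monthly_cost(runs: int, tokens_per_run: int, usd_per_million_tokens: Decimal) -> Decimal:
    """Same workflow, same volume: what does a month cost?"""
    return runs * cost_per_run(tokens_per_run, usd_per_million_tokens)


def runs_within_budget(budget_usd: Decimal, tokens_per_run: int,
                       usd_per_million_tokens: Decimal) -> int:
    """Same budget: how many runs does it buy?"""
    return int(budget_usd // cost_per_run(tokens_per_run, usd_per_million_tokens))


if __name__ == "__main__":
    # Hypothetical workflow: 5,000 runs per month at 2,000 tokens per run.
    premium = Decimal("10.00")  # assumed USD per million tokens
    cheap = Decimal("0.50")     # assumed USD per million tokens

    print(monthly_cost(5000, 2000, premium))               # cost on the premium model
    print(monthly_cost(5000, 2000, cheap))                 # cost on the cheap model
    print(runs_within_budget(Decimal("100"), 2000, cheap)) # volume the old budget now buys
```

With these assumed prices, the workflow drops from 100 dollars to 5 dollars a month, or the original 100-dollar budget covers 100,000 runs instead of 5,000. Real savings depend on actual prices and on how many steps can genuinely move to the cheaper model without a quality loss.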

Should I move all my automations to the new models?

Not as a blanket move. The reliable approach is multi-model routing: use a cheap fast model for simple, high-volume tasks and reserve a top reasoning model for the hard ones, measuring cost and quality per task rather than committing to one vendor.
