
MAI-Image-2.5 Edit API by Microsoft
MAI-Image-2.5 Edit (Image-to-Image)
MAI-Image-2.5 Edit is Microsoft's flagship image-to-image editing model, enabling precise, controllable edits to existing images through natural language instructions. Built on the same diffusion-based generative architecture as MAI-Image-2.5, it excels at targeted modifications (removing or replacing objects, updating in-image text, correcting artifacts, and adapting layouts) while preserving the original composition and visual identity. It was released on June 2, 2026.
Key Capabilities
- Surgical object editing — Make targeted object edits: remove, replace, recolor, or reposition specific elements without affecting the rest of the image.
- Inpainting & outpainting — Fill in missing regions, remove unwanted content, and extend image boundaries seamlessly.
- Text updates in images — Accurately update signage, labels, packaging text, and other embedded text within images.
- Artifact cleanup — Remove motion blur, noise, compression artifacts, and other visual defects.
- Layout adaptation — Adjust composition, crop, or reframe scenes while maintaining subject consistency.
- Face and identity consistency — Preserve facial features and identity across edits and modifications.
- Visual reasoning — Understands spatial relationships, lighting, and scene structure to apply edits that look natural and coherent.
- High-fidelity portraits — Generates or edits expressive, natural-looking portraits with accurate facial structure and lighting.
- Accurate text rendering — Maintains or improves text legibility in labels, posters, and signage.
- Product, branding & commercial design — Well suited for product imagery refinement, marketing visuals, and commercial creative workflows.
Pricing
Pricing is based on three components: the input text tokens in the prompt, a fixed input image fee, and a fixed per-output-image fee.
| SKU | Description | Unit Price |
|---|---|---|
| sku_input_1m_token | Price per 1M input (prompt) tokens | $5.00 |
| sku_input_image | Fixed fee per input image | $0.008 |
| sku_output_image | Fixed fee per generated/edited output image | $0.05 |
| sku_base | Combined base cost (input image + output image) | $0.058 |
Pricing Formula
cost = countTokens(prompt) / 1,000,000 × $5.00 + $0.05 + $0.008
Which simplifies to:
cost = countTokens(prompt) / 1,000,000 × $5.00 + $0.058
For most prompts (a few hundred tokens), the token cost is negligible and the effective cost is approximately $0.058 per edit.
Examples
| Prompt Length | Token Count | Token Cost | Base Fee | Total |
|---|---|---|---|---|
| Short (e.g., 50 tokens) | 50 | $0.000250 | $0.058 | ~$0.0583 |
| Medium (e.g., 500 tokens) | 500 | $0.002500 | $0.058 | $0.0605 |
| Long (e.g., 2,000 tokens) | 2,000 | $0.010000 | $0.058 | $0.0680 |
For detailed pricing configuration, see models/microsoft/mai/price/microsoft-mai-image-2.5-edit.json.
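The pricing formula above can be sketched as a small estimator. This is a minimal illustration, not an official billing tool: the function name `estimate_edit_cost` is hypothetical, it assumes one input image and one output image per edit, and it takes a token count as input because the tokenizer behind `countTokens` is not specified here.

```python
from decimal import Decimal

# Unit prices copied from the SKU table above (USD).
PRICE_PER_1M_INPUT_TOKENS = Decimal("5.00")
INPUT_IMAGE_FEE = Decimal("0.008")
OUTPUT_IMAGE_FEE = Decimal("0.05")


def estimate_edit_cost(prompt_tokens: int) -> Decimal:
    """Estimate the USD cost of one edit (hypothetical helper).

    Assumes exactly one input image and one output image.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    token_cost = (
        Decimal(prompt_tokens) / Decimal(1_000_000) * PRICE_PER_1M_INPUT_TOKENS
    )
    return token_cost + INPUT_IMAGE_FEE + OUTPUT_IMAGE_FEE


# Reproduce the three rows of the examples table.
for tokens in (50, 500, 2000):
    print(tokens, estimate_edit_cost(tokens))
```

`Decimal` is used instead of `float` so that sub-cent amounts add up exactly when many edits are summed.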
Best Use Cases
- Product retouching — Remove backgrounds, fix lighting, swap colors, or clean up blemishes in product images.
- Marketing asset updates — Update text, logos, or seasonal elements in existing creatives without a full reshoot.
- Photo restoration — Remove artifacts, clean up old photos, or fix quality issues.
- Content localization — Update in-image text across languages for different markets.
- Design iteration — Rapidly test layout, color, or composition variations on an existing design.
- E-commerce — Update product variants (colors, styles) from a single base image.
Pro Tips
- Be explicit about what to change and what to preserve (e.g., "change the red car to blue, keep the background unchanged").
- For text edits, quote the exact text to replace and the replacement string.
- Describe the desired result, not just what to remove — "make the background a white studio" is better than "remove the background".
- For consistent style across edits, keep the same prompt structure across iterations.
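One way to apply the tips above is to assemble every prompt from the same three parts: what to change, an optional quoted text swap, and what to preserve. The helper below is a sketch only. `build_edit_prompt` and its parameters are hypothetical names, and the exact phrasing it produces is an assumption rather than a format the model requires.

```python
from typing import Optional, Tuple


def build_edit_prompt(
    change: str,
    preserve: Optional[str] = None,
    replace_text: Optional[Tuple[str, str]] = None,
) -> str:
    """Assemble an edit instruction in a fixed order (hypothetical helper).

    Order: the change, then an optional text swap, then what to keep.
    """
    parts = [change.strip().rstrip(".") + "."]
    if replace_text is not None:
        old, new = replace_text
        # Quote both strings so the instruction carries the literal text.
        parts.append(f'Replace the text "{old}" with "{new}".')
    if preserve:
        parts.append(f"Keep {preserve.strip().rstrip('.')} unchanged.")
    return " ".join(parts)


print(build_edit_prompt("Change the red car to blue", preserve="the background"))
# prints: Change the red car to blue. Keep the background unchanged.
```

Keeping the structure fixed means successive iterations differ only in the parts that were deliberately changed.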
Technical Specifications
| Spec | Value |
|---|---|
| Model Developer | Microsoft AI |
| Release Date | June 2, 2026 |
| Input Format | Text prompt + image (JPEG or PNG, multipart/form-data) |
| Output Format | PNG image (base64-encoded) |
| Max Prompt Length | 32,000 tokens |
| Max Output Resolution | 1,048,576 total pixels (e.g., 1024×1024) |
| Supported Languages | English (primary) |
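The input and output formats in the table can be checked on the client side with the standard library alone. The sketch below is illustrative: all function names are hypothetical, the extension check is only a rough proxy for the JPEG/PNG requirement, and it assumes the base64 string has already been extracted from the API response, whose exact shape is not described here.

```python
import base64
import struct
from typing import Tuple

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
MAX_OUTPUT_PIXELS = 1_048_576  # from the specification table
SUPPORTED_INPUT_EXTENSIONS = (".jpg", ".jpeg", ".png")


def is_supported_input(filename: str) -> bool:
    """Return True if the extension suggests a JPEG or PNG file."""
    return filename.lower().endswith(SUPPORTED_INPUT_EXTENSIONS)


def decode_output_png(b64_data: str) -> Tuple[bytes, int, int]:
    """Decode a base64 PNG and return (raw_bytes, width, height)."""
    raw = base64.b64decode(b64_data)
    if len(raw) < 24 or raw[:8] != PNG_SIGNATURE or raw[12:16] != b"IHDR":
        raise ValueError("decoded data is not a PNG image")
    # The IHDR chunk stores width and height as big-endian
    # 32-bit integers in bytes 16..23 of the file.
    width, height = struct.unpack(">II", raw[16:24])
    return raw, width, height


def within_pixel_budget(width: int, height: int) -> bool:
    """Return True if width x height fits the documented pixel limit."""
    return width * height <= MAX_OUTPUT_PIXELS


# Demo with a synthetic PNG header (signature + start of IHDR only).
header = PNG_SIGNATURE + struct.pack(">I", 13) + b"IHDR" + struct.pack(">II", 1024, 1024)
_, w, h = decode_output_png(base64.b64encode(header).decode("ascii"))
print(w, h, within_pixel_budget(w, h))
```

In real use, the returned `raw_bytes` can be written directly to a `.png` file.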
Related Models
- MAI-Image-2.5 Text-to-Image — Same model family for generating images from text only.
- MAI-Image-2.5-Flash Edit — Faster, lower-cost image editing variant.
- MAI-Image-2.5-Flash Text-to-Image — Faster, lower-cost text-to-image variant.