
MAI-Image-2.5-Flash Edit API by Microsoft
MAI-Image-2.5-Flash Edit (Image-to-Image)
MAI-Image-2.5-Flash Edit is Microsoft's fast, cost-optimized image-to-image editing model. It applies precise, controllable edits to existing images from natural language instructions, at significantly lower cost than the standard MAI-Image-2.5 Edit. It uses the same diffusion-based generative architecture as the standard model and excels at surgical, targeted modifications (removing or replacing objects, updating in-image text, correcting artifacts, and adapting layouts) while preserving the original composition and visual identity. Released on June 2, 2026.
Key Capabilities
- Surgical object editing — Remove, replace, recolor, or reposition specific elements without affecting the rest of the image.
- Inpainting & outpainting — Fill in missing regions, remove unwanted content, and extend image boundaries seamlessly.
- Text updates in images — Accurately update signage, labels, packaging text, and other embedded text within images.
- Artifact cleanup — Remove motion blur, noise, compression artifacts, and other visual defects.
- Layout adaptation — Adjust composition, crop, or reframe scenes while maintaining subject consistency.
- Face and identity consistency — Preserve facial features and identity across edits and modifications.
- Visual reasoning — Understands spatial relationships, lighting, and scene structure to apply edits that look natural and coherent.
- High-fidelity portraits — Generates or edits expressive, natural-looking portraits with accurate facial structure and lighting.
- Accurate text rendering — Maintains or improves text legibility in labels, posters, and signage.
- Product, branding & commercial design — Well suited for product imagery refinement, marketing visuals, and commercial creative workflows.
Flash vs. Standard
| Feature | MAI-Image-2.5 Edit | MAI-Image-2.5-Flash Edit |
|---|---|---|
| Base cost (input + output) | $0.058 / edit | $0.038 / edit |
| Speed | Standard | Faster |
| Quality | Maximum fidelity | High quality, optimized |
| Best for | Premium production | High-volume, cost-sensitive |
Flash is the recommended choice for high-volume editing workflows, rapid iteration, and scenarios where speed and cost efficiency take priority over absolute maximum fidelity.
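To make the volume trade-off concrete, the following is a minimal sketch that compares base costs for a batch of edits, using only the per-edit base costs from the table above. Prompt-token charges are excluded, and the function names are hypothetical helpers, not part of any SDK.

```python
from decimal import Decimal

# Base cost per edit (input image + output image), taken from the comparison table.
STANDARD_BASE = Decimal("0.058")
FLASH_BASE = Decimal("0.038")


def batch_base_cost(num_edits: int, base_per_edit: Decimal) -> Decimal:
    """Base cost of a batch of edits, excluding prompt-token charges."""
    if num_edits < 0:
        raise ValueError("num_edits must be non-negative")
    return base_per_edit * num_edits


def flash_savings(num_edits: int) -> Decimal:
    """Base-cost difference between Standard and Flash for the same batch."""
    return batch_base_cost(num_edits, STANDARD_BASE) - batch_base_cost(num_edits, FLASH_BASE)


if __name__ == "__main__":
    for n in (100, 10_000):
        print(n, batch_base_cost(n, STANDARD_BASE), batch_base_cost(n, FLASH_BASE), flash_savings(n))
```

At 10,000 edits the base cost is $580 for Standard and $380 for Flash, a difference of $200.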
Parameters
| Parameter | Required | Description |
|---|---|---|
| prompt | Yes | Natural language instruction describing the edit to apply (max 32,000 tokens) |
| image | Yes | Input image to edit (JPEG or PNG format, multipart/form-data) |
Size constraint: Total pixel count (width × height) of the output must not exceed 1,048,576. Either dimension may exceed 1024 as long as the total remains within the limit.
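The size constraint is a limit on the product of the two dimensions, not on either dimension alone. A minimal sketch of a pre-flight check follows; the helper names are hypothetical, and only the 1,048,576-pixel limit comes from this page.

```python
MAX_OUTPUT_PIXELS = 1_048_576  # width x height limit stated in the size constraint


def fits_pixel_budget(width: int, height: int) -> bool:
    """Return True if width x height stays within the output pixel limit."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    return width * height <= MAX_OUTPUT_PIXELS


def max_height_for_width(width: int) -> int:
    """Largest height that keeps the total pixel count within the limit."""
    if width <= 0:
        raise ValueError("width must be positive")
    return MAX_OUTPUT_PIXELS // width


if __name__ == "__main__":
    print(fits_pixel_budget(1024, 1024))  # square image exactly at the limit
    print(fits_pixel_budget(2048, 512))   # one side above 1024, total still at the limit
    print(fits_pixel_budget(1280, 1024))  # 1,310,720 pixels, over the limit
    print(max_height_for_width(1280))
```

For example, 2048×512 is accepted because its total is exactly 1,048,576 pixels, while 1280×1024 is rejected; at a width of 1280 the tallest allowed output is 819 pixels.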
Pricing
Pricing is based on three components: the input text tokens in the prompt, a fixed input image fee, and a fixed per-output-image fee.
| SKU | Description | Unit Price |
|---|---|---|
| sku_input_1m_token | Price per 1M input (prompt) tokens | $5.00 |
| sku_input_image | Fixed fee per input image | $0.008 |
| sku_output_image | Fixed fee per generated/edited output image | $0.03 |
| sku_base | Combined base cost (input image + output image) | $0.038 |
Pricing Formula
cost = countTokens(prompt) / 1,000,000 × $5.00 + $0.03 + $0.008
Which simplifies to:
cost = countTokens(prompt) / 1,000,000 × $5.00 + $0.038
For most prompts (a few hundred tokens), the token cost is small (under $0.003 for up to 500 tokens), so the effective cost is about $0.038 per edit, roughly 34% below the $0.058 base cost of the standard MAI-Image-2.5 Edit.
Examples
| Prompt Length | Token Count | Token Cost | Base Fee | Total |
|---|---|---|---|---|
| Short (e.g., 50 tokens) | 50 | $0.000250 | $0.038 | ~$0.0383 |
| Medium (e.g., 500 tokens) | 500 | $0.002500 | $0.038 | $0.0405 |
| Long (e.g., 2,000 tokens) | 2,000 | $0.010000 | $0.038 | $0.0480 |
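The formula and the example rows above can be reproduced with a few lines of Python. This is a sketch under one assumption: the caller already knows the prompt's token count, since this page does not specify the tokenizer behind `countTokens`. The function name `edit_cost` is hypothetical.

```python
from decimal import Decimal

# Unit prices from the SKU table.
PRICE_PER_1M_TOKENS = Decimal("5.00")
INPUT_IMAGE_FEE = Decimal("0.008")
OUTPUT_IMAGE_FEE = Decimal("0.03")


def edit_cost(prompt_tokens: int) -> Decimal:
    """Cost of one edit: prompt-token charge plus the two fixed image fees."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    token_cost = Decimal(prompt_tokens) / Decimal(1_000_000) * PRICE_PER_1M_TOKENS
    return token_cost + INPUT_IMAGE_FEE + OUTPUT_IMAGE_FEE


if __name__ == "__main__":
    for tokens in (50, 500, 2_000):
        print(tokens, edit_cost(tokens))
```

Using `Decimal` rather than `float` keeps the small per-token amounts exact, which matters when costs are summed over large batches.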
For detailed pricing configuration, see models/microsoft/mai/price/microsoft-mai-image-2.5-flash-edit.json.
Best Use Cases
- High-volume retouching — Batch processing of product image updates, background removals, or color swaps.
- Rapid iteration — Quickly test multiple edit variations before selecting the final result.
- Real-time editing pipelines — Interactive tools where fast response times are critical.
- Content localization — Update in-image text across languages at scale.
- E-commerce at scale — Update product variants (colors, styles) from a single base image in bulk.
- Development & testing — Test editing prompts and pipelines without incurring full production costs.
Pro Tips
- Be explicit about what to change and what to preserve (e.g., "change the red car to blue, keep the background unchanged").
- For text edits, quote the exact text to replace and the replacement string.
- Describe the desired result, not just what to remove — "make the background a white studio" is better than "remove the background".
- For consistent style across edits, keep the same prompt structure across iterations.
- For maximum quality on critical edits, consider upgrading to MAI-Image-2.5 Edit for final production outputs.
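The first, second, and fourth tips can be combined into a small prompt template so that every request states both the change and what must stay the same. The helpers below are hypothetical and only assemble strings; the exact wording that works best for the model is an assumption to verify against real outputs.

```python
def build_edit_prompt(change: str, preserve: str = "") -> str:
    """Compose an instruction that names the change and what to keep unchanged."""
    change = change.strip().rstrip(".")
    if not change:
        raise ValueError("change must not be empty")
    prompt = change[0].upper() + change[1:]
    preserve = preserve.strip().rstrip(".")
    if preserve:
        prompt += f", keep {preserve} unchanged"
    return prompt + "."


def build_text_replace_prompt(old_text: str, new_text: str, where: str = "") -> str:
    """Compose a text-edit instruction that quotes both the old and the new string."""
    location = f" on {where.strip()}" if where.strip() else ""
    return (
        f'Replace the text "{old_text}"{location} with "{new_text}". '
        "Keep the font, color, and placement unchanged."
    )


if __name__ == "__main__":
    print(build_edit_prompt("change the red car to blue", "the background"))
    print(build_text_replace_prompt("SALE", "SOLD OUT", "the storefront sign"))
```

Reusing one template across iterations keeps the prompt structure stable, so differences between outputs come from the edit itself rather than from phrasing.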
Technical Specifications
| Spec | Value |
|---|---|
| Model Developer | Microsoft AI |
| Release Date | June 2, 2026 |
| Input Format | Text prompt + image (JPEG or PNG, multipart/form-data) |
| Output Format | PNG image (base64-encoded) |
| Max Prompt Length | 32,000 tokens |
| Max Output Resolution | 1,048,576 total pixels (e.g., 1024×1024) |
| Supported Languages | English (primary) |
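The input and output formats above imply two client-side steps: packing the prompt and image as multipart/form-data, and decoding the base64 PNG that comes back. The sketch below covers only those two steps with the standard library. The endpoint URL, authentication, and the response envelope (for example, which JSON field carries the image) are not specified on this page, so they are left out; the function names are hypothetical, and the form field names `prompt` and `image` are taken from the parameter table.

```python
import base64
import uuid

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first 8 bytes of every PNG file
JPEG_PREFIX = b"\xff\xd8\xff"         # start-of-image marker of a JPEG file


def sniff_image_type(data: bytes) -> str:
    """Return the MIME type for a JPEG or PNG payload, based on its leading bytes."""
    if data.startswith(PNG_SIGNATURE):
        return "image/png"
    if data.startswith(JPEG_PREFIX):
        return "image/jpeg"
    raise ValueError("input image must be JPEG or PNG")


def build_multipart(prompt: str, image: bytes, filename: str = "input") -> tuple[bytes, str]:
    """Build a multipart/form-data body and its Content-Type header value."""
    boundary = uuid.uuid4().hex
    mime = sniff_image_type(image)
    head = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="prompt"\r\n\r\n'
        f"{prompt}\r\n"
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="image"; filename="{filename}"\r\n'
        f"Content-Type: {mime}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("ascii")
    return head + image + tail, f"multipart/form-data; boundary={boundary}"


def decode_png(b64_payload: str) -> bytes:
    """Decode a base64 string and check that the result starts with the PNG signature."""
    data = base64.b64decode(b64_payload, validate=True)
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("decoded payload is not a PNG")
    return data
```

The returned body and header value can be passed to any HTTP client; checking the PNG signature after decoding catches truncated or mislabeled responses before they are written to disk.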
Related Models
- MAI-Image-2.5-Flash Text-to-Image — Same Flash model family for generating images from text only.
- MAI-Image-2.5 Edit — Full-quality image editing variant.
- MAI-Image-2.5 Text-to-Image — Full-quality text-to-image variant.

















