MAI-Image-2.5 Text-to-Image API by Microsoft

microsoft/mai-image-2.5/text-to-image
Text-to-image

MAI-Image-2.5 Text-to-Image

MAI-Image-2.5 is Microsoft's flagship text-to-image generation model, designed to create high-quality, visually rich images from natural language prompts. It uses a diffusion-based generative approach to progressively refine images, enabling strong alignment between the input text and the generated output. Released on June 2, 2026, it ranks among the top-performing image generation models globally.

Key Capabilities

  • Photorealistic image synthesis — Generates realistic imagery with consistent visual structure, accurate lighting, depth, and texture, suitable for concept visualization and professional content creation.
  • High-fidelity portraits — Produces expressive, natural-looking portraits with accurate facial structure, lighting, and skin texture.
  • Accurate text rendering — Significantly improved rendering of legible text within generated images, including labels, posters, packaging, and signage.
  • Visual reasoning — Reasons across objects, scene structure, lighting, scale, and spatial positioning to produce consistent outputs even from ambiguous or complex prompts.
  • Product, branding & commercial design — Well suited for product imagery, marketing visuals, brand assets, and commercial creative workflows.
  • Creative concept visualization — Translates abstract textual descriptions into visually coherent and imaginative outputs.

Pricing

Pricing is based on two components: the input text tokens in the prompt, and a fixed per-image output fee.

SKU                  Description                           Unit Price
sku_input_1m_token   Price per 1M input (prompt) tokens    $5.00
sku_output_image     Fixed fee per generated image         $0.05

Pricing Formula

cost = countTokens(prompt) / 1,000,000 × $5.00 + $0.05

For most prompts (a few hundred tokens), the token cost is negligible and the effective cost is approximately $0.05 per image.

Examples

Prompt Length               Token Count   Token Cost   Image Fee   Total
Short (e.g., 50 tokens)     50            ~$0.000250   $0.05       ~$0.0503
Medium (e.g., 500 tokens)   500           ~$0.002500   $0.05       ~$0.0525
Long (e.g., 2,000 tokens)   2,000         ~$0.010000   $0.05       ~$0.0600
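
The pricing formula and the example rows above can be reproduced with a few lines of Python. This is a minimal sketch, not an official calculator: the function name estimate_cost is hypothetical, the two prices are copied from the SKU table, and the token count is taken as an input because the tokenizer behind countTokens is not specified here.

```python
from decimal import Decimal

# Prices copied from the SKU table above.
PRICE_PER_MILLION_INPUT_TOKENS = Decimal("5.00")  # sku_input_1m_token
PRICE_PER_IMAGE = Decimal("0.05")                 # sku_output_image


def estimate_cost(prompt_tokens: int) -> Decimal:
    """Estimated USD cost of one generated image (hypothetical helper).

    Implements: cost = tokens / 1,000,000 * $5.00 + $0.05
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    token_cost = (
        Decimal(prompt_tokens) / Decimal(1_000_000)
        * PRICE_PER_MILLION_INPUT_TOKENS
    )
    return token_cost + PRICE_PER_IMAGE


if __name__ == "__main__":
    # Token counts from the example rows: short, medium, long prompts.
    for tokens in (50, 500, 2000):
        print(f"{tokens} tokens: ${estimate_cost(tokens):.6f}")
```

Decimal is used instead of float so that small token costs are not distorted by binary rounding when many requests are summed.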

For detailed pricing configuration, see models/microsoft/mai/price/microsoft-mai-image-2.5-text-to-image.json.

Best Use Cases

  • Marketing & Advertising — Generate product visuals, campaign imagery, and promotional assets.
  • Creative Content — Concept art, illustrations, book covers, and editorial imagery.
  • E-commerce — Product visualization and lifestyle photography alternatives.
  • Presentations — Custom visuals for slides, reports, and pitch decks.
  • Prototyping — Rapid visual mockups for design and UX workflows.
  • Signage & Packaging — Designs that require legible in-image text rendering.

Pro Tips

  • Be specific about lighting, perspective, and style (e.g., "soft golden-hour lighting", "top-down view", "photorealistic").
  • Mention the subject first, then environment, then style and mood.
  • For text in images, keep inscriptions short and clearly stated in the prompt.
  • Use aspect ratios suited to your use case (portrait for people, landscape for scenes).
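
The ordering advice above (subject first, then environment, then style and mood) can be captured in a small prompt builder. This is an illustrative sketch only: build_prompt and its parameters are hypothetical names, the placement of lighting and perspective between environment and style is an assumption, and the five-word inscription limit is an arbitrary stand-in for "keep inscriptions short".

```python
# Arbitrary illustrative threshold; the tips only say "short".
MAX_INSCRIPTION_WORDS = 5


def build_prompt(
    subject: str,
    environment: str = "",
    lighting: str = "",
    perspective: str = "",
    style: str = "",
    mood: str = "",
    inscription: str = "",
) -> str:
    """Assemble a prompt with the subject first, then environment,
    then lighting/perspective, then style and mood (hypothetical helper)."""
    if not subject.strip():
        raise ValueError("subject is required")
    parts = [subject, environment, lighting, perspective, style, mood]
    # Drop empty parts and join the rest in the recommended order.
    prompt = ", ".join(p.strip() for p in parts if p and p.strip())
    if inscription.strip():
        text = inscription.strip()
        if len(text.split()) > MAX_INSCRIPTION_WORDS:
            raise ValueError("keep in-image text short")
        # State the in-image text explicitly, in quotes.
        prompt += f', with the text "{text}"'
    return prompt


if __name__ == "__main__":
    print(build_prompt(
        "a ceramic coffee mug",
        environment="on a wooden cafe table",
        lighting="soft golden-hour lighting",
        perspective="top-down view",
        style="photorealistic",
        mood="calm",
        inscription="FRESH ROAST",
    ))
```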

Technical Specifications

Spec                    Value
Model Developer         Microsoft AI
Release Date            June 2, 2026
Input Format            Text prompt (natural language)
Output Format           PNG image (base64-encoded)
Max Prompt Length       32,000 tokens
Max Output Resolution   1,048,576 total pixels (e.g., 1024×1024)
Supported Languages     English (primary)
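
Two of the specifications above lend themselves to a quick client-side check: the 1,048,576-pixel output budget and the base64-encoded PNG output. The sketch below uses only the Python standard library. The helper names are hypothetical, and the base64 string is taken as a plain argument because the name of the response field that carries it is not documented here.

```python
import base64
import struct

MAX_TOTAL_PIXELS = 1_048_576  # from the specification table (e.g., 1024x1024)
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def fits_pixel_budget(width: int, height: int) -> bool:
    """True if width x height stays within the documented pixel budget."""
    return width > 0 and height > 0 and width * height <= MAX_TOTAL_PIXELS


def png_dimensions(png_bytes: bytes) -> tuple[int, int]:
    """Read (width, height) from the IHDR chunk at the start of a PNG file."""
    if (
        len(png_bytes) < 24
        or png_bytes[:8] != PNG_SIGNATURE
        or png_bytes[12:16] != b"IHDR"
    ):
        raise ValueError("payload is not a PNG image")
    # IHDR data starts at byte 16: 4-byte width, 4-byte height, big-endian.
    width, height = struct.unpack(">II", png_bytes[16:24])
    return width, height


def decode_image(b64_payload: str) -> tuple[bytes, tuple[int, int]]:
    """Decode a base64 PNG payload and return (raw bytes, (width, height))."""
    png_bytes = base64.b64decode(b64_payload)
    return png_bytes, png_dimensions(png_bytes)
```

A caller might check fits_pixel_budget before requesting a size, then pass the returned base64 string to decode_image and write the raw bytes to a .png file.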
