
Midjourney V8.1 Blend API by MIDJOURNEY
Midjourney V8.1 blends two to five input images into four fused results, with an optional guiding prompt and native 2K HD.
1. Introduction
Midjourney V8.1 Blend (midjourney/v8.1/blend) merges two to five input images into four new images that fuse the concepts, subjects, and styles of the originals. It is Midjourney's tool for combining multiple visual ideas into a single coherent result — for example, fusing a character with an environment, or mixing the palette of one image with the composition of another.
It is part of the Midjourney V8.1 family exposed through this API:
- `midjourney/v8.1/text-to-image`: generate from a text prompt
- `midjourney/v8.1/image-to-image`: generate guided by one input image
- `midjourney/v8.1/blend`: fuse 2–5 images (this model)
- `midjourney/v8.1/style-transfer`: restyle an image, preserving composition
- `midjourney/v8.1/remove-background`: isolate the subject on transparency
- `midjourney/v8.1/image-to-video`: animate an image into a short clip
Midjourney V8.1 is built by Midjourney, Inc., an independent, self-funded San Francisco research lab founded in August 2021 by David Holz. V8.1 is the company's fastest model to date and produces high-aesthetic imagery at native 2K resolution.
2. Key Features & Innovations
- Multi-image fusion: Combine 2–5 images so their concepts and styles merge into a new creative expression.
- Optional guiding prompt: Midjourney's native `/blend` command is image-only, but this endpoint runs through the image-prompt pathway, so you may add an optional text prompt to nudge the fusion toward a theme. Omit it for a pure visual blend.
- Native 2K HD: With `hd` enabled, results render directly at 2048px without a separate upscaling pass.
- ~4–5× faster generation than earlier Midjourney versions (Midjourney-stated).
- Aesthetic controls: `stylize`, `chaos`, and `weird` tune how strongly the house aesthetic, variety, and unconventionality are applied to the blend.
- Four results per request: Each task returns a 4-image grid of fused candidates.
3. Parameters & Usage
| Parameter | Description |
|---|---|
| `images` (required) | Array of 2 to 5 input images to blend, given as URLs or uploads. URLs must be publicly reachable over HTTPS. |
| `prompt` | Optional text to steer the blend. Max 1024 characters. (Native Midjourney `/blend` takes no text; this endpoint accepts one via the image-prompt path.) |
| `aspect_ratio` | Output aspect ratio (e.g. 1:1, 16:9, 9:16). |
| `hd` | Enable native 2K (2048px) generation. |
| `stylize` | Strength of Midjourney's default aesthetic (0–1000). |
| `chaos` | Variation/unpredictability across the four results (0–100). |
| `weird` | Unconventionality of the output (0–3000). |
| `quality` | Detail level; V8.1 supports 1 (default) or 4 (more detail, same price). |
| `seed` | Fixed seed for reproducible results. |
Tips: For the most predictable results, use input images that share the aspect ratio you want in the output. Two or three images tend to blend more legibly than five. Each image contributes both subject and style, so balance your inputs accordingly.
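As a rough illustration of how the parameters above fit together, the following Python sketch validates inputs against the documented limits and assembles a request body. The function name `build_blend_payload`, the flat JSON layout, and the example URLs are assumptions made for illustration; this page does not specify the request schema, the endpoint URL, or the authentication scheme, so the sketch stops short of sending anything. It also covers only the URL form of `images`, not uploads.

```python
import json
from typing import Any, Dict, Optional, Sequence

# Limits copied from the parameter table in this section.
_RANGES = {"stylize": (0, 1000), "chaos": (0, 100), "weird": (0, 3000)}
_QUALITY_VALUES = (1, 4)
MAX_PROMPT_CHARS = 1024


def build_blend_payload(
    images: Sequence[str],
    prompt: Optional[str] = None,
    aspect_ratio: Optional[str] = None,
    hd: bool = False,
    seed: Optional[int] = None,
    quality: int = 1,
    **aesthetic: int,
) -> Dict[str, Any]:
    """Validate blend inputs and return a JSON-serializable body.

    The body layout is hypothetical; only the parameter names and
    ranges come from the table above.
    """
    if not 2 <= len(images) <= 5:
        raise ValueError("images must contain 2 to 5 URLs")
    for url in images:
        if not url.startswith("https://"):
            raise ValueError(f"image URL must use HTTPS: {url!r}")
    if prompt is not None and len(prompt) > MAX_PROMPT_CHARS:
        raise ValueError("prompt exceeds 1024 characters")
    if quality not in _QUALITY_VALUES:
        raise ValueError("quality must be 1 or 4")

    payload: Dict[str, Any] = {
        "images": list(images),
        "hd": hd,
        "quality": quality,
    }
    # Optional fields are left out entirely when not supplied.
    if prompt:
        payload["prompt"] = prompt
    if aspect_ratio is not None:
        payload["aspect_ratio"] = aspect_ratio
    if seed is not None:
        payload["seed"] = seed
    # stylize, chaos, and weird arrive as keyword arguments.
    for name, value in aesthetic.items():
        if name not in _RANGES:
            raise ValueError(f"unknown parameter: {name}")
        low, high = _RANGES[name]
        if not low <= value <= high:
            raise ValueError(f"{name} must be between {low} and {high}")
        payload[name] = value
    return payload


if __name__ == "__main__":
    body = build_blend_payload(
        ["https://example.com/character.png", "https://example.com/forest.png"],
        prompt="misty dawn",
        aspect_ratio="16:9",
        hd=True,
        stylize=250,
    )
    print(json.dumps(body, indent=2))
```

Validating on the client side like this catches out-of-range values before a task is submitted; the service may apply additional checks that this sketch does not model.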
4. Model Architecture & Technical Details
Midjourney V8.1 is a complete from-scratch rewrite of the company's image model, built on a GPU-native PyTorch stack (migrated from the earlier TPU infrastructure). The underlying generative approach is understood to be latent diffusion; Midjourney has not published a technical paper or model card, so the backbone, parameter count, and training data remain undisclosed. The defining methodology is a human-preference (RLHF-style) aesthetic tuning loop combined with per-user personalization. V8.1 was released on midjourney.com on April 30, 2026 and became the default Midjourney model on June 10, 2026.
Blend is a long-standing, model-version-independent Midjourney capability (the 2–5 image limit has been stable since the feature's introduction). The training dataset has never been disclosed and is the subject of active, unresolved copyright litigation — Disney Enterprises, Inc. v. Midjourney, Inc. (No. 2:25-cv-05275, C.D. Cal.), filed June 11, 2025 by a coalition of major studios including Disney, Marvel, Lucasfilm, Twentieth Century, Universal, and DreamWorks. Those infringement claims are allegations in pending litigation and have not been adjudicated.
5. Intended Use & Applications
- Concept fusion: Merge a character, object, and setting into a single unified scene.
- Style mashups: Combine the aesthetic of one image with the subject of another.
- Mood & direction exploration: Blend several references to discover a combined look for a project.
- Product in context: Fuse a product image with an environment or lifestyle reference.
To combine images while keeping precise control over a single reference, use midjourney/v8.1/image-to-image; to merge more than five images, run multiple blends or use image-to-image with a curated reference.
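The text above does not say how to organize "multiple blends" when there are more than five references. One possible reading, sketched below, is to split the references into groups that each respect the 2 to 5 image limit and submit one blend per group. The helper name `plan_blend_batches` and the even-split strategy are assumptions; how the per-group results are then combined (for example, by blending them again) is left open.

```python
from typing import List, Sequence

MIN_IMAGES, MAX_IMAGES = 2, 5  # per-request limits stated for this endpoint


def plan_blend_batches(images: Sequence[str]) -> List[List[str]]:
    """Split references into evenly sized groups of 2 to 5 images each."""
    total = len(images)
    if total < MIN_IMAGES:
        raise ValueError("at least 2 images are needed to blend")
    # Fewest groups that keep every group at or below the maximum.
    batch_count = -(-total // MAX_IMAGES)  # ceiling division
    base, extra = divmod(total, batch_count)
    batches: List[List[str]] = []
    start = 0
    for index in range(batch_count):
        # The first `extra` groups take one additional image.
        size = base + (1 if index < extra else 0)
        batches.append(list(images[start:start + size]))
        start += size
    return batches
```

Splitting evenly, rather than filling groups of five and leaving a remainder, avoids ending up with a final group of a single image, which the endpoint would not accept.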













