
Midjourney V8.1 Image-to-Image API by MIDJOURNEY

Midjourney V8.1 re-imagines an input image guided by a text prompt, returning four variations. Supports native 2K HD, style reference, and aspect-ratio / stylize / chaos / weird controls.

1. Introduction

Midjourney V8.1 Image-to-Image (midjourney/v8.1/image-to-image) generates four new images guided by an input image together with a text prompt. Midjourney treats the supplied image as a visual prompt — it reads the image's core elements and uses them as a source of inspiration for new, original results rather than reproducing the input pixel-for-pixel.

It is part of the Midjourney V8.1 family exposed through this API:

  • midjourney/v8.1/text-to-image — generate from a text prompt
  • midjourney/v8.1/image-to-image — generate guided by an input image (this model)
  • midjourney/v8.1/blend — fuse 2–5 images
  • midjourney/v8.1/style-transfer — restyle an image, preserving composition
  • midjourney/v8.1/remove-background — isolate the subject on transparency
  • midjourney/v8.1/image-to-video — animate an image into a short clip

Midjourney V8.1 is built by Midjourney, Inc., an independent, self-funded San Francisco research lab founded in August 2021 by David Holz. V8.1 is the company's fastest model to date and produces high-aesthetic, prompt-faithful imagery at native 2K resolution.


2. Key Features & Innovations

  • Image-guided generation: An input image steers composition, subject, and aesthetic while the text prompt directs the outcome. The image is used as inspiration, not copied exactly — ideal for variations and creative reinterpretation.
  • Image prompts restored in V8.1: Image-prompt conditioning, along with Midjourney's internal image-weight handling, was absent from the V8.0 alpha and reinstated in V8.1, so image-driven workflows are available again on the newest model.
  • Native 2K HD: With hd enabled, V8.1 renders directly at 2048px without a separate upscaling pass.
  • Faster generation: Roughly 4–5× faster than earlier Midjourney versions, according to Midjourney, which attributes the speedup to the GPU-native PyTorch rewrite.
  • Optional style reference: A separate style-reference image (sref) can be supplied to drive the look (colors, medium, texture, lighting) independently of the content image.
  • Aesthetic controls: stylize, chaos, and weird shape how strongly Midjourney's house aesthetic, variety, and unconventionality are applied.
  • Four results per request: Each task returns a 4-image grid so you can pick the strongest variation.

3. Parameters & Usage

Parameter | Description
image (required) | Input image used as the visual prompt. Provide a publicly reachable HTTPS URL or upload a file.
prompt (required) | Text describing the desired result. Max 1024 characters. A text prompt is required alongside the image; an image alone is not a complete prompt.
sref | Optional style-reference image URL to drive the visual style separately from the content image.
aspect_ratio | Output aspect ratio (e.g. 1:1, 16:9, 9:16).
hd | Enable native 2K (2048px) generation.
stylize | Strength of Midjourney's default aesthetic (0–1000).
chaos | Variation/unpredictability across the four results (0–100).
weird | Unconventionality of the output (0–3000).
quality | Detail level; V8.1 supports 1 (default) or 4 (more detail, same price).
seed | Fixed seed for reproducible results.
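
The limits in the table can be checked on the client before a task is submitted. The following is a minimal sketch, not an official client: the function name build_payload and the idea of sending the parameters as a flat JSON-style dictionary are assumptions for illustration, while the field names and numeric ranges are taken from the table above.

```python
from typing import Any, Dict

# Ranges copied from the parameter table; the payload shape itself is assumed.
LIMITS = {"stylize": (0, 1000), "chaos": (0, 100), "weird": (0, 3000)}
MAX_PROMPT_CHARS = 1024
ALLOWED_QUALITY = (1, 4)
UNCHECKED = ("sref", "aspect_ratio", "hd", "seed")  # passed through as-is


def build_payload(image: str, prompt: str, **options: Any) -> Dict[str, Any]:
    """Hypothetical helper: validate inputs and return a request body dict."""
    if not image:
        raise ValueError("image is required")
    if not prompt or not prompt.strip():
        raise ValueError("prompt is required alongside the image")
    if len(prompt) > MAX_PROMPT_CHARS:
        raise ValueError(f"prompt exceeds {MAX_PROMPT_CHARS} characters")

    payload: Dict[str, Any] = {"image": image, "prompt": prompt}
    for name, value in options.items():
        if value is None:
            continue  # treat None as "not set"
        if name in LIMITS:
            low, high = LIMITS[name]
            if not low <= value <= high:
                raise ValueError(f"{name} must be between {low} and {high}")
        elif name == "quality":
            if value not in ALLOWED_QUALITY:
                raise ValueError("quality must be 1 or 4")
        elif name not in UNCHECKED:
            raise ValueError(f"unknown parameter: {name}")
        payload[name] = value
    return payload


if __name__ == "__main__":
    print(build_payload(
        "https://example.com/input.png",
        "the same scene as a watercolor painting",
        stylize=250,
        hd=True,
    ))
```

Rejecting unknown keys is a design choice in this sketch; it catches typos such as "stylise" early, at the cost of needing an update if the endpoint gains new parameters.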

Tips: Pair the input image with a clear, specific prompt — the prompt resolves what the image leaves ambiguous. Use sref when you want the style of one image and the content of another. Note that Midjourney's manual image-weight control (--iw) is not exposed by this endpoint; the model applies its default image/text balance.
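
To illustrate the sref tip, here is a hedged sketch of a request that takes its content from one image and its style from another. The base URL, the bearer-token header, and the make_request helper are all hypothetical, since this page does not document the HTTP interface; only the model ID and the image, sref, and prompt field names come from the text above. The request object is built but never sent.

```python
import json
import urllib.request

# Hypothetical values: the base URL and auth scheme are NOT from the source.
API_BASE = "https://api.example.com/v1"
MODEL_ID = "midjourney/v8.1/image-to-image"


def make_request(api_key: str, content_image: str, style_image: str,
                 prompt: str) -> urllib.request.Request:
    """Build (but do not send) a POST request pairing a content image with a style reference."""
    body = {
        "image": content_image,  # what the result should depict
        "sref": style_image,     # how the result should look
        "prompt": prompt,        # required alongside the image
    }
    return urllib.request.Request(
        url=f"{API_BASE}/{MODEL_ID}",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = make_request(
        "YOUR_API_KEY",
        "https://example.com/product-photo.png",
        "https://example.com/brand-style.png",
        "the product on a marble counter, soft morning light",
    )
    print(req.get_method(), req.full_url)
    print(json.loads(req.data))
```

No image-weight field appears in the body because, as noted above, --iw is not exposed by this endpoint.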


4. Model Architecture & Technical Details

Midjourney V8.1 is a complete from-scratch rewrite of the company's image model. As part of the V8 program, Midjourney migrated from TPU-based infrastructure to a GPU-native PyTorch stack. The underlying generative approach is understood to be latent diffusion; Midjourney has not published a technical paper or model card, so the backbone, parameter count, text encoder, and training data remain undisclosed. The defining methodology is a human-preference (RLHF-style) aesthetic tuning loop combined with per-user personalization. V8.1 was released on midjourney.com on April 30, 2026 and became the default Midjourney model on June 10, 2026.

The training dataset has never been disclosed and is the subject of active, unresolved copyright litigation — Disney Enterprises, Inc. v. Midjourney, Inc. (No. 2:25-cv-05275, C.D. Cal.), filed June 11, 2025 by a coalition of major studios including Disney, Marvel, Lucasfilm, Twentieth Century, Universal, and DreamWorks. Those infringement claims are allegations in pending litigation and have not been adjudicated.


5. Intended Use & Applications

  • Variations & iteration: Produce alternative takes on an existing image while keeping its general subject and feel.
  • Restyling & reinterpretation: Reimagine a photo, sketch, or render in a new artistic direction guided by the prompt.
  • Concept development: Evolve a reference into polished concept art for games, film, and product design.
  • Marketing & social assets: Generate on-brand variants from a hero image, optionally constrained by a style reference.

For pixel-faithful restyling that preserves the original composition exactly, use midjourney/v8.1/style-transfer; to merge several images, use midjourney/v8.1/blend.
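
The routing guidance above can be sketched as a small helper that picks a model ID from the number of input images and whether the original composition must be kept. The function pick_model and its arguments are hypothetical names for illustration; the model IDs and the 2–5 image range for blend come from the family list in the introduction.

```python
def pick_model(image_count: int, preserve_composition: bool = False) -> str:
    """Hypothetical helper: choose a V8.1 model ID for a still-image task."""
    if image_count < 0:
        raise ValueError("image_count cannot be negative")
    if image_count == 0:
        return "midjourney/v8.1/text-to-image"
    if image_count == 1:
        # One image: restyle in place, or treat it as loose inspiration.
        if preserve_composition:
            return "midjourney/v8.1/style-transfer"
        return "midjourney/v8.1/image-to-image"
    if image_count <= 5:
        # preserve_composition is ignored here; blend fuses 2-5 images.
        return "midjourney/v8.1/blend"
    raise ValueError("blend accepts at most 5 images")


if __name__ == "__main__":
    print(pick_model(1))
    print(pick_model(1, preserve_composition=True))
    print(pick_model(3))
```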
