New
Seed Audio 1.0
NEW
text-to-speech

Doubao-Audio-Generate-1.0 is Doubao Voice's next-generation audio generation engine. Billed as the first commercial tool of its kind, it creates film-grade audio from a single prompt and removes much of the manual audio-engineering work. Creators can produce publish-ready radio dramas, podcasts, and branded audio, with the model acting as an AI audio director rather than a simple voice generator. It targets audiobooks, serialized episodes, and commercial audio for narrative-driven production.

AUDIO-GENERATION
From
$0.015/K chars
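
As a rough illustration of per-character pricing, the sketch below estimates what a script would cost at the listed $0.015 per 1,000 characters. It assumes billing is strictly proportional to character count, with no minimum charge and no rounding up to whole thousands; the listing does not state this. `estimate_tts_cost` is a hypothetical helper, not part of any platform API.

```python
from decimal import Decimal

# Listed "From" price in USD per 1,000 characters.
PRICE_PER_K_CHARS = Decimal("0.015")

def estimate_tts_cost(text: str, price_per_k: Decimal = PRICE_PER_K_CHARS) -> Decimal:
    """Estimated USD cost, ASSUMING linear per-character billing."""
    return price_per_k * Decimal(len(text)) / Decimal(1000)

# Stand-in for a 12,000-character script.
script = "a" * 12_000
print(estimate_tts_cost(script))
```

`Decimal` is used instead of `float` so that small per-unit prices do not pick up binary rounding noise.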
Seedance 2.0 Mini Reference-to-Video
NEW
image-to-video

Lightweight, economical multimodal video generation from reference images, videos, and audio with native audio.

AUDIO
From $0.045/SEC (was $0.056/SEC, -20%)
Seedance 2.0 Mini Image-to-Video
NEW
image-to-video

Lightweight, economical video generation from a first-frame image (and optional last-frame) with native audio.

From $0.045/SEC (was $0.056/SEC, -20%)
Seedance 2.0 Mini Text-to-Video
NEW
text-to-video

Lightweight, economical video generation from text prompts with native audio.

AUDIO
From $0.045/SEC (was $0.056/SEC, -20%)
HappyHorse-1.1 Text-to-video
NEW
text-to-video

Generates videos from text prompts with HappyHorse 1.1, supporting 720P or 1080P output, flexible aspect ratios, and durations from 3 to 15 seconds.

From
$0.14/SEC
HappyHorse-1.1 Image-to-video
NEW
image-to-video

Animates a first-frame image into video with optional prompt guidance, 720P or 1080P output, and durations from 3 to 15 seconds.

From
$0.14/SEC
HappyHorse-1.1 Reference-to-video
NEW
reference-to-video

Generates videos from one to nine reference images and a text prompt, supporting 720P or 1080P output, flexible aspect ratios, and durations from 3 to 15 seconds.

From
$0.14/SEC
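
To make the per-second pricing concrete, the sketch below estimates the cost of one HappyHorse-1.1 clip across the stated 3 to 15 second range. It assumes cost scales linearly with output duration and that the listed "From" price of $0.14/SEC applies to the chosen resolution; the listing confirms neither, and higher resolutions may cost more. `estimate_video_cost` is a hypothetical helper.

```python
from decimal import Decimal

# Listed "From" price in USD per second of generated video.
PRICE_PER_SECOND = Decimal("0.14")
# Duration range stated in the model description.
MIN_SECONDS, MAX_SECONDS = 3, 15

def estimate_video_cost(seconds: int, price_per_second: Decimal = PRICE_PER_SECOND) -> Decimal:
    """Estimated USD cost for one clip, ASSUMING linear per-second billing."""
    if not MIN_SECONDS <= seconds <= MAX_SECONDS:
        raise ValueError(
            f"duration must be between {MIN_SECONDS} and {MAX_SECONDS} seconds"
        )
    return price_per_second * seconds

# Print the estimate for the shortest, a mid-length, and the longest clip.
for seconds in (MIN_SECONDS, 10, MAX_SECONDS):
    print(seconds, estimate_video_cost(seconds))
```

Under these assumptions a clip costs between $0.42 (3 seconds) and $2.10 (15 seconds).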
Avatar Omni Human 1.5
NEW
HOT
audio-to-video

Open and Advanced Large-Scale Video Generative Models.

From $0.06/SEC (was $0.12/SEC, -50%)
Kling V3.0 Turbo Image-to-Video
NEW
image-to-video
TURBO

Kling V3.0 Turbo Image-to-Video transforms static images into dynamic cinematic videos using MVL technology. Supports first/last frame control and audio generation.

From $0.095/SEC (was $0.112/SEC, -15%)
Kling V3.0 Turbo Text-to-Video
NEW
text-to-video
TURBO

Kling V3.0 Turbo Text-to-Video generates dynamic cinematic videos from text prompts using MVL technology. Supports first/last frame control and audio generation.

From $0.095/SEC (was $0.112/SEC, -15%)
Kling Video O3 4K Image-to-Video
NEW
image-to-video

Kling Omni Video O3 (4K) Image-to-Video transforms static images into dynamic cinematic videos using MVL technology. Supports first/last frame control and audio generation.

From $0.357/SEC (was $0.42/SEC, -15%)
Kling Video O3 4K Text-to-Video
NEW
text-to-video

Kling Omni Video O3 (4K) is Kuaishou's advanced unified multimodal video model with MVL (Multi-modal Visual Language) technology. It generates high-quality videos from text prompts with natural motion and supports audio generation.

From $0.357/SEC (was $0.42/SEC, -15%)
MAI-Image-2.5-Flash Text-to-image
NEW
text-to-image

Microsoft's fast, cost-optimized text-to-image generation model, creating high-quality images at lower cost using the same diffusion-based architecture as MAI-Image-2.5.

From
$0.03/PIC
MAI-Image-2.5 Edit
NEW
image-to-image

Microsoft's flagship image-to-image editing model, enabling precise, controllable edits to existing images through natural language instructions.

From
$0.058/PIC
MAI-Image-2.5 Text-to-image
NEW
text-to-image

Microsoft's flagship text-to-image generation model, designed to create high-quality, visually rich images from natural language prompts.

From
$0.05/PIC
Youchuan V8.1 Remove Background
NEW
image-to-image

Youchuan automatically removes the background from an input image, returning one transparent-background result.

From
$0.086/PIC
Youchuan V8.1 Style Transfer
NEW
image-to-image

Youchuan retexture changes the artistic style of an input image while preserving its composition, returning four restyled results.

From
$0.129/PIC
Youchuan V8.1 Blend
NEW
image-to-image

Youchuan V8.1 blends two to five input images into four fused results, with an optional guiding prompt and native 2K HD.

From
$0.086/PIC
Youchuan V8.1 Image-to-Image
NEW
image-to-image

Youchuan V8.1 re-imagines an input image guided by a text prompt, returning four variations. Supports native 2K HD, style reference, and aspect-ratio / stylize / chaos / weird controls.

From
$0.086/PIC
Seed3D 2.0 Image-to-3D
NEW
image-to-3D

ByteDance Seed3D 2.0 generates a textured, PBR-shaded 3D model (glb/obj/usd/usdz) from a single input image and returns a downloadable .zip archive containing the 3D file.

IMAGE-TO-3D
From
$0.353/PIC
Youchuan V8.1 Image-to-Video
NEW
image-to-video

Youchuan V8.1 animates an input image into four 5-second videos at 480p or 720p.

From
$0.086/SEC
Youchuan V8.1 Text-to-Image
NEW
text-to-image

Youchuan V8.1 generates four images from a text prompt, with optional native 2K HD, a style reference, and aspect-ratio / stylize / chaos / weird controls.

From
$0.086/PIC
xAI TTS v1
NEW
text-to-speech

xAI TTS v1 is a high-fidelity text-to-speech model that converts text into natural, expressive speech with sub-second latency, supporting 20 languages and 80+ voices with fine-grained delivery control.

From
$0.015/K chars
Hunyuan 3D Rapid Image-to-3D
NEW
image-to-3D

Tencent Hunyuan 3D Rapid (Express): fast, lightweight 3D mesh generation from a single image, with optional PBR materials. Outputs GLB/OBJ/USDZ/FBX/STL/MP4.

IMAGE-TO-3D
From
$0.02/PIC
Hunyuan 3D Rapid Text-to-3D
NEW
text-to-3D

Tencent Hunyuan 3D Rapid (Express): fast, lightweight 3D mesh generation from a text prompt, with optional PBR materials. Outputs GLB/OBJ/USDZ/FBX/STL/MP4.

TEXT-TO-3D
From
$0.02/PIC
Hunyuan 3D Pro Image-to-3D
NEW
image-to-3D
PRO

Tencent Hunyuan 3D Pro (v3.1): high-quality textured 3D mesh generation from a single image, with optional PBR materials and custom face count. Outputs GLB/OBJ/USDZ/FBX/STL.

IMAGE-TO-3D
From
$0.02/PIC
Hunyuan 3D Pro Text-to-3D
NEW
text-to-3D
PRO

Tencent Hunyuan 3D Pro (v3.1): high-quality textured 3D mesh generation from a text prompt, with optional PBR materials and custom face count. Outputs GLB/OBJ/USDZ/FBX/STL.

TEXT-TO-3D
From
$0.02/PIC
Nano Banana 2 Reference to Image
NEW
image-to-image

Google's advanced AI-powered video-to-image generation model, designed to generate high-quality static images from video clips combined with text instructions.

From
$0.08/PIC
Nano Banana 2 Reference to Image Developer
NEW
image-to-image
DEV

Google's advanced AI-powered video-to-image generation model, designed to generate high-quality static images from video clips combined with text instructions.

From $0.04/PIC (was $0.08/PIC, -50%)
Grok Imagine Video v1.5 Image-to-Video
NEW
image-to-video

xAI Grok Imagine Video v1.5 animates a starting frame image with natural-language motion prompts at 480p, 720p, or 1080p.

From
$0.08/SEC
Grok Imagine Image Quality Text-to-Image
NEW
text-to-image

xAI Grok Imagine generates polished visuals from natural-language prompts at 1K or 2K resolution, with 14 aspect ratios.

From
$0.05/PIC
Grok Imagine Image Quality Edit
NEW
image-to-image

xAI Grok Imagine edits one or more reference images with natural-language instructions at 1K or 2K resolution. Supports single-image and multi-image (<IMAGE_0>, <IMAGE_1>) reference editing.

From
$0.05/PIC
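
The multi-image placeholders mentioned above (<IMAGE_0>, <IMAGE_1>) suggest a simple pre-flight check before submitting an edit request. The sketch below assumes, without confirmation from the listing, that placeholders are zero-based and refer to the supplied images in order. `check_image_placeholders` is a hypothetical helper and makes no API call.

```python
import re

# Matches placeholders of the form <IMAGE_0>, <IMAGE_1>, ...
PLACEHOLDER = re.compile(r"<IMAGE_(\d+)>")

def check_image_placeholders(prompt: str, image_count: int) -> list[int]:
    """Return the sorted image indices used in the prompt.

    Raises ValueError if the prompt references an index with no matching
    image, ASSUMING zero-based indices in upload order.
    """
    indices = sorted({int(n) for n in PLACEHOLDER.findall(prompt)})
    missing = [i for i in indices if i >= image_count]
    if missing:
        raise ValueError(f"prompt references images that were not supplied: {missing}")
    return indices

prompt = "Place the person from <IMAGE_0> into the room shown in <IMAGE_1>."
print(check_image_placeholders(prompt, image_count=2))  # [0, 1]
```

Catching a dangling placeholder locally avoids paying for a generation whose prompt cannot be resolved.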
HappyHorse-1.0 Text-to-video
NEW
text-to-video

Generates videos from text prompts with HappyHorse 1.0, supporting 720P or 1080P output, flexible aspect ratios, and durations from 3 to 15 seconds.

From
$0.14/SEC
HappyHorse-1.0 Image-to-video
NEW
image-to-video

Animates a first-frame image into video with optional prompt guidance, 720P or 1080P output, and durations from 3 to 15 seconds.

From
$0.14/SEC
HappyHorse-1.0 Reference-to-video
NEW
reference-to-video

Generates videos from one to nine reference images and a text prompt, supporting 720P or 1080P output, flexible aspect ratios, and durations from 3 to 15 seconds.

From
$0.14/SEC
HappyHorse-1.0 Video-edit
NEW
video-to-video

Edits an input video with text instructions and optional reference images, supporting 720P or 1080P output.

From
$0.14/SEC
OpenAI GPT Image 2 Text-to-Image
NEW
text-to-image

GPT Image 2 Text-to-Image is OpenAI's fast, cost-efficient text-to-image generator powered by GPT-5 guidance. It creates photorealistic shots, product renders, concept art, and stylized graphics from natural-language prompts, optionally conditioned on an image. Supports custom aspect ratios, seeds, negative prompts, hex color hints, and style presets. Ready-to-use REST inference API with no cold starts and affordable pricing.

From
$0.009/PIC
OpenAI GPT Image 2 Edit
NEW
image-to-image

GPT Image 2 Edit is OpenAI's image model for precise, natural-language edits: add or remove objects, swap backgrounds, retouch faces, adjust colors and lighting, edit text and graphics, crop or resize, and apply hex color control. Ready-to-use REST inference API with no cold starts and affordable pricing.

From
$0.01/PIC
Baidu ERNIE Image Turbo Text-to-image
NEW
text-to-image
TURBO

A fast, low-latency version of ERNIE Image by Baidu, optimized for rapid iteration and scalable image generation. It balances speed and quality, making it well suited to real-time and high-throughput scenarios.

FREE
Free
Seedance 2.0 Text-to-Video
NEW
text-to-video

Generate videos from text prompts with native audio and optional web search.

AUDIO
From $0.09/SEC (was $0.112/SEC, -20%)
Seedance 2.0 Image-to-Video
NEW
image-to-video

Generate videos from a first-frame image (and optional last-frame) with native audio.

AUDIO
From $0.09/SEC (was $0.112/SEC, -20%)
Seedance 2.0 Reference-to-Video
NEW
image-to-video

Multimodal video generation from reference images, videos, and audio. Supports video editing and extension.

AUDIO
From $0.09/SEC (was $0.112/SEC, -20%)
Seedance 2.0 Fast Text-to-Video
NEW
text-to-video

Fast video generation from text prompts with native audio.

AUDIO
From $0.072/SEC (was $0.09/SEC, -20%)
Seedance 2.0 Fast Image-to-Video
NEW
image-to-video

Fast video generation from a first-frame image (and optional last-frame) with native audio.

From $0.072/SEC (was $0.09/SEC, -20%)
Seedance 2.0 Fast Reference-to-Video
NEW
image-to-video

Fast multimodal video generation from reference images, videos, and audio. Supports video editing and extension.

AUDIO
From $0.072/SEC (was $0.09/SEC, -20%)
Wan-2.7 Text-to-video
NEW
HOT
text-to-video

Generates videos from text prompts with multi-shot narrative, audio generation, and sound-image synchronization.

From
$0.1/SEC
Wan-2.7 Image-to-video
NEW
HOT
image-to-video

Animates images into videos with first-frame, first-and-last-frame, video continuation, and audio-driven modes.

From
$0.1/SEC
Wan-2.7 Reference-to-video
NEW
video-to-video

Generates character-driven videos from reference images and videos, with multi-subject and voice-cloning support.

From
$0.1/SEC