Grok Imagine Video Text-to-Video API by xAI

xAI Grok Imagine Video generates short videos (1-15s) from natural-language prompts at 480p or 720p.

If a request is blocked due to a violation of the xAI Terms of Service, the associated charges will still be billed to your account.

Your request will cost $0.05 per run. For $10 you can run this model approximately 200 times.
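
As a quick check of the arithmetic above, the sketch below estimates how many runs a budget covers at the listed $0.05 per run. It assumes a flat per-run price that does not vary with duration or resolution, which this page does not confirm, and `runs_for_budget` is a hypothetical helper name.

```python
from decimal import Decimal

# Listed price per run, taken from this page; assumed flat for every request.
PRICE_PER_RUN = Decimal("0.05")


def runs_for_budget(budget_usd: str) -> int:
    """Return how many whole runs fit into the given budget (in USD)."""
    return int(Decimal(budget_usd) // PRICE_PER_RUN)


print(runs_for_budget("10"))  # 200
```

`Decimal` is used instead of floats so that the division is exact for currency amounts.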

Code Example
import os
import time

import requests

# Read the API key from the environment. A literal "$ATLASCLOUD_API_KEY"
# inside a Python string is not expanded the way it is in a shell.
API_KEY = os.environ["ATLASCLOUD_API_KEY"]

# Step 1: Start video generation
generate_url = "https://api.atlascloud.ai/api/v1/model/generateVideo"
headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {API_KEY}"
}
data = {
    "model": "xai/grok-imagine-video/text-to-video",  # Required. Model name
    "prompt": "A beautiful sunset over the ocean with gentle waves",  # Required. Natural-language description of the video to generate
    "duration": 8,  # Length of generated video in seconds. (min: 1, max: 15)
    "resolution": "720p",  # Output resolution. options: 480p | 720p
    "aspect_ratio": "16:9",  # Output aspect ratio
}

generate_response = requests.post(generate_url, headers=headers, json=data)
generate_result = generate_response.json()
prediction_id = generate_result["data"]["id"]

# Step 2: Poll for result
poll_url = f"https://api.atlascloud.ai/api/v1/model/prediction/{prediction_id}"

def check_status():
    while True:
        response = requests.get(poll_url, headers={"Authorization": f"Bearer {API_KEY}"})
        result = response.json()
        status = result["data"]["status"]

        if status in ["completed", "succeeded"]:
            print("Generated video:", result["data"]["outputs"][0])
            return result["data"]["outputs"][0]
        elif status == "failed":
            # .get() avoids a KeyError when the error field is absent
            raise Exception(result["data"].get("error") or "Generation failed")
        else:
            # Still processing, wait 2 seconds
            time.sleep(2)

video_url = check_status()

Install

Install the required package for your language.

pip install requests

Authentication

All API requests require authentication via an API key. You can get your API key from the Atlas Cloud dashboard.

export ATLASCLOUD_API_KEY="your-api-key-here"

HTTP Headers

import os

API_KEY = os.environ.get("ATLASCLOUD_API_KEY")
headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {API_KEY}"
}

Keep your API key secure

Never expose your API key in client-side code or public repositories. Use environment variables or a backend proxy instead.
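
One way to follow this advice is to build the headers in a single place and fail fast when the key is missing. The sketch below is only an illustration: `build_headers` is a hypothetical helper, not part of any Atlas Cloud client library.

```python
import os


def build_headers(env_var: str = "ATLASCLOUD_API_KEY") -> dict:
    """Build request headers from an environment variable, failing fast if it is unset."""
    api_key = os.environ.get(env_var)
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable before calling the API")
    return {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }
```

Raising early gives a clearer error than sending `Bearer None` and waiting for the server to reject it.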

Submit a request

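import os

import requests

url = "https://api.atlascloud.ai/api/v1/model/generateVideo"
headers = {
    "Content-Type": "application/json",
    # Read the key from the environment; "$ATLASCLOUD_API_KEY" is not expanded inside a Python string
    "Authorization": f"Bearer {os.environ['ATLASCLOUD_API_KEY']}"
}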
data = {
    "model": "your-model",
    "prompt": "A beautiful landscape"
}

response = requests.post(url, headers=headers, json=data)
print(response.json())

Submit a Request

Submit an asynchronous generation request. The API returns a prediction ID that you can use to check the status and retrieve the result.

POST /api/v1/model/generateVideo

Request Body

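import os

import requests

url = "https://api.atlascloud.ai/api/v1/model/generateVideo"
headers = {
    "Content-Type": "application/json",
    # Read the key from the environment; "$ATLASCLOUD_API_KEY" is not expanded inside a Python string
    "Authorization": f"Bearer {os.environ['ATLASCLOUD_API_KEY']}"
}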

data = {
    "model": "xai/grok-imagine-video/text-to-video",
    "prompt": "A beautiful sunset over the ocean with gentle waves"
}

response = requests.post(url, headers=headers, json=data)
result = response.json()

print(f"Prediction ID: {result['data']['id']}")
print(f"Status: {result['data']['status']}")

Response

{
  "code": 200,
  "data": {
    "id": "pred_abc123",
    "status": "processing",
    "model": "model-name",
    "created_at": "2025-01-01T00:00:00Z"
  }
}
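
The submit response can be checked before polling starts. The following is a minimal sketch, assuming that a `code` other than 200 signals an error (this page only shows the 200 case); `extract_prediction_id` is a hypothetical helper name.

```python
def extract_prediction_id(body: dict) -> str:
    """Return data.id from a submit response, or raise if the shape is unexpected."""
    code = body.get("code")
    # Assumption: a "code" other than 200 signals an error.
    if code is not None and code != 200:
        raise RuntimeError(f"Submit failed with code {code}: {body}")
    data = body.get("data") or {}
    prediction_id = data.get("id")
    if not prediction_id:
        raise RuntimeError(f"No prediction id in response: {body}")
    return prediction_id


sample = {
    "code": 200,
    "data": {"id": "pred_abc123", "status": "processing"},
}
print(extract_prediction_id(sample))  # pred_abc123
```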

Check Status

Poll the prediction endpoint to check the current status of your request.

GET /api/v1/model/prediction/{prediction_id}

Polling Example

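import os
import time

import requests

prediction_id = "pred_abc123"
url = f"https://api.atlascloud.ai/api/v1/model/prediction/{prediction_id}"
# Read the key from the environment; "$ATLASCLOUD_API_KEY" is not expanded inside a Python string
headers = {"Authorization": f"Bearer {os.environ['ATLASCLOUD_API_KEY']}"}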

while True:
    response = requests.get(url, headers=headers)
    result = response.json()
    status = result["data"]["status"]
    print(f"Status: {status}")

    if status in ["completed", "succeeded"]:
        output_url = result["data"]["outputs"][0]
        print(f"Output URL: {output_url}")
        break
    elif status == "failed":
        print(f"Error: {result['data'].get('error', 'Unknown')}")
        break

    time.sleep(3)

Status Values

processing: The request is still being processed.

completed: Generation is complete. Outputs are available.

succeeded: Generation succeeded. Outputs are available.

failed: Generation failed. Check the error field.
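
The status values above map naturally onto a polling loop that stops on any terminal status and gives up after a deadline. This is a sketch only: the timeout and interval defaults are arbitrary choices, and `wait_for_prediction` is a hypothetical helper that takes the HTTP call as a `fetch` callable so it can be exercised without network access.

```python
import time
from typing import Callable

TERMINAL_SUCCESS = {"completed", "succeeded"}
TERMINAL_FAILURE = {"failed"}


def wait_for_prediction(
    fetch: Callable[[], dict],
    timeout_s: float = 300.0,
    interval_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch() until the prediction reaches a terminal status or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        data = fetch()["data"]
        status = data["status"]
        if status in TERMINAL_SUCCESS:
            return data
        if status in TERMINAL_FAILURE:
            raise RuntimeError(data.get("error") or "Generation failed")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Prediction still '{status}' after {timeout_s}s")
        # Any other status is treated as "still running".
        sleep(interval_s)
```

In real use, `fetch` could be something like `lambda: requests.get(url, headers=headers).json()` with the poll URL and headers from the example above.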

Completed Response

{
  "data": {
    "id": "pred_abc123",
    "status": "completed",
    "outputs": [
      "https://storage.atlascloud.ai/outputs/result.mp4"
    ],
    "metrics": {
      "predict_time": 45.2
    },
    "created_at": "2025-01-01T00:00:00Z",
    "completed_at": "2025-01-01T00:00:10Z"
  }
}
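
Once a prediction is complete, the first entry of `outputs` is the video URL. The sketch below pulls that URL out of a response shaped like the one above and derives a local file name from it; `first_output` and the `output.mp4` fallback are illustrative assumptions.

```python
import posixpath
from urllib.parse import urlparse


def first_output(body: dict) -> tuple:
    """Return (url, filename) for the first output of a completed prediction."""
    outputs = body["data"].get("outputs") or []
    if not outputs:
        raise ValueError("Prediction has no outputs yet")
    url = outputs[0]
    # Derive a local file name from the URL path; fall back to a generic name.
    filename = posixpath.basename(urlparse(url).path) or "output.mp4"
    return url, filename


completed = {
    "data": {
        "id": "pred_abc123",
        "status": "completed",
        "outputs": ["https://storage.atlascloud.ai/outputs/result.mp4"],
    }
}
print(first_output(completed))
```

The returned URL can then be downloaded with an ordinary HTTP GET, for example `requests.get(url, stream=True)`, writing the response to `filename` in chunks.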

Upload Files

Upload files to Atlas Cloud storage and get a URL you can use in your API requests. Use multipart/form-data to upload.

POST /api/v1/model/uploadMedia

Upload Example

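import os

import requests

url = "https://api.atlascloud.ai/api/v1/model/uploadMedia"
# Read the key from the environment; "$ATLASCLOUD_API_KEY" is not expanded inside a Python string
headers = {"Authorization": f"Bearer {os.environ['ATLASCLOUD_API_KEY']}"}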

with open("image.png", "rb") as f:
    files = {"file": ("image.png", f, "image/png")}
    response = requests.post(url, headers=headers, files=files)

result = response.json()
download_url = result["data"]["download_url"]
print(f"File URL: {download_url}")

Response

{
  "data": {
    "download_url": "https://storage.atlascloud.ai/uploads/abc123/image.png",
    "file_name": "image.png",
    "content_type": "image/png",
    "size": 1024000
  }
}
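
The upload example hard-codes `image/png`. For other files, the content type can be guessed from the file name, as in the sketch below. Which media types the upload endpoint accepts is not stated on this page, so treat the fallback type as an assumption; `build_file_field` is a hypothetical helper.

```python
import mimetypes
import os


def build_file_field(path: str, fileobj) -> dict:
    """Build the `files` mapping for requests.post, guessing the MIME type from the file name."""
    content_type, _ = mimetypes.guess_type(path)
    # Fall back to a generic binary type when the extension is unknown.
    return {"file": (os.path.basename(path), fileobj, content_type or "application/octet-stream")}
```

With an open file handle `f`, the result can be passed as `requests.post(url, headers=headers, files=build_file_field("image.png", f))`.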

Input Schema

The following parameters are accepted in the request body.

Total: 5, Required: 2, Optional: 3

model (string, required)

Model name.

Default: "xai/grok-imagine-video/text-to-video"

prompt (string, required)

Natural-language description of the video to generate.

duration (integer)

Length of generated video in seconds. Range: 1–15.

Default: 8, Min: 1, Max: 15

resolution (string)

Output resolution.

Default: "720p"

Options: 480p, 720p

aspect_ratio (string)

Output aspect ratio.

Default: "16:9"

Options: 1:1, 16:9, 9:16, 4:3, 3:4, 3:2, 2:3

Example Request Body

{
  "model": "xai/grok-imagine-video/text-to-video",
  "prompt": "A beautiful landscape",
  "duration": 8,
  "resolution": "720p",
  "aspect_ratio": "16:9"
}
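
Because blocked or rejected requests may still be billed, it can help to check a request body against the documented limits before sending it. The sketch below mirrors the ranges and options listed in the input schema; `validate_input` is a hypothetical client-side helper, and treating an empty prompt as missing is an assumption.

```python
RESOLUTIONS = {"480p", "720p"}
ASPECT_RATIOS = {"1:1", "16:9", "9:16", "4:3", "3:4", "3:2", "2:3"}


def validate_input(payload: dict) -> list:
    """Return a list of problems found in a request body; an empty list means it looks valid."""
    problems = []
    for key in ("model", "prompt"):
        if not payload.get(key):
            problems.append(f"{key} is required")
    duration = payload.get("duration", 8)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(duration, bool) or not isinstance(duration, int) or not 1 <= duration <= 15:
        problems.append("duration must be an integer from 1 to 15")
    if payload.get("resolution", "720p") not in RESOLUTIONS:
        problems.append("resolution must be 480p or 720p")
    if payload.get("aspect_ratio", "16:9") not in ASPECT_RATIOS:
        problems.append("aspect_ratio is not a supported value")
    return problems
```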

Output Schema

The API returns a prediction response with the generated output URLs.

id (string)

Unique identifier for the prediction.

model (string)

Model ID used for the prediction.

status (string)

Status of the task: created, processing, completed, or failed.

outputs (array)

Array of URLs to the generated video (empty when status is not completed).

created_at (string)

ISO timestamp of when the request was created.

Example Response

{
  "id": "pred_abc123",
  "status": "completed",
  "model": "model-name",
  "outputs": [
    "https://storage.atlascloud.ai/outputs/result.mp4"
  ],
  "metrics": {
    "predict_time": 45.2
  },
  "created_at": "2025-01-01T00:00:00Z",
  "completed_at": "2025-01-01T00:00:10Z"
}
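
The examples on this page show the prediction fields both at the top level (as above) and nested under `data` (as in the polling responses). A small typed wrapper that accepts either shape might look like the sketch below; the `Prediction` class is illustrative and not part of any official client.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Prediction:
    id: str
    status: str
    model: Optional[str] = None
    outputs: List[str] = field(default_factory=list)
    created_at: Optional[str] = None

    @classmethod
    def from_response(cls, body: dict) -> "Prediction":
        # Accept both the flat shape and the shape nested under "data".
        data = body.get("data", body)
        return cls(
            id=data["id"],
            status=data["status"],
            model=data.get("model"),
            outputs=list(data.get("outputs") or []),
            created_at=data.get("created_at"),
        )

    @property
    def done(self) -> bool:
        """True once the prediction has reached a terminal status."""
        return self.status in {"completed", "succeeded", "failed"}
```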

Atlas Cloud Skills

Atlas Cloud Skills integrates 400+ AI models directly into your AI coding assistant. One command to install, then use natural language to generate images, videos, and chat with LLMs.

Supported Clients

Claude Code

OpenAI Codex

Gemini CLI

Cursor

Windsurf

VS Code

Trae

GitHub Copilot

Cline

Roo Code

Amp

Goose

Replit

40+ supported clients

Install

npx skills add AtlasCloudAI/atlas-cloud-skills

Setup API Key

Get your API key from the Atlas Cloud dashboard and set it as an environment variable.

export ATLASCLOUD_API_KEY="your-api-key-here"

Capabilities

Once installed, you can use natural language in your AI assistant to access all Atlas Cloud models.

Image Generation: Generate images with models like Nano Banana 2, Z-Image, and more.

Video Creation: Create videos from text or images with Kling, Vidu, Veo, etc.

LLM Chat: Chat with Qwen, DeepSeek, and other large language models.

Media Upload: Upload local files for image editing and image-to-video workflows.

Learn more

github.com/AtlasCloudAI/atlas-cloud-skills

MCP Server

Atlas Cloud MCP Server connects your IDE with 400+ AI models via the Model Context Protocol. Works with any MCP-compatible client.

Supported Clients

Cursor

VS Code

Windsurf

Claude Code

OpenAI Codex

Gemini CLI

Cline

Roo Code

100+ supported clients

Install

npx -y atlascloud-mcp

Configuration

Add the following configuration to your IDE's MCP settings file.

{
  "mcpServers": {
    "atlascloud": {
      "command": "npx",
      "args": [
        "-y",
        "atlascloud-mcp"
      ],
      "env": {
        "ATLASCLOUD_API_KEY": "your-api-key-here"
      }
    }
  }
}
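
If the MCP settings file already lists other servers, the `atlascloud` entry needs to be merged in rather than pasted over them. The sketch below does that merge on an in-memory dict; the location and name of the settings file vary by client and are not specified here, and `add_atlascloud_server` is a hypothetical helper.

```python
import copy
import json


def add_atlascloud_server(settings: dict, api_key: str) -> dict:
    """Return a copy of an MCP settings dict with the atlascloud server entry added."""
    merged = copy.deepcopy(settings)
    servers = merged.setdefault("mcpServers", {})
    servers["atlascloud"] = {
        "command": "npx",
        "args": ["-y", "atlascloud-mcp"],
        "env": {"ATLASCLOUD_API_KEY": api_key},
    }
    return merged


# "other" stands in for any server that is already configured.
existing = {"mcpServers": {"other": {"command": "node", "args": ["server.js"]}}}
print(json.dumps(add_atlascloud_server(existing, "your-api-key-here"), indent=2))
```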

Available Tools

atlas_generate_image: Generate images from text prompts.

atlas_generate_video: Create videos from text or images.

atlas_chat: Chat with large language models.

atlas_list_models: Browse 400+ available AI models.

atlas_quick_generate: One-step content creation with auto model selection.

atlas_upload_media: Upload local files for API workflows.

Learn more

github.com/AtlasCloudAI/mcp-server

API Schema

{
  "info": {
    "title": "AtlasCloud API",
    "version": "1.0.0",
    "description": "The AtlasCloud API."
  },
  "paths": {
    "/api/v1/model/generateVideo": {
      "post": {
        "responses": {
          "200": {
            "content": {
              "application/json": {
                "schema": {
                  "$ref": "#/components/schemas/PredictionResponse"
                }
              }
            },
            "description": "The request status."
          }
        },
        "requestBody": {
          "content": {
            "application/json": {
              "schema": {
                "$ref": "#/components/schemas/Input"
              }
            }
          },
          "required": true
        }
      },
      "x-api-name": "model_run"
    },
    "/api/v1/model/prediction/{request_id}": {
      "get": {
        "responses": {
          "200": {
            "content": {
              "application/json": {
                "schema": {
                  "$ref": "#/components/schemas/PredictionResponse"
                }
              }
            },
            "description": "Result of the request."
          }
        },
        "parameters": [
          {
            "in": "path",
            "name": "request_id",
            "schema": {
              "type": "string",
              "description": "Request ID"
            },
            "required": true
          }
        ]
      },
      "x-api-name": "model_result"
    }
  },
  "openapi": "3.0.0",
  "servers": [
    {
      "url": "https://api.atlascloud.ai"
    }
  ],
  "components": {
    "schemas": {
      "Input": {
        "type": "object",
        "required": [
          "model",
          "prompt"
        ],
        "properties": {
          "model": {
            "type": "string",
            "description": "Model name.",
            "default": "xai/grok-imagine-video/text-to-video"
          },
          "prompt": {
            "type": "string",
            "description": "Natural-language description of the video to generate."
          },
          "duration": {
            "type": "integer",
            "default": 8,
            "minimum": 1,
            "maximum": 15,
            "description": "Length of generated video in seconds. Range: 1–15."
          },
          "resolution": {
            "type": "string",
            "default": "720p",
            "enum": [
              "480p",
              "720p"
            ],
            "description": "Output resolution."
          },
          "aspect_ratio": {
            "type": "string",
            "default": "16:9",
            "enum": [
              "1:1",
              "16:9",
              "9:16",
              "4:3",
              "3:4",
              "3:2",
              "2:3"
            ],
            "description": "Output aspect ratio."
          }
        },
        "x-order-properties": [
          "model",
          "prompt",
          "duration",
          "resolution",
          "aspect_ratio"
        ]
      },
      "PredictionResponse": {
        "type": "object",
        "properties": {
          "id": {
            "type": "string",
            "description": "Unique identifier for the prediction."
          },
          "urls": {
            "type": "object",
            "description": "Object containing related API endpoints."
          },
          "model": {
            "type": "string",
            "description": "Model ID used for the prediction."
          },
          "status": {
            "type": "string",
            "description": "Status of the task: created, processing, completed, or failed."
          },
          "outputs": {
            "type": "array",
            "items": {
              "type": "string"
            },
            "description": "Array of URLs to the generated video (empty when status is not completed)."
          },
          "created_at": {
            "type": "string",
            "format": "date-time",
            "description": "ISO timestamp of when the request was created."
          },
          "has_nsfw_contents": {
            "type": "array",
            "items": {
              "type": "boolean"
            },
            "description": "Array of boolean values indicating NSFW detection for each output."
          }
        }
      }
    },
    "securitySchemes": {
      "apiKeyAuth": {
        "in": "header",
        "name": "Authorization",
        "type": "apiKey"
      }
    }
  }
}
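
The `default` and `required` entries in the `Input` schema above can be used to fill out a request body on the client side. The following sketch works on a trimmed copy of that schema; whether the server applies the same defaults when a field is omitted is an assumption, and `apply_defaults` is a hypothetical helper.

```python
def apply_defaults(schema: dict, payload: dict) -> dict:
    """Fill in missing fields using the `default` values declared in a JSON schema."""
    filled = dict(payload)
    for name, spec in schema.get("properties", {}).items():
        if name not in filled and "default" in spec:
            filled[name] = spec["default"]
    missing = [name for name in schema.get("required", []) if name not in filled]
    if missing:
        raise ValueError(f"Missing required fields: {missing}")
    return filled


# Trimmed copy of components.schemas.Input from the schema above.
INPUT_SCHEMA = {
    "required": ["model", "prompt"],
    "properties": {
        "model": {"type": "string", "default": "xai/grok-imagine-video/text-to-video"},
        "prompt": {"type": "string"},
        "duration": {"type": "integer", "default": 8},
        "resolution": {"type": "string", "default": "720p"},
        "aspect_ratio": {"type": "string", "default": "16:9"},
    },
}

print(apply_defaults(INPUT_SCHEMA, {"prompt": "A beautiful landscape"}))
```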

LLM-Friendly Prompt Template

# xai/grok-imagine-video/text-to-video

> xAI Grok Imagine Video generates short videos (1-15s) from natural-language prompts at 480p or 720p.


## Overview

- **Submit endpoint (POST)**: `https://api.atlascloud.ai/api/v1/model/generateVideo` — start an async generation; returns a `prediction_id`
- **Poll endpoint (GET)**: `https://api.atlascloud.ai/api/v1/model/prediction/{prediction_id}` — poll this until the prediction finishes
- **Model ID**: `xai/grok-imagine-video/text-to-video`


## API Information

This model can be used via our HTTP API or more conveniently via our client libraries.
See the input and output schema below, as well as the usage examples.


### Input Schema

The API accepts the following input parameters:

- **`model`** (`string`, _required_):
  Model name.
  - Default: `"xai/grok-imagine-video/text-to-video"`

- **`prompt`** (`string`, _required_):
  Natural-language description of the video to generate.

- **`duration`** (`integer`, _optional_):
  Length of generated video in seconds. Range: 1–15.
  - Default: `8`
  - Min: 1
  - Max: 15

- **`resolution`** (`string`, _optional_):
  Output resolution.
  - Default: `"720p"`
  - Options: "480p", "720p"

- **`aspect_ratio`** (`string`, _optional_):
  Output aspect ratio.
  - Default: `"16:9"`
  - Options: "1:1", "16:9", "9:16", "4:3", "3:4", "3:2", "2:3"



**Required Parameters Example**:

```json
{
  "model": "xai/grok-imagine-video/text-to-video",
  "prompt": ""
}
```


**Full Example**:

```json
{
  "model": "xai/grok-imagine-video/text-to-video",
  "prompt": "",
  "duration": 8,
  "resolution": "720p",
  "aspect_ratio": "16:9"
}
```


### Output Schema

The API returns the following output format:


- **`id`** (`string`, _optional_):
  Unique identifier for the prediction.

- **`urls`** (`object`, _optional_):
  Object containing related API endpoints.

- **`model`** (`string`, _optional_):
  Model ID used for the prediction.

- **`status`** (`string`, _optional_):
  Status of the task: created, processing, completed, or failed.

- **`outputs`** (`array[string]`, _optional_):
  Array of URLs to the generated video (empty when status is not completed).

- **`created_at`** (`string`, _optional_):
  ISO timestamp of when the request was created.

- **`has_nsfw_contents`** (`array[boolean]`, _optional_):
  Array of boolean values indicating NSFW detection for each output.



**Example Response**:

```json
{
  "id": "",
  "urls": {},
  "model": "",
  "status": "",
  "outputs": [
    ""
  ],
  "created_at": "",
  "has_nsfw_contents": []
}
```


## Usage Examples

### cURL

```bash
# Step 1: Start generation (async)
curl -X POST "https://api.atlascloud.ai/api/v1/model/generateVideo" \
  -H "Authorization: Bearer $ATLASCLOUD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
  "model": "xai/grok-imagine-video/text-to-video",
  "prompt": "",
  "duration": 8,
  "resolution": "720p",
  "aspect_ratio": "16:9"
}'

# Response will contain: {"code": 200, "data": {"id": "prediction_id", "status": "processing"}}

# Step 2: Poll for result (replace {prediction_id} with the id returned above)
curl -X GET "https://api.atlascloud.ai/api/v1/model/prediction/{prediction_id}" \
  -H "Authorization: Bearer $ATLASCLOUD_API_KEY"

# Keep polling until status is "completed", "succeeded" or "failed"
# When completed, outputs will contain the generated content URL(s)
```

## Additional Resources

### Documentation

- [Model Playground](https://www.atlascloud.ai/models/xai/grok-imagine-video/text-to-video)

A vast desert megacity at dusk, monumental brutalist architecture stretching endlessly beneath a hazy amber sky, cinematic atmosphere inspired by Denis Villeneuve’s visual storytelling, a lone figure in a flowing dark cloak walking slowly through colossal corridors illuminated by soft volumetric light, massive spacecraft hovering silently overhead, drifting dust particles, minimal yet powerful composition, slow cinematic camera movement, ultra realistic textures, deep shadows, atmospheric perspective, muted earth-tone color palette, epic scale, emotional isolation, IMAX framing, subtle lens distortion, immersive sci-fi worldbuilding, haunting ambience, masterpiece cinematography, highly detailed, 8K, film grain, dramatic contrast, slow pacing, surreal futuristic realism.

1. Introduction

Grok Imagine Video Text-to-Video is xAI's text-conditioned video generation endpoint within the broader Grok Imagine multimodal generative system. This README applies to the following API model identifier:

xai/grok-imagine-video/text-to-video

Developed by xAI and built atop technology acquired from the Hotshot startup, Grok Imagine is powered by the "Aurora" engine — a unified autoregressive Mixture-of-Experts model that natively interleaves text, image, video, and audio tokens. The text-to-video endpoint converts natural-language prompts into short, audio-synchronized video clips with cinematic motion, ambient sound, music, and dialogue generated in a single forward pass.

Within the field, xai/grok-imagine-video/text-to-video is positioned as a speed-first, social-media-native competitor to diffusion-based systems like OpenAI Sora 2, Google Veo 3.1, and Runway Gen-4.5. Its autoregressive architecture allows generation in roughly 17–30 seconds — a fraction of the time required by comparable diffusion transformers — while ranking at the top of the Artificial Analysis Video Arena leaderboards in early 2026.

2. Key Features & Innovations

Autoregressive Mixture-of-Experts Architecture: Unlike the diffusion transformers used by most competitors, Grok Imagine Video predicts the next token across interleaved streams of text, image, video, and audio. This unified token-prediction design enables a single backbone to serve five conditioning modes (text-to-image, image-edit, text-to-video, image-to-video, video-edit) and dramatically reduces latency relative to iterative denoising pipelines.
Native Synchronized Audio Generation: Music, sound effects, ambient noise, and dialogue with lip-sync are generated in the same autoregressive pass as the visual stream, rather than being dubbed in after the fact. This produces tight audio-visual coherence that is difficult to achieve with separate video and audio models.
High-Throughput, Low-Latency Inference: Typical generations complete in approximately 17–30 seconds — roughly one-half to one-quarter the time of leading diffusion competitors — making the endpoint practical for interactive ideation and high-volume social-content workflows.
Flexible Output Configuration: Supports clip durations up to 10 seconds in consumer products and 15 seconds via API, at 480p or 720p (with 1080p in Pro preview), at 24 fps, across multiple aspect ratios including 16:9 and 9:16. Up to four concurrent video variants can be requested per API call.
Long Prompt Support: API requests accept prompts up to 10,000 characters, allowing detailed shot descriptions, camera-motion directives, style references, and dialogue scripts to be included in a single conditioning string.
Trained at Frontier Scale: The Aurora backbone was trained on xAI's Colossus supercomputer using a cluster reported at 110,000 NVIDIA GB200 GPUs, enabling the large-scale multimodal token training required for native cross-modal coherence.
Unified Endpoint Family: Sharing weights with the image-to-video, video-edit, and image-generation endpoints means consistent style, motion characteristics, and aesthetic fidelity across creative pipelines that mix conditioning modes.

3. Model Architecture & Technical Details

Core Architecture. Grok Imagine Video uses an autoregressive Mixture-of-Experts (MoE) transformer that operates over a tokenized representation of multimodal streams. Text, image patches, video frames, and audio frames are all encoded into a shared token vocabulary, and the model predicts subsequent tokens conditioned on the prompt and any prior generated tokens. Routing through expert subnetworks allows specialization for different modalities and content categories without inflating per-token compute.
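
To make the routing idea concrete, here is a toy sketch of top-k expert gating, a common way Mixture-of-Experts layers keep per-token compute low. It is a generic illustration only: the gating scheme, the value of `k`, and the scalar "experts" are assumptions and say nothing about how Grok Imagine Video is actually implemented.

```python
import math


def top_k_route(gate_logits, k=2):
    """Pick the k highest-scoring experts and softmax-normalize their weights."""
    ranked = sorted(range(len(gate_logits)), key=lambda i: gate_logits[i], reverse=True)[:k]
    peak = max(gate_logits[i] for i in ranked)
    exps = {i: math.exp(gate_logits[i] - peak) for i in ranked}
    total = sum(exps.values())
    return {i: exps[i] / total for i in ranked}


def moe_forward(x, experts, gate_logits, k=2):
    """Combine the outputs of only the selected experts, weighted by the gate."""
    weights = top_k_route(gate_logits, k)
    return sum(w * experts[i](x) for i, w in weights.items())


# Four toy "experts"; only two of them run for this token.
experts = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3, lambda x: x ** 2]
weights = top_k_route([0.1, 2.0, -1.0, 2.0], k=2)
print(sorted(weights))  # [1, 3]
print(moe_forward(3.0, experts, [0.1, 2.0, -1.0, 2.0]))  # 7.5
```

Only the two selected experts are evaluated, which is why adding more experts grows capacity without growing the work done per token.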

Unified Conditioning. The same Aurora backbone exposes five endpoints — text-to-image, image edit, text-to-video, image-to-video, and video edit — distinguished primarily by the conditioning tokens prepended to the generation context. The xai/grok-imagine-video/text-to-video endpoint conditions strictly on a text prompt, giving it the broadest generative freedom of the family (in contrast to the image-to-video endpoint, which is anchored by a reference image up to 20 MB and as many as seven reference subjects).

Training Infrastructure. Training was conducted on the xAI Colossus cluster, reported at 110,000 NVIDIA GB200 GPUs. The Aurora engine was first launched for still images in December 2024 and progressively expanded to motion and audio modalities through 2025, culminating in the Grok Imagine 1.0 GA release on February 2, 2026.

Release Timeline.

Aug 4, 2025 — Initial launch in the Grok iOS app for Premium+ and SuperGrok subscribers.
Oct 5, 2025 — v0.9 introduced Aurora-powered video at expanded short-form lengths.
Jan 28, 2026 — Grok Imagine API publicly launched.
Feb 2, 2026 — Grok Imagine 1.0 GA: 10-second clips, 720p output, improved native audio.
Apr 2026 — Quality and Speed modes added; Pro mode with 1080p teased.

4. Performance Highlights

In early 2026, xai/grok-imagine-video/text-to-video ranked #1 on the Artificial Analysis Video Arena in both the text-to-video and image-to-video categories, outperforming leading offerings from Runway, OpenAI, and Google. Approximately 1.245 billion videos were generated through Grok Imagine in the 30 days preceding the 1.0 GA release.

Rank	Model	Developer	Category	Notable Date
1	`xai/grok-imagine-video/text-to-video`	xAI	Text-to-Video Arena	Q1 2026
2	Sora 2 Pro	OpenAI	Text-to-Video Arena	2025
3	Veo 3.1	Google DeepMind	Text-to-Video Arena	2025
4	Gen-4.5	Runway	Text-to-Video Arena	2025

Qualitative performance characteristics:

Speed: Generation latency of roughly 17–30 seconds per clip — significantly faster than diffusion-based competitors.
Motion & Cinematic Camera Work: Strong results on dynamic camera movement, creature animation, and stylized motion.
Audio Coherence: Native lip-sync and synchronized SFX/music produce more cohesive output than systems that bolt on audio post-hoc.
Known weak areas: Long-form narrative continuity, multi-shot storytelling, photorealistic spoken-dialogue acting, in-frame text rendering, and structured 4K commercial output. Veo 3.1 retains advantages in clip length and 4K; Kling 3.0 leads on multi-shot narrative; Sora 2 leads on dialogue acting.

5. Use Cases

Short-Form Social Video: Optimized for TikTok, Instagram Reels, YouTube Shorts, and X-native video at 9:16 and 16:9. Fast turnaround and 6–10 second durations align naturally with social feed formats.
Ideation, Storyboarding & Previsualization: The low latency makes the endpoint well suited for rapid iteration on concepts, mood pieces, and motion storyboards before committing to expensive production pipelines.
Memes and Cultural Content: High generation speed and stylistic flexibility support meme-format production and reactive cultural content where time-to-publish matters more than fine-grained polish.
Animation, Creature, and Stylized Motion Work: Strong handling of non-photoreal subjects, animated characters, fantastical creatures, and stylized worlds makes it viable for indie animation, game cinematics, and creative experimentation.
Cinematic Shot Generation: Effective at executing dolly, crane, orbit, and tracking-camera prompts, useful for filmmakers exploring shot language and B-roll concepts.
API-Driven Creative Tooling: Through the public API and partner platforms (WaveSpeedAI, Higgsfield, Scenario, fal.ai, Invideo), developers can embed xai/grok-imagine-video/text-to-video into editing suites, marketing platforms, and generative-content products with concurrent multi-variant requests per call.

Explore Similar Models

Grok Imagine Video v1.5 Image-to-Video

Grok Imagine Video Image-to-Video

Grok Imagine Video Reference-to-Video

Grok Imagine Video Extend

Grok Imagine Video Edit

Sync.so Lipsync v3

VEED Lipsync

Seedance 2.0 Mini Reference-to-Video

Seedance 2.0 Mini Image-to-Video

Seedance 2.0 Mini Text-to-Video

HappyHorse-1.1 Text-to-video

HappyHorse-1.1 Image-to-video

HappyHorse-1.1 Reference-to-video

Gemini Omni Flash Reference-to-Video

Gemini Omni Flash Image-to-Video

Gemini Omni Flash Video Edit
