
Build video production pipelines on the Shengshu API: Vidu Q3 generates 16-second cinematic clips with native audio and smart camera cuts from a single call.
Generate cinematic, high-fidelity videos from text and images with the latest AI video generation models on Atlas Cloud.
Compare standard vs. our pricing across every Shengshu model.
| Model | Standard Price (USD/s) | Our Price (USD/s) | Discount |
|---|---|---|---|
| Vidu Q3-Mix Reference to Video | $0.125 | From $0.106 | -15% |
| Vidu Q3 Reference to Video | $0.05 | From $0.042 | -15% |
| Vidu Q3-Pro Start-end-to-video | $0.05 | From $0.042 | -15% |
| Vidu Q3-Turbo Image-to-video | $0.04 | From $0.034 | -15% |
| Vidu Q3-Turbo Start-end-to-video | $0.04 | From $0.034 | -15% |
| Vidu Q3-Turbo Text-to-video | $0.04 | From $0.034 | -15% |
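The relationship between the two price columns can be checked with a few lines of arithmetic. The sketch below copies the per-second figures from the table and applies a 15% discount. Truncating to three decimal places reproduces every listed price, but that rounding rule is an inference from the numbers, not a documented billing policy. It also prints the discount each listed price actually implies, which comes out between 15% and 16%.

```python
from decimal import Decimal, ROUND_DOWN

# (standard, listed) per-second prices in USD, copied from the table above.
PRICES = {
    "Vidu Q3-Mix Reference to Video": (Decimal("0.125"), Decimal("0.106")),
    "Vidu Q3 Reference to Video": (Decimal("0.05"), Decimal("0.042")),
    "Vidu Q3-Pro Start-end-to-video": (Decimal("0.05"), Decimal("0.042")),
    "Vidu Q3-Turbo Image-to-video": (Decimal("0.04"), Decimal("0.034")),
    "Vidu Q3-Turbo Start-end-to-video": (Decimal("0.04"), Decimal("0.034")),
    "Vidu Q3-Turbo Text-to-video": (Decimal("0.04"), Decimal("0.034")),
}


def discounted(standard: Decimal, discount: Decimal = Decimal("0.15")) -> Decimal:
    """Apply the discount, then truncate (assumed, not rounded) to 3 decimals."""
    return (standard * (1 - discount)).quantize(Decimal("0.001"), rounding=ROUND_DOWN)


def effective_discount_pct(standard: Decimal, listed: Decimal) -> Decimal:
    """Discount implied by the listed price, as a percentage with one decimal."""
    return ((standard - listed) / standard * 100).quantize(Decimal("0.1"))


for name, (standard, listed) in PRICES.items():
    print(f"{name}: computed {discounted(standard)}, listed {listed}, "
          f"effective discount {effective_discount_pct(standard, listed)}%")
```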
Vidu Q3's 16-second clips, native audio, and multi-reference consistency make it practical for workflows that previously required a production team. Teams use the different Q3 tiers to move from fast iteration to finished assets without switching providers.
Studios and indie creators use Vidu Q3-Mix to generate multi-episode animated content where characters look identical across every scene. By uploading character reference sheets, each new clip inherits the same facial features, costume, and visual style without manual frame-by-frame consistency work. Shengshu demonstrated this workflow at SXSW 2026 as the first AI solution for animated series production.
Marketing teams upload a brand character's reference images once and use Vidu Q3 Reference-to-Video to generate dozens of short-form clips for TikTok, Reels, and YouTube Shorts. The character stays visually identical across every output, removing the design bottleneck of briefing and approving each asset individually. At $0.042 per second on Atlas Cloud, a 10-second clip costs about $0.42, so even a batch of dozens stays well under a dollar per clip.
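The per-clip and per-batch arithmetic is easy to reproduce. This is a minimal sketch that assumes billing is strictly linear in generated seconds, with no minimum charge or rounding. The batch size of 24 is an arbitrary example, not a figure from Atlas Cloud.

```python
from decimal import Decimal

# Vidu Q3 Reference-to-Video per-second price in USD, as quoted on this page.
PRICE_PER_SECOND = Decimal("0.042")


def clip_cost(seconds: int, price: Decimal = PRICE_PER_SECOND) -> Decimal:
    """Cost of one clip, assuming linear per-second billing."""
    return price * seconds


def batch_cost(clips: int, seconds: int, price: Decimal = PRICE_PER_SECOND) -> Decimal:
    """Cost of `clips` clips that all have the same length."""
    return clip_cost(seconds, price) * clips


print(clip_cost(10))       # one 10-second clip
print(batch_cost(24, 10))  # an example batch of 24 ten-second clips
```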
E-commerce teams supply product photos from multiple angles as reference inputs and generate cinematic marketing clips that show the product in motion with native ambient audio. The output arrives with synchronized sound in the same call, ready for ads and product pages without a video shoot or audio edit. The start-end frame control lets teams precisely direct how the product is revealed across each clip.
Directors use Vidu Q3-Pro's camera control to generate previsualization clips with specified movements, such as push-ins on a subject, pans across a set, and tracking shots following a character. Native 16-second output means a full short scene can be previsualized in one call. This replaces early-stage storyboard work with motion-accurate reference material for cast and crew.
Development teams use Vidu Q3-Turbo to run batch generation pipelines at $0.034 per second, producing dozens of short clips from text or image inputs per hour. The lower per-second cost makes it practical to generate and test many creative variants before selecting which to scale with Q3-Pro. Both models run under the same Atlas Cloud API key with a single parameter change between tiers.
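A draft-then-finalize pipeline along these lines might look like the sketch below. The model identifiers (`vidu-q3-turbo`, `vidu-q3-pro`) and the request field names are hypothetical placeholders, not the documented Atlas Cloud schema. Only the per-second prices come from this page. The example builds request bodies and totals their cost without sending anything over the network.

```python
from decimal import Decimal

# Hypothetical model identifiers and field names, for illustration only.
TURBO = "vidu-q3-turbo"
PRO = "vidu-q3-pro"

# Per-second Atlas Cloud prices quoted on this page (USD).
PRICE_PER_SECOND = {TURBO: Decimal("0.034"), PRO: Decimal("0.042")}


def build_request(model: str, prompt: str, duration: int) -> dict:
    """Build a request body; only the `model` value differs between tiers."""
    return {"model": model, "prompt": prompt, "duration": duration}


def promote(request: dict, model: str = PRO) -> dict:
    """Return a copy of a draft request retargeted at another tier."""
    return {**request, "model": model}


def cost(requests: list) -> Decimal:
    """Total cost of the requests at the quoted per-second prices."""
    return sum((PRICE_PER_SECOND[r["model"]] * r["duration"] for r in requests),
               Decimal("0"))


prompts = [f"Product teaser, variant {i}" for i in range(20)]
drafts = [build_request(TURBO, p, 8) for p in prompts]
finals = [promote(d) for d in drafts[:3]]  # pretend the first three were selected

print("drafts on Q3-Turbo:", cost(drafts))
print("finals on Q3-Pro:", cost(finals))
```

Keeping the tier in a single field means the selected drafts can be regenerated on the higher tier with the same prompt and duration.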
Tourism boards and travel platforms use Vidu Q3-Pro text-to-video to generate atmospheric destination clips with natural ambient sound from descriptive text prompts. A written scene description of a landscape, landmark, or cultural setting produces a 16-second cinematic clip with matching audio in one call. This provides a cost-effective alternative to location shoots for content that drives booking intent.
Vidu Q3 generates up to 16 seconds of continuous video in a single API call at 1080p and 24fps. This is the longest single-pass generation window among leading video models in its tier. Clip duration is configurable per call within that maximum.
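Because each call is capped at 16 seconds, a longer sequence has to be planned as several calls. The helper below is a sketch of that planning step. It assumes whole-second durations and no minimum clip length, neither of which is stated on this page, and the function name is illustrative.

```python
MAX_DURATION_S = 16  # single-call maximum stated above


def plan_calls(total_seconds: int) -> list:
    """Split a target running time into per-call durations of at most 16 s."""
    if total_seconds <= 0:
        raise ValueError("total_seconds must be positive")
    full_calls, remainder = divmod(total_seconds, MAX_DURATION_S)
    return [MAX_DURATION_S] * full_calls + ([remainder] if remainder else [])


print(plan_calls(40))  # a 40-second sequence needs three calls
```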
Yes. Vidu Q3 produces dialogue, sound effects, background music, and lip-sync simultaneously with the video frames in a single inference pass. There is no post-production dubbing or manual audio alignment step. Audio timing and on-screen action are synchronized automatically.
You describe camera movement directly in the text prompt — push-ins, pans, tracking shots — and the model executes them from the first frame. No separate parameter or control layer is required. This applies to both text-to-video and image-to-video endpoints on Atlas Cloud.
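Since camera direction travels inside the prompt text, a small helper can keep scene descriptions and camera moves separate in code and join them only at request time. The "Camera:" phrasing below is an assumed convention for readability, not a syntax the model is documented to require.

```python
def with_camera(scene: str, *moves: str) -> str:
    """Append camera directions to a scene description as plain prompt text."""
    scene = scene.strip().rstrip(".")
    if not moves:
        return scene + "."
    return f"{scene}. Camera: {', then '.join(moves)}."


prompt = with_camera(
    "A lighthouse on a cliff at dusk",
    "slow push-in on the lantern room",
    "pan right across the bay",
)
print(prompt)
```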
Vidu Q3-Pro delivers cinematic-grade output with smooth motion and rich detail, priced at $0.042 per second on Atlas Cloud. Vidu Q3-Turbo generates at higher speed with a lower per-second cost of $0.034, suited for drafts and rapid iteration. Both share the same 1080p output resolution and native audio support.
The Vidu Q3 Reference-to-Video endpoint accepts between 1 and 4 reference images per call. You can combine subjects, environments, costumes, and visual styles from different images in a single generation. This is the primary way to maintain character and scene consistency across multiple clips.
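The 1-to-4 image limit is worth enforcing on the client before a call is made. The sketch below validates the count and assembles a request body. The model identifier, field names, and example URLs are hypothetical, so check the Atlas Cloud API reference for the real schema.

```python
MIN_REFERENCES = 1
MAX_REFERENCES = 4  # limits stated above for Reference-to-Video


def build_reference_request(prompt: str, reference_images: list,
                            model: str = "vidu-q3-reference") -> dict:
    """Validate the reference count and build a (hypothetical) request body."""
    count = len(reference_images)
    if not MIN_REFERENCES <= count <= MAX_REFERENCES:
        raise ValueError(
            f"expected {MIN_REFERENCES} to {MAX_REFERENCES} reference images, got {count}"
        )
    return {"model": model, "prompt": prompt, "images": list(reference_images)}


request = build_reference_request(
    "The mascot waves at the camera in a sunny park",
    ["https://example.com/mascot-front.png", "https://example.com/mascot-side.png"],
)
print(request["images"])
```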
Vidu Q3-Mix is the highest-tier reference model in the Vidu Q3 lineup, priced at $0.106 per second on Atlas Cloud. It delivers the strongest multi-subject consistency when combining multiple reference images in one generation. It is designed for workflows like animated series production and branded content where character identity must remain visually identical across many clips.
Yes. Both Vidu Q3-Pro and Q3-Turbo have a Start-end-to-video endpoint on Atlas Cloud. You supply a starting frame image and describe the desired motion or end state, and the model generates the transition. This gives precise directorial control over how each scene opens and closes.
Vidu Q3-Turbo starts at $0.034 per second. Vidu Q3-Pro and the Reference-to-Video endpoint are $0.042 per second. Vidu Q3-Mix, the highest-consistency reference model, is $0.106 per second. All tiers are priced at 15% below standard Shengshu API rates and are available pay-as-you-go.