
Hunyuan 3D Rapid Text-to-3D API by Tencent

tencent/hunyuan3d-rapid/text-to-3d

Tencent Hunyuan 3D Rapid (Express): fast, lightweight 3D mesh generation from a text prompt, with optional PBR materials. Output formats: GLB, OBJ, USDZ, FBX, STL, and MP4.

1. Introduction

Hunyuan 3D is a family of generative AI models from Tencent that produce high-resolution textured 3D meshes from text prompts, single images, multi-view images, or sketches. This README applies to the following API model identifiers:

  • tencent/hunyuan3d-rapid/image-to-3d
  • tencent/hunyuan3d-rapid/text-to-3d
  • tencent/hunyuan3d-pro/image-to-3d
  • tencent/hunyuan3d-pro/text-to-3d

Developed by Tencent AI Lab as part of the broader Hunyuan multimodal model family, Hunyuan 3D is designed to bridge the gap between 2D content creation and production-ready 3D asset generation. The system reached commercial general availability on Tencent Cloud in late November 2025, positioning itself as a state-of-the-art alternative to systems such as Trellis, Direct3D, Shap-E, and TEXTure across both shape and texture generation tasks.

The four API identifiers above map to two underlying generation tiers. The tencent/hunyuan3d-rapid/* variants are distilled, latency-optimized models that complete generations in roughly 2–3 minutes with fixed mid-range polygon budgets and 1K textures, while the tencent/hunyuan3d-pro/* variants invoke the full model with configurable polygon counts (40K–1.5M), up to 4K PBR textures, multi-view image conditioning, and specialized generation modes such as low-poly and sketch-driven synthesis. Within each tier, the image-to-3d and text-to-3d suffixes correspond to mutually exclusive input modalities accepted by the same backend job.
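The identifier scheme described above can be sketched as a small parser. This is an illustrative helper, not part of any official SDK: the names `ModelRoute` and `parse_model_id` are hypothetical, and the only facts taken from this README are the four identifier strings and their tier/modality meaning.

```python
from dataclasses import dataclass

# Identifiers covered by this README (copied from the list above).
MODEL_IDS = (
    "tencent/hunyuan3d-rapid/image-to-3d",
    "tencent/hunyuan3d-rapid/text-to-3d",
    "tencent/hunyuan3d-pro/image-to-3d",
    "tencent/hunyuan3d-pro/text-to-3d",
)


@dataclass(frozen=True)
class ModelRoute:
    """Hypothetical container for the two parts of an identifier."""
    tier: str      # "rapid" or "pro"
    modality: str  # "text" or "image"


def parse_model_id(model_id: str) -> ModelRoute:
    """Split an identifier into its generation tier and input modality."""
    if model_id not in MODEL_IDS:
        raise ValueError(f"unknown model identifier: {model_id!r}")
    _vendor, family, task = model_id.split("/")
    tier = family.removeprefix("hunyuan3d-")   # requires Python 3.9+
    modality = task.removesuffix("-to-3d")
    return ModelRoute(tier=tier, modality=modality)
```

For example, `parse_model_id("tencent/hunyuan3d-pro/image-to-3d")` yields a route with tier `pro` and modality `image`, which a client could use to pick the tier-specific request options.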


2. Key Features & Innovations

  • Two-Stage Decoupled Pipeline: The system separates geometry from appearance via Hunyuan3D-DiT, a flow-based diffusion transformer that generates the underlying shape, and Hunyuan3D-Paint, a multi-view diffusion model that synthesizes textures (with full PBR support starting from version 2.1). This decoupling allows independent optimization of each stage and supports texture re-painting on existing meshes.

  • Rapid vs. Pro Tiers for Different Workloads: tencent/hunyuan3d-rapid/image-to-3d and tencent/hunyuan3d-rapid/text-to-3d are distilled variants tuned for fast iteration, suitable for prototyping and high-volume pipelines. tencent/hunyuan3d-pro/image-to-3d and tencent/hunyuan3d-pro/text-to-3d expose the full model with four generation modes (standard, low-poly, sketch, multi-view), configurable mesh resolution, and higher-fidelity 4K textures.

  • Multi-Modal Conditioning: The Pro tier accepts text prompts (up to 1024 characters in English or Chinese), single reference images, or multi-view image arrays via the MultiViewImages[] parameter, enabling consistent reconstruction from orthographic or turntable captures. Image inputs must be JPG, PNG, or WEBP, with dimensions between 128 and 5000 pixels and a file size of at most 6 MB.

  • PBR Material Output: From version 2.1 onward, the texture pipeline emits physically based rendering maps (albedo, normal, metallic, roughness), making outputs directly usable in Unreal, Unity, and standard DCC tools without manual material authoring.

  • High-Resolution Geometry: Successive model iterations have scaled volumetric resolution from 512³ to 1024³ (2.5) and 1536³ with a 3.6B-parameter voxel backbone (3.0), with the PolyGen extension producing all-quad topology and 8K textures suited to animation and engine pipelines.

  • Production-Oriented Tooling: The Hunyuan3D Studio companion stack adds polygon optimization, semantic UV unwrapping, automatic rigging, part decomposition, and PBR baking, addressing downstream concerns that pure shape generators typically defer to artists.

  • Broad Format Support: Default output is GLB, with optional export to OBJ, FBX, STL, and USDZ to cover game engines, web viewers, AR, and 3D printing workflows.
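The input limits listed under Multi-Modal Conditioning lend themselves to a client-side pre-check before a job is submitted. The sketch below is a minimal example under stated assumptions: the function names are hypothetical, "6 MB" is read as 6 MiB, the 128 to 5000 pixel range is applied to each side of the image, and `jpeg` is treated as an alias of JPG. None of these interpretations is confirmed by the README, and the service may validate differently.

```python
from typing import List

MAX_PROMPT_CHARS = 1024                      # limit quoted for text prompts
ALLOWED_IMAGE_FORMATS = {"jpg", "jpeg", "png", "webp"}  # "jpeg" alias is an assumption
MIN_SIDE_PX, MAX_SIDE_PX = 128, 5000         # assumed to apply per side
MAX_IMAGE_BYTES = 6 * 1024 * 1024            # assumption: "6 MB" read as 6 MiB


def check_prompt(prompt: str) -> List[str]:
    """Return a list of problems with a text prompt (empty list means OK)."""
    problems = []
    if not prompt.strip():
        problems.append("prompt is empty")
    if len(prompt) > MAX_PROMPT_CHARS:
        problems.append(f"prompt has {len(prompt)} characters, limit is {MAX_PROMPT_CHARS}")
    return problems


def check_image(fmt: str, width: int, height: int, size_bytes: int) -> List[str]:
    """Return a list of problems with an image input (empty list means OK)."""
    problems = []
    if fmt.lower().lstrip(".") not in ALLOWED_IMAGE_FORMATS:
        problems.append(f"unsupported format: {fmt}")
    if not all(MIN_SIDE_PX <= side <= MAX_SIDE_PX for side in (width, height)):
        problems.append(f"dimensions {width}x{height} outside {MIN_SIDE_PX}-{MAX_SIDE_PX} px")
    if size_bytes > MAX_IMAGE_BYTES:
        problems.append(f"file is {size_bytes} bytes, limit is {MAX_IMAGE_BYTES}")
    return problems
```

Returning a list of problems rather than raising on the first one lets a batch pipeline report every issue with an asset in a single pass.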


3. Model Architecture & Technical Details

The Hunyuan 3D architecture is anchored by two principal generative models operating in sequence:

  • Hunyuan3D-DiT (Shape): A flow-matching diffusion transformer that operates on a learned 3D latent representation. It conditions on text or image embeddings to produce a volumetric shape, which is then meshed at the requested resolution. The 2.0 release used a ~17B-parameter configuration; the 3.0 release scaled the voxel backbone to 3.6B parameters operating at 1536³ resolution.

  • Hunyuan3D-Paint (Texture): A multi-view diffusion model that renders consistent textured views around the generated geometry, which are then back-projected and fused into UV-space texture maps. Version 2.1 introduced full PBR-aware multi-view diffusion, producing albedo, normal, metallic, and roughness channels jointly.
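The sequencing of the two models can be pictured as two interchangeable stages. The following is a toy sketch only: `Mesh`, `generate`, `repaint`, and the stub stages are hypothetical names invented for illustration, and the stubs return placeholder strings instead of calling Hunyuan3D-DiT or Hunyuan3D-Paint. It is meant to show why keeping geometry and appearance separate makes re-texturing an existing mesh possible without regenerating its shape.

```python
from dataclasses import dataclass, replace
from typing import Callable, Optional


@dataclass(frozen=True)
class Mesh:
    """Toy stand-in for a generated asset (hypothetical type)."""
    geometry: str                  # placeholder for vertices and faces
    texture: Optional[str] = None  # placeholder for UV-space texture maps


# Stage signatures: the shape stage maps a condition to geometry,
# the paint stage adds appearance to geometry it is given.
ShapeStage = Callable[[str], Mesh]
PaintStage = Callable[[Mesh, str], Mesh]


def generate(condition: str, shape: ShapeStage, paint: PaintStage) -> Mesh:
    """Run the two stages in sequence: geometry first, then texture."""
    untextured = shape(condition)
    return paint(untextured, condition)


def repaint(mesh: Mesh, condition: str, paint: PaintStage) -> Mesh:
    """Re-texture an existing mesh without touching its geometry."""
    return paint(mesh, condition)


# Stub stages so the sketch runs end to end; real stages would call the models.
def stub_shape(condition: str) -> Mesh:
    return Mesh(geometry=f"shape({condition})")


def stub_paint(mesh: Mesh, condition: str) -> Mesh:
    return replace(mesh, texture=f"texture({condition})")


chair = generate("wooden chair", stub_shape, stub_paint)
red_chair = repaint(chair, "red lacquered chair", stub_paint)
```

Here `red_chair` keeps the geometry of `chair` and differs only in its texture field, which mirrors the re-painting workflow in miniature.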

Released model versions, listed here by version number rather than release date: 1.0 (November 2024), 2.0 (January 2025, arXiv 2501.12202), 2.1 (June 2025, fully open-source PBR), 2.5 (April 2025, 1024³ geometry; released before 2.1), 3.0 (September 2025, 1536³ geometry), and the PolyGen extension producing quad-dominant topology with 8K textures. The Hunyuan3D Studio paper (arXiv 2509.12815) describes the post-generation processing pipeline, including auto-rigging and semantic UV layout.
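To put the quoted grid resolutions in perspective, the cell count of a cubic grid grows with the cube of its side length. The short calculation below is plain arithmetic on the three side lengths mentioned in this README; the helper names are hypothetical, and it says nothing about how the models actually store or sparsify their volumes.

```python
def voxel_count(side: int) -> int:
    """Number of cells in a dense cubic grid with `side` cells per edge."""
    return side ** 3


def growth(old_side: int, new_side: int) -> float:
    """How many times more cells the larger grid has than the smaller one."""
    return voxel_count(new_side) / voxel_count(old_side)


# Grid side lengths quoted in this README.
for side in (512, 1024, 1536):
    print(f"{side}^3 = {voxel_count(side):,} cells")
```

Doubling the side from 512 to 1024 multiplies the cell count by 8, while the step from 1024 to 1536 multiplies it by 3.375, giving roughly 3.6 billion cells at 1536³.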

The 1.0, 2.0, 2.1, and Omni variants are released as open-source weights on GitHub and Hugging Face, with the public 2.x line runnable on 24 GB consumer GPUs. The 2.5, 3.0, and PolyGen tiers — which back the production API endpoints — remain closed-source and are accessible only through the Tencent Cloud API.

On the API surface, the four identifiers resolve to two job submission endpoints:

| API Identifier | Backend Job | Typical Latency | Texture | Polygon Budget | Default Concurrency |
| --- | --- | --- | --- | --- | --- |
| tencent/hunyuan3d-rapid/text-to-3d | SubmitHunyuanTo3DRapidJob | 2–3 min | 1K | Fixed mid-range | 1 |
| tencent/hunyuan3d-rapid/image-to-3d | SubmitHunyuanTo3DRapidJob | 2–3 min | 1K | Fixed mid-range | 1 |
| tencent/hunyuan3d-pro/text-to-3d | SubmitHunyuanTo3DProJob | 3–6 min | up to 4K PBR | 40K–1.5M | 3 |
| tencent/hunyuan3d-pro/image-to-3d | SubmitHunyuanTo3DProJob | 3–6 min | up to 4K PBR | 40K–1.5M | 3 |
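The latency and concurrency figures above allow a rough throughput estimate per tier. The sketch below is back-of-the-envelope arithmetic, not a service guarantee: it assumes every concurrency slot stays busy, jobs start back to back with no queueing or polling overhead, and the quoted latency range holds. `TierLimits`, `TIERS`, and `jobs_per_hour` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class TierLimits:
    backend_job: str
    latency_min: Tuple[int, int]  # (fastest, slowest) typical latency in minutes
    concurrency: int              # default number of concurrent jobs


# Values copied from the table above.
TIERS: Dict[str, TierLimits] = {
    "rapid": TierLimits("SubmitHunyuanTo3DRapidJob", (2, 3), 1),
    "pro": TierLimits("SubmitHunyuanTo3DProJob", (3, 6), 3),
}


def jobs_per_hour(tier: str) -> Tuple[int, int]:
    """Return (pessimistic, optimistic) completed jobs per hour for a tier."""
    limits = TIERS[tier]
    fastest, slowest = limits.latency_min
    low = limits.concurrency * 60 // slowest
    high = limits.concurrency * 60 // fastest
    return (low, high)
```

Under these assumptions the Rapid tier lands at about 20 to 30 jobs per hour and the Pro tier at about 30 to 60, so at default concurrency the Pro tier can match or exceed Rapid in hourly throughput even though each individual job takes longer.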

4. Performance Highlights

Hunyuan 3D has been evaluated against leading open and proprietary 3D generation systems on standard shape and texture benchmarks.

Shape Generation — On ULIP-T, ULIP-I, Uni3D-T, and Uni3D-I evaluation metrics, Hunyuan 3D outperforms Trellis, Direct3D, Craftsman, Michelangelo, and Shap-E for both text- and image-conditioned shape synthesis.

Texture Generation — On CMMD, FID, and CLIP-Score, Hunyuan 3D surpasses TexPainter, SyncMVD, Text2Tex, Paint3D, and TEXTure, achieving a CLIP-Score of 0.809 and producing more semantically aligned, higher-fidelity textures.

| Area | Competing Systems | Metrics | Result |
| --- | --- | --- | --- |
| Shape (text/image → 3D) | Trellis, Direct3D, Craftsman, Michelangelo, Shap-E | ULIP-T/I, Uni3D-T/I | Hunyuan 3D leads |
| Texture (multi-view → UV) | TexPainter, SyncMVD, Text2Tex, Paint3D, TEXTure | CMMD, FID, CLIP-Score | Hunyuan 3D leads (CLIP-Score 0.809) |

Independent third-party evaluations have corroborated these results in domain-specific reconstruction tasks, including a peer-reviewed plant reconstruction study that identified the system as state-of-the-art among publicly available generative 3D models.


5. Use Cases

  • Game Development: Generate game-ready props, characters, and environment assets directly from concept art or prompts, with the Pro tier's auto-rigging and quad-topology features (via PolyGen and Studio tooling) making outputs suitable for skeletal animation and engine import.

  • E-Commerce & Product Visualization: Convert single product photographs into rotatable 3D models for catalog pages, AR try-on, and configurators, leveraging tencent/hunyuan3d-rapid/image-to-3d for high-volume catalog ingestion.

  • AR / VR / XR Content: Produce USDZ- and GLB-formatted assets for immersive experiences, web-based 3D viewers, and spatial computing platforms where PBR materials are required for realistic shading.

  • 3D Printing & Hobbyist Fabrication: Export to STL for direct printing of figurines, collectibles, and custom models — a workflow that has seen public adoption for designs such as Pop Mart Labubu-style figures.

  • Animation, VFX, and Advertising: Rapidly generate background props, set dressing, and stylized characters for pre-visualization and short-form content, with the Pro tier's low-poly and sketch modes serving stylized art directions.

  • Industrial Design & Digital Twins: Bootstrap design exploration from sketches via the Pro tier's sketch mode, and produce mesh assets for robotics simulation and digital twin environments where geometric fidelity matters more than artistic styling.
