48 of 73 models
NEW
HOT
Doubao Seed 2.0 Pro
LLM
PRO

Professional-grade model built for advanced workloads, complex analysis, and enterprise AI applications.

Context: 262.14K
Input: $0.5/M tokens
Output: $3/M tokens
Max Output: 131.07K
Cache-Based
Gradient-Based
NEW
HOT
Doubao Seed 2.0 Code Preview
LLM
PREVIEW

Developer-focused model specialized in coding agents, repository understanding, and software engineering.

Context: 262.14K
Input: $0.5/M tokens
Output: $3/M tokens
Max Output: 131.07K
Cache-Based
Gradient-Based
NEW
HOT
Doubao Seed 2.0 Lite
LLM

Ultra-efficient model focused on lightweight AI tasks, rapid inference, and large-scale deployment.

Context: 262.14K
Input: $0.25/M tokens
Output: $2/M tokens
Max Output: 131.07K
Cache-Based
Gradient-Based
NEW
HOT
Doubao Seed 2.0 Mini
LLM

Small yet capable model designed for edge scenarios, automation, and cost-sensitive services.

Context: 262.14K
Input: $0.1/M tokens
Output: $0.4/M tokens
Max Output: 131.07K
Cache-Based
Gradient-Based
NEW
HOT
Doubao Seed 1.8
LLM

Next-generation assistant model with improved instruction following and deeper contextual understanding.

Context: 262.14K
Input: $0.25/M tokens
Output: $2/M tokens
Max Output: 65.54K
Cache-Based
Gradient-Based
NEW
HOT
Doubao Seed 1.6 Flash
LLM

High-speed model engineered for instant responses, real-time interaction, and massive request workloads.

Context: 262.14K
Input: $0.075/M tokens
Output: $0.3/M tokens
Max Output: 32.77K
Cache-Based
Gradient-Based
NEW
HOT
Doubao Seed 1.6
LLM

Versatile foundation model providing reliable conversation, knowledge understanding, and content creation.

Context: 262.14K
Input: $0.25/M tokens
Output: $2/M tokens
Max Output: 65.54K
Cache-Based
Gradient-Based
NEW
HOT
Kimi K2.7 Code
LLM

Powerful coding model for programming, debugging, and AI developer workflows.

Context: 262.14K
Input: $0.95/M tokens
Output: $4/M tokens
Max Output: 262.14K
Cache-Based
NEW
HOT
Grok Build 0.1
LLM

Specialized coding model optimized for software development, code generation, debugging, refactoring, and developer workflows.

Context: 262.14K
Input: $1/M tokens
Output: $2/M tokens
Max Output: 262.14K
Cache-Based
Gradient-Based
NEW
HOT
Grok 4.3
LLM

Advanced conversational AI model optimized for natural dialogue, knowledge exploration, reasoning, and interactive chat experiences.

Context: 1000.00K
Input: $1.25/M tokens
Output: $2.5/M tokens
Max Output: 1000.00K
Cache-Based
Gradient-Based
HOT
Claude Opus 4.8
LLM

Anthropic's most capable model, built for advanced reasoning, complex workflows, deep analysis, and high-quality content generation.

Context: 1000.00K
Input: $5/M tokens
Output: $25/M tokens
Max Output: 128.00K
Cache-Based
NEW
HOT
Gemini 3.5 Flash
LLM

Fast and cost-efficient multimodal model designed for high-throughput applications, real-time interactions, and everyday AI tasks.

Context: 1048.58K
Input: $1.5/M tokens
Output: $9/M tokens
Max Output: 65.54K
Cache-Based
NEW
HOT
DeepSeek V4 Pro
LLM
PRO

DeepSeek V4 Pro is a state-of-the-art large language model combining efficient sparse attention, strong reasoning, and integrated agent capabilities for robust long-context understanding and versatile AI applications.

Context: 1048.58K
Input: $1.68/M tokens
Output: $3.38/M tokens
Max Output: 393.22K
Cache-Based
NEW
HOT
DeepSeek V4 Flash
LLM

DeepSeek V4 Flash is a state-of-the-art large language model combining efficient sparse attention, strong reasoning, and integrated agent capabilities for robust long-context understanding and versatile AI applications.

Context: 1048.58K
Input: $0.14/M tokens
Output: $0.28/M tokens
Max Output: 393.22K
Cache-Based
NEW
OWL
LLM

No description available.

Context: 1048.76K
Input: Free
Output: Free
Max Output: 262.14K
HOT
Kimi K2.6
LLM

Enhanced model for reasoning, coding, and productivity.

Context: 262.14K
Input: $0.95/M tokens
Output: $4/M tokens
Max Output: 262.14K
Cache-Based
NEW
Qwen3.6 35B A3B
LLM

The latest Qwen reasoning model.

Context: 262.14K
Input: $0.161/M tokens
Output: $0.965/M tokens
Max Output: 65.54K
NEW
Qwen3.6 Plus
LLM

Versatile model for chat and productivity workflows.

Context: 1000.00K
Input: $0.325/M tokens
Output: $1.95/M tokens
Max Output: 65.54K
Cache-Based
Gradient-Based
NEW
HOT
GLM 5.2
LLM

Agent-oriented model built for complex reasoning, tool use, and autonomous task execution.

Context: 202.75K
Input: $1.4/M tokens
Output: $4.4/M tokens
Max Output: 202.75K
Cache-Based
NEW
HOT
GLM 5.1
LLM

GLM-5.1 is Z.AI’s latest flagship model, featuring upgrades in two key areas: enhanced programming capabilities and more stable multi-step reasoning/execution. It demonstrates significant improvements in executing complex agent tasks while delivering more natural conversational experiences and superior front-end aesthetics.

Context: 202.75K
Input: $1.26/M tokens
Output: $3.96/M tokens
Max Output: 202.75K
Cache-Based
NEW
HOT
MiniMax M2.7
LLM

MiniMax-M2.7 is a lightweight, state-of-the-art large language model optimized for coding, agentic workflows, and modern application development. With only 10 billion activated parameters, it delivers a major jump in real-world capability while maintaining exceptional latency, scalability, and cost efficiency.

Context: 196.61K
Input: $0.3/M tokens
Output: $1.2/M tokens
Max Output: 196.61K
Cache-Based
NEW
HOT
MiniMax M3
LLM

MiniMax M3 is a lightweight, state-of-the-art large language model optimized for coding, agentic workflows, and modern application development. With only 10 billion activated parameters, it delivers a major jump in real-world capability while maintaining exceptional latency, scalability, and cost efficiency.

Context: 524.30K
Input: $0.42/M tokens
Output: $1.68/M tokens
Max Output: 524.29K
Cache-Based
NEW
Qwen3.5 122B A10B
LLM

Qwen3.5 represents a significant leap forward, integrating breakthroughs in multimodal learning, architectural efficiency, reinforcement learning scale, and global accessibility to empower developers and enterprises with unprecedented capability and efficiency.

Context: 262.14K
Input: $0.3/M tokens
Output: $2.4/M tokens
Max Output: 65.54K
NEW
Qwen3.5 35B A3B
LLM

Qwen3.5 represents a significant leap forward, integrating breakthroughs in multimodal learning, architectural efficiency, reinforcement learning scale, and global accessibility to empower developers and enterprises with unprecedented capability and efficiency.

Context: 262.14K
Input: $0.225/M tokens
Output: $1.8/M tokens
Max Output: 65.54K
NEW
Qwen3.5 27B
LLM

Qwen3.5 represents a significant leap forward, integrating breakthroughs in multimodal learning, architectural efficiency, reinforcement learning scale, and global accessibility to empower developers and enterprises with unprecedented capability and efficiency.

Context: 262.14K
Input: $0.27/M tokens
Output: $2.16/M tokens
Max Output: 65.54K
NEW
Qwen3 Coder Next
LLM

Qwen3 Coder represents a significant leap forward, integrating breakthroughs in multimodal learning, architectural efficiency, reinforcement learning scale, and global accessibility to empower developers and enterprises with unprecedented capability and efficiency.

Context: 262.14K
Input: $0.18/M tokens
Output: $1.35/M tokens
Max Output: 262.14K
Gradient-Based
NEW
Qwen3.5 397B A17B
LLM

Qwen3.5 represents a significant leap forward, integrating breakthroughs in multimodal learning, architectural efficiency, reinforcement learning scale, and global accessibility to empower developers and enterprises with unprecedented capability and efficiency.

Context: 262.14K
Input: $0.55/M tokens
Output: $3.5/M tokens
Max Output: 65.54K
Cache-Based
HOT
MiniMax M2.5
LLM

MiniMax-M2.5 is a lightweight, state-of-the-art large language model optimized for coding, agentic workflows, and modern application development. With only 10 billion activated parameters, it delivers a major jump in real-world capability while maintaining exceptional latency, scalability, and cost efficiency.

Context: 196.61K
Input: $0.295/M tokens
Output: $1.2/M tokens
Max Output: 196.61K
Cache-Based
NEW
HOT
GLM 5v Turbo
LLM
TURBO

GLM-5v Turbo is Z.AI’s latest flagship model, featuring upgrades in two key areas: enhanced programming capabilities and more stable multi-step reasoning/execution. It demonstrates significant improvements in executing complex agent tasks while delivering more natural conversational experiences and superior front-end aesthetics.

Context: 202.75K
Input: $1.2/M tokens
Output: $4/M tokens
Max Output: 131.07K
Cache-Based
NEW
HOT
GLM 5 Turbo
LLM
TURBO

GLM-5 Turbo is Z.AI’s latest flagship model, featuring upgrades in two key areas: enhanced programming capabilities and more stable multi-step reasoning/execution. It demonstrates significant improvements in executing complex agent tasks while delivering more natural conversational experiences and superior front-end aesthetics.

Context: 262.14K
Input: $1.2/M tokens
Output: $4/M tokens
Max Output: 131.07K
Cache-Based
NEW
HOT
GLM 5
LLM

GLM-5 is Z.AI’s latest flagship model, featuring upgrades in two key areas: enhanced programming capabilities and more stable multi-step reasoning/execution. It demonstrates significant improvements in executing complex agent tasks while delivering more natural conversational experiences and superior front-end aesthetics.

Context: 202.75K
Input: $0.95/M tokens
Output: $3.15/M tokens
Max Output: 202.75K
Cache-Based
Qwen3 VL 30B A3B Thinking
LLM

The latest Qwen reasoning model.

Context: 128.00K
Input: $0.15/M tokens
Output: $1.5/M tokens
Max Output: 32.00K
Qwen3 VL 8B Instruct
LLM

The latest Qwen reasoning model.

Context: 128.00K
Input: $0.08/M tokens
Output: $0.5/M tokens
Max Output: 32.00K
Qwen3 VL 30B A3B Instruct
LLM

The latest Qwen reasoning model.

Context: 128.00K
Input: $0.15/M tokens
Output: $0.6/M tokens
Max Output: 32.00K
HOT
Kimi K2.5
LLM

Powerful model for long-context and intelligent workflows.

Context: 262.14K
Input: $0.49/M tokens
Output: $2.5/M tokens
Max Output: 262.14K
Cache-Based
NEW
Qwen3.7 Max
LLM

Flagship model for advanced reasoning, coding, and complex tasks.

Context: 1000.00K
Input: $2.5/M tokens
Output: $7.5/M tokens
Max Output: 67.07K
Cache-Based
NEW
Qwen3.7 Plus
LLM

Balanced model combining strong capability, speed, and efficiency.

Context: 1000.00K
Input: $0.4/M tokens
Output: $1.6/M tokens
Max Output: 67.07K
Cache-Based
Gradient-Based
NEW
Qwen3.5 Plus
LLM

Efficient model for everyday tasks and AI assistants.

Context: 1000.00K
Input: $0.4/M tokens
Output: $2.4/M tokens
Max Output: 67.07K
Cache-Based
Gradient-Based
NEW
Qwen3.5 Flash
LLM

Fast model optimized for instant responses and large-scale usage.

Context: 1000.00K
Input: $0.1/M tokens
Output: $0.4/M tokens
Max Output: 67.07K
NEW
Qwen3 Max 20260123
LLM

Qwen3-Max is a flagship large language model designed for ultra-long context understanding, powerful reasoning, and high-performance text and code generation, making it well suited for complex, large-scale, and production-grade AI applications.

Context: 252.00K
Input: $1.2/M tokens
Output: $6/M tokens
Max Output: 32.00K
Gradient-Based
HOT
MiniMax M2.1
LLM

MiniMax-M2.1 is a lightweight, state-of-the-art large language model optimized for coding, agentic workflows, and modern application development. With only 10 billion activated parameters, it delivers a major jump in real-world capability while maintaining exceptional latency, scalability, and cost efficiency.

Context: 196.61K
Input: $0.29/M tokens
Output: $0.95/M tokens
Max Output: 196.61K
Cache-Based
NEW
HOT
GLM 4.7
LLM

GLM-4.7 is Z.AI’s latest flagship model, featuring upgrades in two key areas: enhanced programming capabilities and more stable multi-step reasoning/execution. It demonstrates significant improvements in executing complex agent tasks while delivering more natural conversational experiences and superior front-end aesthetics.

Context: 202.75K
Input: $0.52/M tokens
Output: $1.85/M tokens
Max Output: 202.75K
Cache-Based
NEW
HOT
DeepSeek V3.2
LLM

DeepSeek V3.2 is a state-of-the-art large language model combining efficient sparse attention, strong reasoning, and integrated agent capabilities for robust long-context understanding and versatile AI applications.

Context: 163.84K
Input: $0.26/M tokens
Output: $0.38/M tokens
Max Output: 163.84K
Cache-Based
NEW
HOT
GPT 5.4
LLM

Advanced multimodal model optimized for reasoning, coding, content generation, and complex problem-solving with strong accuracy and reliability.

Context: 400.00K
Input: $2.5/M tokens
Output: $15/M tokens
Max Output: 128.00K
Cache-Based
Gradient-Based
NEW
HOT
GPT 5.5
LLM

Advanced multimodal model optimized for reasoning, coding, content generation, and complex problem-solving with strong accuracy and reliability.

Context: 1050.00K
Input: $5/M tokens
Output: $30/M tokens
Max Output: 128.00K
Cache-Based
Gradient-Based
NEW
HOT
KwaiKAT
LLM
PRO

KAT Coder Pro V2

KAT Coder Pro is KwaiKAT's most advanced agentic coding model in the KAT-Coder series. Designed specifically for agentic coding tasks, it excels in real-world software engineering scenarios, achieving a 73.4% solve rate on the SWE-Bench Verified benchmark.

Context: 262.14K
Input: $0.3/M tokens
Output: $1.2/M tokens
Max Output: 144.00K
Cache-Based
NEW
HOT
Gemini 3.1 Pro Preview
LLM
PRO
PREVIEW

Preview version of Google's flagship reasoning model, offering enhanced analytical capabilities, long-context understanding, and advanced multimodal performance.

Context: 1000.00K
Input: $2/M tokens
Output: $12/M tokens
Max Output: 64.00K
Cache-Based
Gradient-Based
HOT
MiniMax M2
LLM

MiniMax-M2 is a lightweight, state-of-the-art large language model optimized for coding, agentic workflows, and modern application development. With only 10 billion activated parameters, it delivers a major jump in real-world capability while maintaining exceptional latency, scalability, and cost efficiency.

Context: 196.61K
Input: $0.255/M tokens
Output: $1/M tokens
Max Output: 196.61K
Cache-Based
