Qwen3-235B-A22B-Instruct-2507
LLM

Qwen3-235B-A22B-Instruct-2507 API by Alibaba

Qwen/Qwen3-235B-A22B-Instruct-2507

235B-parameter MoE instruct (non-thinking) model in the Qwen3 series.

Code example

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("ATLASCLOUD_API_KEY"),
    base_url="https://api.atlascloud.ai/v1"
)

response = client.chat.completions.create(
    model="Qwen/Qwen3-235B-A22B-Instruct-2507",
    messages=[
        {
            "role": "user",
            "content": "hello"
        }
    ],
    max_tokens=1024,
    temperature=0.7
)

print(response.choices[0].message.content)

Install

Install the packages required for your programming language. The Python examples on this page use the openai and requests packages.

bash
pip install openai requests

Authentication

All API requests require authentication with an API key. You can get your API key from the Atlas Cloud dashboard.

bash
export ATLASCLOUD_API_KEY="your-api-key-here"

HTTP headers

python
import os

API_KEY = os.environ.get("ATLASCLOUD_API_KEY")
headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {API_KEY}"
}

Keep your API key secure

Never expose your API key in client-side code or public repositories. Use environment variables or a backend proxy.
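
One way to follow this advice is to read the key from the environment in a single place and fail early when it is missing. The sketch below is illustrative only; `load_api_key` and `auth_headers` are hypothetical helper names, not part of any Atlas Cloud SDK.

```python
import os


def load_api_key(env_var: str = "ATLASCLOUD_API_KEY") -> str:
    """Return the API key from the environment, or raise if it is missing."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"{env_var} is not set; export it before calling the API")
    return key


def auth_headers(api_key: str) -> dict:
    """Build the Content-Type and Authorization headers from a key."""
    return {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }
```

Calling `auth_headers(load_api_key())` at startup keeps the key out of source files and surfaces a missing variable before any request is sent.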

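Send a request

import os

import requests

url = "https://api.atlascloud.ai/v1/chat/completions"
headers = {
    "Content-Type": "application/json",
    # Read the key from the environment; Python does not expand "$VAR" inside strings.
    "Authorization": f"Bearer {os.environ['ATLASCLOUD_API_KEY']}"
}
data = {
    "model": "Qwen/Qwen3-235B-A22B-Instruct-2507",
    "messages": [{"role": "user", "content": "Hello"}],
    "max_tokens": 1024
}

response = requests.post(url, headers=headers, json=data)
response.raise_for_status()  # surface HTTP errors instead of printing an error body
print(response.json())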

Input schema

The following parameters are accepted in the request body.

Total: 9 · Required: 2 · Optional: 7
model (string, required)
The model ID to use for the completion.
Example: "Qwen/Qwen3-235B-A22B-Instruct-2507"

messages (array[object], required)
A list of messages comprising the conversation so far.

  role (string, required)
  The role of the message author. One of "system", "user", or "assistant".

  content (string, required)
  The content of the message.

max_tokens (integer)
The maximum number of tokens to generate in the completion.
Default: 1024. Min: 1.

temperature (number)
Sampling temperature between 0 and 2. Higher values make output more random, lower values more focused and deterministic.
Default: 0.7. Min: 0. Max: 2.

top_p (number)
Nucleus sampling parameter. The model considers the tokens with top_p probability mass.
Default: 1. Min: 0. Max: 1.

stream (boolean)
If set to true, partial message deltas will be sent as server-sent events.
Default: false.

stop (array[string])
Up to 4 sequences where the API will stop generating further tokens.

frequency_penalty (number)
Penalizes new tokens based on their existing frequency in the text so far. Between -2.0 and 2.0.
Default: 0. Min: -2. Max: 2.

presence_penalty (number)
Penalizes new tokens based on whether they appear in the text so far. Between -2.0 and 2.0.
Default: 0. Min: -2. Max: 2.
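
The documented ranges can be checked on the client before a request is sent. The following is a minimal sketch, assuming you want local validation; `build_payload` is a hypothetical helper, and the bounds are copied from the schema above rather than from any official client.

```python
def build_payload(model, messages, **options):
    """Validate options against the documented ranges and return a request body."""
    # (min, max) bounds taken from the schema; None means unbounded on that side.
    bounds = {
        "max_tokens": (1, None),
        "temperature": (0, 2),
        "top_p": (0, 1),
        "frequency_penalty": (-2, 2),
        "presence_penalty": (-2, 2),
    }
    allowed = set(bounds) | {"stream", "stop"}
    roles = {"system", "user", "assistant"}

    if not model or not messages:
        raise ValueError("model and messages are required")
    for msg in messages:
        if msg.get("role") not in roles or not isinstance(msg.get("content"), str):
            raise ValueError(f"invalid message: {msg!r}")

    for name, value in options.items():
        if name not in allowed:
            raise ValueError(f"unknown parameter: {name}")
        if name in bounds:
            low, high = bounds[name]
            if (low is not None and value < low) or (high is not None and value > high):
                raise ValueError(f"{name}={value} is outside [{low}, {high}]")
    if len(options.get("stop", [])) > 4:
        raise ValueError("stop accepts at most 4 sequences")

    return {"model": model, "messages": messages, **options}
```

The returned dictionary can be passed as the JSON body of a POST to the chat completions endpoint. Omitted options are left out so the server-side defaults apply.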

Example request body

json
{
  "model": "Qwen/Qwen3-235B-A22B-Instruct-2507",
  "messages": [
    {
      "role": "user",
      "content": "Hello"
    }
  ],
  "max_tokens": 1024,
  "temperature": 0.7,
  "stream": false
}
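
When `stream` is set to true, the response arrives as server-sent events rather than a single JSON object. The sketch below shows one way to collect the text. It assumes the OpenAI-style wire format, which this page does not spell out: `data: {...}` lines carrying `choices[0].delta.content`, ended by a `data: [DONE]` line. `iter_stream_text` is a hypothetical helper name.

```python
import json


def iter_stream_text(lines):
    """Yield text fragments from an iterable of server-sent event lines."""
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":  # assumed end-of-stream sentinel
            break
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text
```

With the requests library, the lines can come from `requests.post(url, headers=headers, json=data, stream=True).iter_lines()`, which yields bytes. The helper accepts both bytes and strings.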

Output schema

The API returns a ChatCompletion-compatible response.

id (string, required)
Unique identifier for the completion.

object (string, required)
Object type, always "chat.completion".
Default: "chat.completion".

created (integer, required)
Unix timestamp of when the completion was created.

model (string, required)
The model used for the completion.

choices (array[object], required)
List of completion choices.

  index (integer, required)
  Index of the choice.

  message (object, required)
  The generated message.

  finish_reason (string, required)
  The reason generation stopped. One of "stop", "length", or "content_filter".

usage (object, required)
Token usage statistics.

  prompt_tokens (integer, required)
  Number of tokens in the prompt.

  completion_tokens (integer, required)
  Number of tokens in the completion.

  total_tokens (integer, required)
  Total tokens used.

Example response

json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1700000000,
  "model": "model-name",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I assist you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 20,
    "total_tokens": 30
  }
}
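
A response with this shape can be reduced to the fields most callers need. The sketch below is one possible approach; `summarize_completion` is a hypothetical helper, and treating a `finish_reason` of "length" as a truncated reply is an assumption based on the usual ChatCompletion convention.

```python
import json


def summarize_completion(body: dict) -> dict:
    """Pull the reply text, finish reason, and token counts out of a response."""
    choice = body["choices"][0]
    usage = body["usage"]
    return {
        "text": choice["message"]["content"],
        # Assumption: "length" means generation stopped at the max_tokens limit.
        "truncated": choice["finish_reason"] == "length",
        "prompt_tokens": usage["prompt_tokens"],
        "completion_tokens": usage["completion_tokens"],
        "total_tokens": usage["total_tokens"],
    }


# The example response from this page, trimmed to the fields that are read.
sample = """
{
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "Hello! How can I assist you today?"},
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 10, "completion_tokens": 20, "total_tokens": 30}
}
"""
summary = summarize_completion(json.loads(sample))
print(summary["text"])
```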

Atlas Cloud Skills

Atlas Cloud Skills integrates more than 300 AI models directly into your AI coding assistant. Install it with one command, then use natural language to generate images and videos and to chat with LLMs.

Supported clients

Claude Code
OpenAI Codex
Gemini CLI
Cursor
Windsurf
VS Code
Trae
GitHub Copilot
Cline
Roo Code
Amp
Goose
Replit
40+ supported clients

Install

bash
npx skills add AtlasCloudAI/atlas-cloud-skills

Configure the API key

Get your API key from the Atlas Cloud dashboard and set it as an environment variable.

bash
export ATLASCLOUD_API_KEY="your-api-key-here"

Features

After installation, you can use natural language in your AI assistant to access all Atlas Cloud models.

Image generation: Generate images with models such as Nano Banana 2, Z-Image, and more.
Video creation: Create videos from text or images with Kling, Vidu, Veo, and others.
LLM chat: Chat with Qwen, DeepSeek, and other large language models.
Media upload: Upload local files for image-editing and image-to-video workflows.

MCP Server

The Atlas Cloud MCP Server connects your IDE to more than 300 AI models through the Model Context Protocol. It works with any MCP-compatible client.

Supported clients

Cursor
VS Code
Windsurf
Claude Code
OpenAI Codex
Gemini CLI
Cline
Roo Code
100+ supported clients

Install

bash
npx -y atlascloud-mcp

Configuration

Add the following configuration to your IDE's MCP configuration file.

json
{
  "mcpServers": {
    "atlascloud": {
      "command": "npx",
      "args": [
        "-y",
        "atlascloud-mcp"
      ],
      "env": {
        "ATLASCLOUD_API_KEY": "your-api-key-here"
      }
    }
  }
}

Available tools

atlas_generate_image: Generate images from text prompts.
atlas_generate_video: Create videos from text or images.
atlas_chat: Chat with large language models.
atlas_list_models: Browse more than 300 available AI models.
atlas_quick_generate: One-step content creation with automatic model selection.
atlas_upload_media: Upload local files for API workflows.

Qwen3-235B-A22B

Advanced multilingual AI with 128K-token context, excelling in coding, reasoning, and enterprise applications.

Qwen 3 Model Description

Qwen3-235B-A22B, developed by Alibaba Cloud, is a flagship large language model leveraging a Mixture-of-Experts (MoE) architecture. With 235 billion total parameters and 22 billion active per inference, it delivers top-tier performance in coding, math, and reasoning across 119 languages. Optimized for enterprise tasks like software development and research, it’s accessible via AI/ML API.

Technical Specifications

Performance Benchmarks

Qwen3-235B-A22B uses a Transformer-based MoE architecture that activates 22 billion of its 235 billion parameters per token via top-8 expert selection, which reduces compute costs. It uses Rotary Positional Embeddings and Grouped-Query Attention for efficiency. Pre-trained on 36 trillion tokens across 119 languages, it uses RLHF and a four-stage post-training process for hybrid reasoning.
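
To make top-8 expert selection concrete, here is a toy sketch of top-k routing in plain Python. It is not Qwen3's implementation: the expert count of 16, the scalar "experts", and the softmax over only the selected logits are all illustrative assumptions.

```python
import math


def route_token(router_logits, k=8):
    """Pick the k highest-scoring experts and softmax-normalize their weights."""
    top = sorted(range(len(router_logits)), key=lambda i: router_logits[i], reverse=True)[:k]
    peak = max(router_logits[i] for i in top)  # subtract the max for numerical stability
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


def moe_layer(x, experts, router_logits, k=8):
    """Combine the outputs of only the selected experts, weighted by the router."""
    return sum(w * experts[i](x) for i, w in route_token(router_logits, k))


# Toy setup: 16 scalar "experts", of which only 8 run for this token.
experts = [lambda x, scale=i: x * scale for i in range(16)]
logits = [float(i) for i in range(16)]
output = moe_layer(2.0, experts, logits, k=8)
```

Only the eight selected experts are evaluated, which is why the active parameter count per token is much smaller than the total.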

  • Context Window: 32K tokens natively, extendable to 128K with YaRN.

  • Benchmarks:

    • Outperforms OpenAI’s o3-mini on AIME (math) and Codeforces (coding).
    • Surpasses Gemini 2.5 Pro on BFCL (function calling) and LiveCodeBench (coding).
    • MMLU score: 0.828, competitive with DeepSeek R1.
  • Performance: 40.1 tokens/second output speed, 0.54s latency (TTFT).

  • API Pricing:

    • Input tokens: $0.21 per million tokens
    • Output tokens: $0.63 per million tokens
    • Cost for 1,000 input tokens plus 1,000 output tokens: $0.00021 (input) + $0.00063 (output) = $0.00084 total
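
The pricing arithmetic can be wrapped in a small helper for estimating the cost of a request from its token counts. This is a sketch that hard-codes the per-million rates listed above. `estimate_cost` is a hypothetical name, and rates may change, so treat the constants as placeholders.

```python
INPUT_USD_PER_MILLION = 0.21   # input rate listed above
OUTPUT_USD_PER_MILLION = 0.63  # output rate listed above


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Return the estimated cost in USD for one request."""
    return (
        prompt_tokens * INPUT_USD_PER_MILLION
        + completion_tokens * OUTPUT_USD_PER_MILLION
    ) / 1_000_000
```

For 1,000 input tokens and 1,000 output tokens, `estimate_cost(1000, 1000)` comes to about $0.00084, in line with the worked figure in the list. The `prompt_tokens` and `completion_tokens` values in a response's `usage` object can be passed in directly.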

Performance Metrics

[Figure: Qwen3-235B-A22B comparison]

Key Capabilities

Qwen3-235B-A22B excels in hybrid reasoning, toggling between thinking mode (/think) for step-by-step problem-solving and non-thinking mode (/no_think) for rapid responses. It supports 119 languages, enabling seamless global applications like multilingual chatbots and translation. With a 128K-token context, it processes large datasets, codebases, and documents with high coherence, using XML delimiters for structure retention.

  • Coding Excellence: Outperforms OpenAI’s o1 on LiveCodeBench, supporting 40+ languages (Python, Java, Haskell, etc.). Generates, debugs, and refactors complex codebases with precision.
  • Advanced Reasoning: Surpasses o3-mini on AIME for math and on BFCL for function calling, ideal for intricate problem-solving.
  • Multilingual Proficiency: Natively handles 119 languages, powering cross-lingual tasks like semantic analysis and translation.
  • Enterprise Applications: Drives biomedical literature parsing, financial risk modeling, e-commerce intent prediction, and legal document analysis.
  • Agentic Workflows: Supports tool-calling, Model Context Protocol (MCP), and function calling for autonomous AI agents.
  • API Features: Offers streaming, OpenAI-API compatibility, and structured output generation for real-time integration.

Optimal Use Cases

Qwen3-235B-A22B is tailored for high-complexity enterprise scenarios requiring deep reasoning and scalability:

  • Software Development: Autonomous code generation, debugging, and refactoring for large-scale projects, with superior performance on Codeforces and LiveCodeBench.
  • Biomedical Research: Parsing dense medical literature, structuring clinical notes, and generating patient dialogues with high accuracy.
  • Financial Modeling: Risk analysis, regulatory query answering, and financial document summarization with precise numerical reasoning.
  • Multilingual E-commerce: Semantic product categorization, user intent prediction, and multilingual chatbot deployment across 119 languages.
  • Legal Analysis: Multi-document review for regulatory compliance and legal research, leveraging 128K-token context for coherence.

Comparison with Other Models

Qwen3-235B-A22B stands out among leading models due to its MoE efficiency and multilingual capabilities:

  • vs. OpenAI’s o3-mini: Outperforms in math (AIME) and coding (Codeforces), with lower latency (0.54s TTFT vs. 0.7s). Offers broader language support (119 vs. ~20 languages).
  • vs. Google’s Gemini 2.5 Pro: Excels in function calling (BFCL) and coding (LiveCodeBench), with more efficient inference via MoE.
  • vs. DeepSeek R1: Matches MMLU performance (0.828) but surpasses in multilingual tasks and enterprise scalability, with cheaper API pricing.
  • vs. GPT-4.1: Competitive in coding and reasoning, with lower costs and native 119-language support, unlike GPT-4.1’s English focus.
