Accueil
Explorer
Moonshot LLM Models
moonshotai/Kimi-K2-Instruct
Kimi-K2-Instruct
LLM

Kimi-K2 Instruct API by Moonshot

moonshotai/Kimi-K2-Instruct
Kimi-K2-Instruct

Kimi's latest and most powerful open-source model.

Paramètres

Exemple de code

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("ATLASCLOUD_API_KEY"),
    base_url="https://api.atlascloud.ai/v1"
)

response = client.chat.completions.create(
    model="moonshotai/Kimi-K2-Instruct",
    messages=[
    {
        "role": "user",
        "content": "hello"
    }
],
    max_tokens=1024,
    temperature=0.7
)

print(response.choices[0].message.content)

Installer

Installez le package requis pour votre langage.

bash
pip install requests

Authentification

Toutes les requêtes API nécessitent une authentification via une clé API. Vous pouvez obtenir votre clé API depuis le tableau de bord Atlas Cloud.

bash
export ATLASCLOUD_API_KEY="your-api-key-here"

En-têtes HTTP

python
import os

API_KEY = os.environ.get("ATLASCLOUD_API_KEY")
headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {API_KEY}"
}
Protégez votre clé API

N'exposez jamais votre clé API dans du code côté client ou dans des dépôts publics. Utilisez plutôt des variables d'environnement ou un proxy backend.

Soumettre une requête

import requests

url = "https://api.atlascloud.ai/v1/chat/completions"
headers = {
    "Content-Type": "application/json",
    "Authorization": "Bearer $ATLASCLOUD_API_KEY"
}
data = {
    "model": "your-model",
    "messages": [{"role": "user", "content": "Hello"}],
    "max_tokens": 1024
}

response = requests.post(url, headers=headers, json=data)
print(response.json())

Schema d'entrée

Les paramètres suivants sont acceptés dans le corps de la requête.

Total: 9Requis: 2Optionnel: 7
modelstringrequired
The model ID to use for the completion.
Example: "moonshotai/Kimi-K2-Instruct"
messagesarray[object]required
A list of messages comprising the conversation so far.
rolestringrequired
The role of the message author. One of "system", "user", or "assistant".
systemuserassistant
contentstringrequired
The content of the message.
max_tokensinteger
The maximum number of tokens to generate in the completion.
Default: 1024Min: 1
temperaturenumber
Sampling temperature between 0 and 2. Higher values make output more random, lower values more focused and deterministic.
Default: 0.7Min: 0Max: 2
top_pnumber
Nucleus sampling parameter. The model considers the tokens with top_p probability mass.
Default: 1Min: 0Max: 1
streamboolean
If set to true, partial message deltas will be sent as server-sent events.
Default: false
stoparray[string]
Up to 4 sequences where the API will stop generating further tokens.
frequency_penaltynumber
Penalizes new tokens based on their existing frequency in the text so far. Between -2.0 and 2.0.
Default: 0Min: -2Max: 2
presence_penaltynumber
Penalizes new tokens based on whether they appear in the text so far. Between -2.0 and 2.0.
Default: 0Min: -2Max: 2

Exemple de corps de requête

json
{
  "model": "moonshotai/Kimi-K2-Instruct",
  "messages": [
    {
      "role": "user",
      "content": "Hello"
    }
  ],
  "max_tokens": 1024,
  "temperature": 0.7,
  "stream": false
}

Schema de sortie

L'API renvoie une réponse compatible ChatCompletion.

idstringrequired
Unique identifier for the completion.
objectstringrequired
Object type, always "chat.completion".
Default: "chat.completion"
createdintegerrequired
Unix timestamp of when the completion was created.
modelstringrequired
The model used for the completion.
choicesarray[object]required
List of completion choices.
indexintegerrequired
Index of the choice.
messageobjectrequired
The generated message.
finish_reasonstringrequired
The reason generation stopped.
stoplengthcontent_filter
usageobjectrequired
Token usage statistics.
prompt_tokensintegerrequired
Number of tokens in the prompt.
completion_tokensintegerrequired
Number of tokens in the completion.
total_tokensintegerrequired
Total tokens used.

Exemple de réponse

json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1700000000,
  "model": "model-name",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I assist you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 20,
    "total_tokens": 30
  }
}

Atlas Cloud Skills

Atlas Cloud Skills intègre plus de 300 modèles d'IA directement dans votre assistant de codage IA. Une seule commande pour installer, puis utilisez le langage naturel pour générer des images, des vidéos et discuter avec des LLM.

Clients pris en charge

Claude Code
OpenAI Codex
Gemini CLI
Cursor
Windsurf
VS Code
Trae
GitHub Copilot
Cline
Roo Code
Amp
Goose
Replit
40+ clients pris en charge

Installer

bash
npx skills add AtlasCloudAI/atlas-cloud-skills

Configurer la clé API

Obtenez votre clé API depuis le tableau de bord Atlas Cloud et définissez-la comme variable d'environnement.

bash
export ATLASCLOUD_API_KEY="your-api-key-here"

Fonctionnalités

Une fois installé, vous pouvez utiliser le langage naturel dans votre assistant IA pour accéder à tous les modèles Atlas Cloud.

Génération d'imagesGénérez des images avec des modèles comme Nano Banana 2, Z-Image, et plus encore.
Création de vidéosCréez des vidéos à partir de texte ou d'images avec Kling, Vidu, Veo, etc.
Chat LLMDiscutez avec Qwen, DeepSeek et d'autres grands modèles de langage.
Téléchargement de médiasTéléchargez des fichiers locaux pour l'édition d'images et les workflows image-vers-vidéo.

Serveur MCP

Le serveur MCP Atlas Cloud connecte votre IDE avec plus de 300 modèles d'IA via le Model Context Protocol. Compatible avec tout client compatible MCP.

Clients pris en charge

Cursor
VS Code
Windsurf
Claude Code
OpenAI Codex
Gemini CLI
Cline
Roo Code
100+ clients pris en charge

Installer

bash
npx -y atlascloud-mcp

Configuration

Ajoutez la configuration suivante au fichier de paramètres MCP de votre IDE.

json
{
  "mcpServers": {
    "atlascloud": {
      "command": "npx",
      "args": [
        "-y",
        "atlascloud-mcp"
      ],
      "env": {
        "ATLASCLOUD_API_KEY": "your-api-key-here"
      }
    }
  }
}

Outils disponibles

atlas_generate_imageGénérez des images à partir de prompts textuels.
atlas_generate_videoCréez des vidéos à partir de texte ou d'images.
atlas_chatDiscutez avec de grands modèles de langage.
atlas_list_modelsParcourez plus de 300 modèles d'IA disponibles.
atlas_quick_generateCréation de contenu en une étape avec sélection automatique du modèle.
atlas_upload_mediaTéléchargez des fichiers locaux pour les workflows API.

Kimi-K2-Instruct

1. Model Introduction

Kimi K2 is a state-of-the-art mixture-of-experts (MoE) language model with 32 billion activated parameters and 1 trillion total parameters. Trained with the Muon optimizer, Kimi K2 achieves exceptional performance across frontier knowledge, reasoning, and coding tasks while being meticulously optimized for agentic capabilities.

Key Features

  • Large-Scale Training: Pre-trained a 1T parameter MoE model on 15.5T tokens with zero training instability.
  • MuonClip Optimizer: We apply the Muon optimizer to an unprecedented scale, and develop novel optimization techniques to resolve instabilities while scaling up.
  • Agentic Intelligence: Specifically designed for tool use, reasoning, and autonomous problem-solving.

Model Variants

  • Kimi-K2-Base: The foundation model, a strong start for researchers and builders who want full control for fine-tuning and custom solutions.
  • Kimi-K2-Instruct: The post-trained model best for drop-in, general-purpose chat and agentic experiences. It is a reflex-grade model without long thinking.

Image 12: Evaluation Results

2. Model Summary

ArchitectureMixture-of-Experts (MoE)
Total Parameters1T
Activated Parameters32B
Number of Layers (Dense layer included)61
Number of Dense Layers1
Attention Hidden Dimension7168
MoE Hidden Dimension (per Expert)2048
Number of Attention Heads64
Number of Experts384
Selected Experts per Token8
Number of Shared Experts1
Vocabulary Size160K
Context Length128K
Attention MechanismMLA
Activation FunctionSwiGLU

3. Evaluation Results

Instruction model evaluation results

BenchmarkMetricKimi K2 InstructDeepSeek-V3-0324Qwen3-235B-A22B (non-thinking)Claude Sonnet 4 (w/o extended thinking)Claude Opus 4 (w/o extended thinking)GPT-4.1Gemini 2.5 Flash Preview (05-20)
Coding Tasks
LiveCodeBench v6 (Aug 24 - May 25)Pass@153.746.937.048.547.444.744.7
OJBenchPass@127.124.011.315.319.619.519.5
MultiPL-EPass@185.783.178.288.689.686.785.6
SWE-bench Verified (Agentless Coding)Single Patch w/o Test (Acc)51.836.639.450.253.040.832.6
SWE-bench Verified (Agentic Coding)Single Attempt (Acc)65.838.834.472.7*72.5*54.6
Multiple Attempts (Acc)71.680.279.4*
SWE-bench Multilingual (Agentic Coding)Single Attempt (Acc)47.325.820.951.031.5
TerminalBenchInhouse Framework (Acc)30.035.543.28.3
Terminus (Acc)25.016.36.630.316.8
Aider-PolyglotAcc60.055.161.856.470.752.444.0
Tool Use Tasks
Tau2 retailAvg@470.669.157.075.081.874.864.3
Tau2 airlineAvg@456.539.026.555.560.054.542.5
Tau2 telecomAvg@465.832.522.145.257.038.616.9
AceBenchAcc76.572.770.576.275.680.174.5
Math & STEM Tasks
AIME 2024Avg@6469.659.4*40.1*43.448.246.561.3
AIME 2025Avg@6449.546.724.7*33.1*33.9*37.046.6
MATH-500Acc97.494.0*91.2*94.094.492.495.4
HMMT 2025Avg@3238.827.511.915.915.919.434.7
CNMO 2024Avg@1674.374.748.660.457.656.675.0
PolyMath-enAvg@465.159.551.952.849.854.049.9
ZebraLogicAcc89.084.037.7*73.759.358.557.9
AutoLogiAcc89.588.983.389.886.188.284.1
GPQA-DiamondAvg@875.168.4*62.9*70.0*74.9*66.368.2
SuperGPQAAcc57.253.750.255.756.550.849.6
Humanity's Last Exam (Text Only)-4.75.25.75.87.13.75.6
General Tasks
MMLUEM89.589.487.091.592.990.490.1
MMLU-ReduxEM92.790.589.293.694.292.490.6
MMLU-ProEM81.181.2*77.383.786.681.879.4
IFEvalPrompt Strict89.881.183.2*87.687.488.084.3
Multi-ChallengeAcc54.131.434.046.849.036.439.5
SimpleQACorrect31.027.713.215.922.842.323.3
LivebenchPass@176.472.467.674.874.669.867.8

• Bold denotes global SOTA, and underlined denotes open-source SOTA.

• Data points marked with * are taken directly from the model's tech report or blog.

• All metrics, except for SWE-bench Verified (Agentless), are evaluated with an 8k output token length. SWE-bench Verified (Agentless) is limited to a 16k output token length.

• Kimi K2 achieves 65.8% pass@1 on the SWE-bench Verified tests with bash/editor tools (single-attempt patches, no test-time compute). It also achieves a 47.3% pass@1 on the SWE-bench Multilingual tests under the same conditions. Additionally, we report results on SWE-bench Verified tests (71.6%) that leverage parallel test-time compute by sampling multiple sequences and selecting the single best via an internal scoring model.

• To ensure the stability of the evaluation, we employed avg@k on the AIME, HMMT, CNMO, PolyMath-en, GPQA-Diamond, EvalPlus, Tau2.

• Some data points have been omitted due to prohibitively expensive evaluation costs.


Base model evaluation results

BenchmarkMetricShotKimi K2 BaseDeepseek-V3-BaseQwen2.5-72BLlama 4 Maverick
General Tasks
MMLUEM5-shot87.887.186.184.9
MMLU-proEM5-shot69.260.662.863.5
MMLU-redux-2.0EM5-shot90.289.587.888.2
SimpleQACorrect5-shot35.326.510.323.7
TriviaQAEM5-shot85.184.176.079.3
GPQA-DiamondAvg@85-shot48.150.540.849.4
SuperGPQAEM5-shot44.739.234.238.8
Coding Tasks
LiveCodeBench v6Pass@11-shot26.322.921.125.1
EvalPlusPass@1-80.365.666.065.5
Mathematics Tasks
MATHEM4-shot70.260.161.063.0
GSM8kEM8-shot92.191.790.486.3
Chinese Tasks
C-EvalEM5-shot92.590.090.980.9
CSimpleQACorrect5-shot77.672.150.553.5

Découvrir des modèles similaires

Commencez avec Plus de 300 Modèles,

Explorer tous les modèles

Join our Discord community

Join the Discord community for the latest model updates, prompts, and support.