TL;DR: Set model to auto and NinjaChat routes each request to the model that best balances quality, speed, and cost for your prompt. The response can include a routing object that shows which model was chosen and why.

The problem smart routing solves

Picking the right model is hard. You need to know that o3-mini is best for math, claude-sonnet-4.6 for code, gemini-3.1-pro for creative writing, and gemini-3-flash for quick factual answers. Smart routing does this automatically.

The four auto variants

| Model ID | Optimizes for | Best when… |
| --- | --- | --- |
| auto | Quality + speed balance | You want the best model without thinking about it |
| auto-fast | Lowest latency | Real-time apps, chatbots, low-latency pipelines |
| auto-cheap | Lowest cost | High-volume jobs, batch processing, cost-sensitive apps |
| auto-quality | Highest quality | Enterprise use, critical decisions, best possible output |
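
One way to keep the variant choice in a single place in client code is a small lookup. This is a minimal sketch: the helper name pick_variant and the priority labels ("balanced", "latency", "cost", "quality") are hypothetical and not part of the NinjaChat API. Only the four model IDs come from the table above.

```python
# Hypothetical priority labels mapped to the documented auto variants.
AUTO_VARIANTS = {
    "balanced": "auto",
    "latency": "auto-fast",
    "cost": "auto-cheap",
    "quality": "auto-quality",
}


def pick_variant(priority="balanced"):
    """Return the auto model ID for a given optimization priority."""
    try:
        return AUTO_VARIANTS[priority]
    except KeyError:
        valid = ", ".join(sorted(AUTO_VARIANTS))
        raise ValueError(f"unknown priority {priority!r}; expected one of: {valid}") from None
```

The returned string goes straight into the model field of the request body.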

Request

curl -X POST https://ninjachat.ai/api/v1/chat \
  -H "Authorization: Bearer nj_sk_YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "auto",
    "messages": [{"role": "user", "content": "Solve: x² + 5x + 6 = 0"}],
    "include_routing": true
  }'

Response

{
  "id": "chatcmpl-1749584400000",
  "object": "chat.completion",
  "model": "o3-mini",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "Factoring: x² + 5x + 6 = (x + 2)(x + 3) = 0\nSolutions: x = -2 and x = -3"
    },
    "finish_reason": "stop"
  }],
  "routing": {
    "requested": "auto",
    "resolved": "o3-mini",
    "task_type": "math",
    "reasoning": "Detected task type: math"
  },
  "cost": {"this_request": "$0.006"},
  "balance": "$4.820"
}
The routing field shows exactly which model was chosen and why. Add "include_routing": true to always see this.
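
Because the routing object is only present when include_routing is set, client code should not assume it exists. The sketch below reads the four fields shown in the sample response and returns None when routing is absent. The function name describe_routing is hypothetical.

```python
def describe_routing(response):
    """Summarize the routing object of a parsed chat response, if present."""
    routing = response.get("routing")
    if routing is None:
        # include_routing was not set, so there is nothing to report.
        return None
    return (
        f'{routing["requested"]} -> {routing["resolved"]} '
        f'({routing["task_type"]}): {routing["reasoning"]}'
    )
```

Pass it the dictionary you get from parsing the JSON response body.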

How task detection works

NinjaChat analyzes your last 3 user messages using keyword pattern matching:

| Task type | Detected keywords | auto routes to |
| --- | --- | --- |
| code | function, debug, implement, algorithm, TypeScript, SQL… | claude-sonnet-4.6 |
| math | equation, solve, calculate, integral, probability… | o3-mini |
| creative | write, story, poem, imagine, fiction, lyrics… | gemini-3.1-pro |
| analysis | analyze, compare, evaluate, research, summarize… | gpt-5 |
| quick | Short prompts under 80 chars, “what is”, “define”… | gemini-3-flash |
| general | Everything else | gpt-5 |
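
To make the idea concrete, here is a rough local approximation of keyword-based task detection. It is not NinjaChat's implementation. The keyword lists contain only the entries shown in the table (the published lists are truncated), and several details are assumptions: the order in which task types are checked, whole-word matching, and treating "under 80 chars" and the quick phrases as alternatives applied to the latest user message. Expect it to disagree with the service on real prompts.

```python
import re

# Only the keywords listed in the table; the real lists are longer.
TASK_KEYWORDS = {
    "code": ["function", "debug", "implement", "algorithm", "typescript", "sql"],
    "math": ["equation", "solve", "calculate", "integral", "probability"],
    "creative": ["write", "story", "poem", "imagine", "fiction", "lyrics"],
    "analysis": ["analyze", "compare", "evaluate", "research", "summarize"],
}
QUICK_PHRASES = ["what is", "define"]


def detect_task(messages):
    """Guess a task type from the last 3 user messages (approximation only)."""
    user_texts = [m["content"] for m in messages if m.get("role") == "user"][-3:]
    if not user_texts:
        return "general"
    text = " ".join(user_texts).lower()
    # Assumed precedence: keyword categories in table order, then quick, then general.
    for task, keywords in TASK_KEYWORDS.items():
        if any(re.search(rf"\b{re.escape(k)}\b", text) for k in keywords):
            return task
    latest = user_texts[-1].lower()
    if len(latest) < 80 or any(phrase in latest for phrase in QUICK_PHRASES):
        return "quick"
    return "general"
```

Checking keyword categories before the quick rule mirrors the sample request earlier on this page, where a short prompt containing "Solve" is classified as math rather than quick.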

Full routing table

| Task | Model |
| --- | --- |
| code | claude-sonnet-4.6 |
| math | o3-mini |
| creative | gemini-3.1-pro |
| analysis | gpt-5 |
| quick | gemini-3-flash |
| general | gpt-5 |
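
The table can be mirrored in client code, for example to log when a response resolves to a model other than the documented one. This is a sketch under assumptions: the names AUTO_ROUTING, expected_model, and matches_table are hypothetical, and falling back to the general model for an unrecognized task type is a guess based on "Everything else". The table is described for plain auto, so the check is only meaningful when that model was requested.

```python
# Copy of the routing table above.
AUTO_ROUTING = {
    "code": "claude-sonnet-4.6",
    "math": "o3-mini",
    "creative": "gemini-3.1-pro",
    "analysis": "gpt-5",
    "quick": "gemini-3-flash",
    "general": "gpt-5",
}


def expected_model(task_type):
    """Model the table lists for a task type (unknown types fall back to general)."""
    return AUTO_ROUTING.get(task_type, AUTO_ROUTING["general"])


def matches_table(routing):
    """True if a routing object resolved to the model the table lists."""
    return routing["resolved"] == expected_model(routing["task_type"])
```

A mismatch is not necessarily an error: features such as budget_cents or min_quality can change the model that is finally used.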

Billing

Auto variants are billed at the resolved model’s rate, not a flat fee. For example, a request that auto routes to o3-mini costs $0.006, while one routed to claude-sonnet-4.6 costs $0.015. When the routing field is included, its resolved value is the model you are billed for.
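
Since the price depends on where each request lands, it can help to tally spend per resolved model. The sketch below assumes every response carries the top-level model field and a cost.this_request string in the "$0.006" form shown in the sample response. The function name tally_costs is hypothetical.

```python
from collections import defaultdict
from decimal import Decimal


def tally_costs(responses):
    """Sum cost.this_request per resolved model across parsed responses."""
    totals = defaultdict(Decimal)  # Decimal() starts each total at 0
    for resp in responses:
        # Strip the leading dollar sign and keep exact decimal arithmetic.
        cost = Decimal(resp["cost"]["this_request"].lstrip("$"))
        totals[resp["model"]] += cost
    return dict(totals)
```

Decimal is used instead of float so that many small amounts add up without rounding drift.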

Parameters

The auto variants accept all standard Chat parameters. Two extra fields are relevant:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| include_routing | boolean | false | Include routing object in response showing resolved model and task type |
| budget_cents | number | | Override auto selection with a cost ceiling. See Budget Routing. |
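
A request body using these two fields could be assembled as follows. This is a sketch: the helper name build_chat_payload is hypothetical, and leaving budget_cents out of the body when no ceiling is wanted is an assumption, since the table gives no default for it. The exact semantics of budget_cents are covered under Budget Routing.

```python
def build_chat_payload(messages, model="auto", include_routing=False, budget_cents=None):
    """Assemble a chat request body with the optional routing fields."""
    payload = {"model": model, "messages": messages}
    if include_routing:
        payload["include_routing"] = True
    if budget_cents is not None:
        # Assumption: the field is simply omitted when no ceiling is wanted.
        payload["budget_cents"] = budget_cents
    return payload
```

The resulting dictionary can be sent as the JSON body of a POST to the chat endpoint.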

Code examples

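import os

import requests

# Send one chat request and let smart routing choose the model.
r = requests.post(
    "https://ninjachat.ai/api/v1/chat",
    headers={"Authorization": f"Bearer {os.environ['NINJACHAT_API_KEY']}"},
    json={
        "model": "auto",
        "messages": [{"role": "user", "content": "Write a merge sort in Python"}],
        "include_routing": True,  # ask for the routing object in the response
    },
    timeout=30,  # avoid hanging forever on a stalled connection
)
r.raise_for_status()  # raise on HTTP errors instead of parsing an error body
data = r.json()
print(data["choices"][0]["message"]["content"])
print("Routed to:", data["routing"]["resolved"])  # claude-sonnet-4.6
print("Task type:", data["routing"]["task_type"]) # code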

Combine with other features

Smart routing works with every other chat feature:
{
  "model": "auto",
  "messages": [...],
  "session_id": "user-123",
  "include_routing": true,
  "include_quality": true,
  "min_quality": 0.8
}
If quality falls below min_quality, the system auto-retries with a better model — on top of smart routing.