Request
curl -X POST https://ninjachat.ai/api/v1/chat \
  -H "Authorization: Bearer nj_sk_YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "auto",
    "messages": [{"role": "user", "content": "Solve: x² + 5x + 6 = 0"}],
    "include_routing": true
  }'
Response
{
  "model": "o3-mini",
  "choices": [{
    "message": {
      "role": "assistant",
      "content": "Factoring: x² + 5x + 6 = (x + 2)(x + 3) = 0\nSolutions: x = -2 and x = -3"
    },
    "finish_reason": "stop"
  }],
  "routing": {
    "requested": "auto",
    "resolved": "o3-mini",
    "task_type": "math",
    "reasoning": "Detected task type: math"
  },
  "cost": { "this_request": "$0.006" }
}
Add "include_routing": true to see which model was chosen and why.
The four variants
| Model ID | Optimizes for | Best when… |
| --- | --- | --- |
| auto | Quality + speed balance | You want the best model without thinking about it |
| auto-fast | Lowest latency | Real-time apps, chatbots, low-latency pipelines |
| auto-cheap | Lowest cost | High-volume jobs, batch processing, cost-sensitive apps |
| auto-quality | Highest quality | Critical decisions, best possible output |
How task detection works
NinjaChat analyzes your last 3 user messages to detect the task type:
| Task type | Detected keywords | auto routes to |
| --- | --- | --- |
| code | function, debug, implement, algorithm, TypeScript, SQL… | claude-sonnet-4.6 |
| math | equation, solve, calculate, integral, probability… | o3-mini |
| creative | write, story, poem, imagine, fiction, lyrics… | gemini-3.1-pro |
| analysis | analyze, compare, evaluate, research, summarize… | gpt-5 |
| quick | Short prompts under 80 chars, “what is”, “define”… | gemini-3-flash |
| general | Everything else | gpt-5 |
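The detection step could be approximated along the following lines. This is a minimal sketch and not NinjaChat's actual classifier: the helper name `detect_task_type`, the whole-word matching, the first-match-wins priority order, and the way the 80-character rule is applied are all assumptions. The keyword lists contain only the entries shown in the table, which are truncated there, so results can differ from the live router.

```python
import re

# Keywords copied from the table above; the real lists are longer (the table elides them).
TASK_KEYWORDS = {
    "code": ["function", "debug", "implement", "algorithm", "typescript", "sql"],
    "math": ["equation", "solve", "calculate", "integral", "probability"],
    "creative": ["write", "story", "poem", "imagine", "fiction", "lyrics"],
    "analysis": ["analyze", "compare", "evaluate", "research", "summarize"],
}
QUICK_PHRASES = ("what is", "define")
QUICK_MAX_CHARS = 80


def detect_task_type(messages):
    """Guess a task type from the last 3 user messages (hypothetical helper)."""
    user_texts = [m["content"] for m in messages if m["role"] == "user"][-3:]
    text = " ".join(user_texts).lower()

    # Assumption: categories are checked in dict order and the first match wins.
    for task, keywords in TASK_KEYWORDS.items():
        if any(re.search(rf"\b{re.escape(k)}\b", text) for k in keywords):
            return task

    # Assumption: the "quick" rule looks at the latest user message only.
    latest = user_texts[-1].lower() if user_texts else ""
    if latest and (len(latest) < QUICK_MAX_CHARS or latest.startswith(QUICK_PHRASES)):
        return "quick"
    return "general"
```

Calling `detect_task_type([{"role": "user", "content": "Solve: x² + 5x + 6 = 0"}])` returns `"math"` in this sketch, matching the `task_type` in the sample response.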
Full routing table
auto (balanced)

| Task | Model |
| --- | --- |
| code | claude-sonnet-4.6 |
| math | o3-mini |
| creative | gemini-3.1-pro |
| analysis | gpt-5 |
| quick | gemini-3-flash |
| general | gpt-5 |

auto-fast

| Task | Model |
| --- | --- |
| code | claude-haiku-4.5 |
| math | gpt-5-mini |
| creative | gemini-3-flash |
| analysis | gpt-5-mini |
| quick | gpt-5-mini |
| general | gpt-5-mini |

auto-cheap

| Task | Model |
| --- | --- |
| code | deepseek-v3 |
| math | qwq-32b |
| creative | gemini-3-flash |
| analysis | deepseek-v3 |
| quick | gemini-2.5-flash |
| general | llama-4-maverick |

auto-quality

| Task | Model |
| --- | --- |
| code | claude-opus-4.6 |
| math | o3-mini |
| creative | gemini-3.1-pro |
| analysis | claude-opus-4.6 |
| quick | claude-opus-4.6 |
| general | claude-opus-4.6 |
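For client-side planning (for example, predicting which model a request is likely to hit), the routing table can be mirrored as a plain lookup. This is only an illustrative sketch: `ROUTING` and `resolve_model` are hypothetical names, the fallback to the general row for unknown task types is an assumption, and the authoritative answer is always the `routing.resolved` field in the response.

```python
# Routing tables transcribed from this section: variant -> task type -> model.
ROUTING = {
    "auto": {
        "code": "claude-sonnet-4.6", "math": "o3-mini", "creative": "gemini-3.1-pro",
        "analysis": "gpt-5", "quick": "gemini-3-flash", "general": "gpt-5",
    },
    "auto-fast": {
        "code": "claude-haiku-4.5", "math": "gpt-5-mini", "creative": "gemini-3-flash",
        "analysis": "gpt-5-mini", "quick": "gpt-5-mini", "general": "gpt-5-mini",
    },
    "auto-cheap": {
        "code": "deepseek-v3", "math": "qwq-32b", "creative": "gemini-3-flash",
        "analysis": "deepseek-v3", "quick": "gemini-2.5-flash", "general": "llama-4-maverick",
    },
    "auto-quality": {
        "code": "claude-opus-4.6", "math": "o3-mini", "creative": "gemini-3.1-pro",
        "analysis": "claude-opus-4.6", "quick": "claude-opus-4.6", "general": "claude-opus-4.6",
    },
}


def resolve_model(variant, task_type):
    """Return the model the table lists for a variant and task type."""
    table = ROUTING[variant]  # raises KeyError for an unknown variant
    # Assumption: task types not in the table fall back to the "general" row.
    return table.get(task_type, table["general"])
```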
Billing
Auto variants are billed at the resolved model’s rate. If auto routes to o3-mini, you pay $0.006. If it routes to claude-sonnet-4.6, you pay $0.015. The routing field always shows the cost-incurring model.
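Because the price depends on the resolved model, it can be useful to track spend from the responses themselves. The sketch below assumes that `cost.this_request` is always a dollar string such as "$0.006", as in the sample response; the helper names `request_cost` and `total_cost` are hypothetical.

```python
from decimal import Decimal


def request_cost(response):
    """Parse cost.this_request (e.g. "$0.006") into a Decimal dollar amount."""
    # Assumption: the value is a "$"-prefixed decimal string, as in the sample response.
    return Decimal(response["cost"]["this_request"].lstrip("$"))


def total_cost(responses):
    """Sum the per-request cost over an iterable of parsed JSON responses."""
    return sum((request_cost(r) for r in responses), Decimal("0"))
```

Decimal is used instead of float so that small per-request amounts add up without rounding drift.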
Parameters
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| include_routing | boolean | false | Include routing object in response. |
| budget_cents | number | none | Override with a cost ceiling. See Budget Routing. |
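A request body using these parameters might be assembled as follows. This is a sketch under assumptions: `build_payload` is a hypothetical helper, and the exact semantics of `budget_cents` (its unit and how the ceiling is enforced) are described on the Budget Routing page rather than here, so the value is passed through untouched.

```python
def build_payload(prompt, model="auto", include_routing=False, budget_cents=None):
    """Build a chat request body, adding optional parameters only when set."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    if include_routing:
        payload["include_routing"] = True
    if budget_cents is not None:
        # Passed through as-is; see Budget Routing for how the ceiling is applied.
        payload["budget_cents"] = budget_cents
    return payload
```

The resulting dictionary can be sent as the JSON body of a POST to https://ninjachat.ai/api/v1/chat.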
Code examples
Python
import os
import requests

# Send a chat request and ask for routing details in the response.
r = requests.post(
    "https://ninjachat.ai/api/v1/chat",
    headers={"Authorization": f"Bearer {os.environ['NINJACHAT_API_KEY']}"},
    json={
        "model": "auto",
        "messages": [{"role": "user", "content": "Write a merge sort in Python"}],
        "include_routing": True,
    },
)
r.raise_for_status()  # stop on HTTP errors before parsing the body
data = r.json()
print(data["choices"][0]["message"]["content"])
print("Routed to:", data["routing"]["resolved"])  # claude-sonnet-4.6
print("Task type:", data["routing"]["task_type"])  # code
Manual model selection
If you prefer explicit control over which model runs, here’s a quick reference:
models = {
    "classify": "deepseek-v3",    # $0.003, simple tasks
    "chat": "gpt-5",              # $0.006, general use
    "code": "claude-sonnet-4.6",  # $0.015, when quality matters
    "fast": "gemini-2.5-flash",   # $0.003, lowest latency
}
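One way to combine manual selection with auto routing is to pin a model only for use cases you have mapped and let everything else go through auto. This is a sketch of that design choice, not a documented pattern: `pick_model` and the fallback to "auto" are assumptions, and the mapping repeats the quick reference so the block runs on its own.

```python
# Same mapping as the quick reference above.
MODELS = {
    "classify": "deepseek-v3",
    "chat": "gpt-5",
    "code": "claude-sonnet-4.6",
    "fast": "gemini-2.5-flash",
}


def pick_model(use_case, default="auto"):
    """Return the pinned model for a use case, or fall back to auto routing."""
    return MODELS.get(use_case, default)
```

The returned string goes into the `model` field of the request, so `pick_model("code")` pins claude-sonnet-4.6 while an unmapped use case sends "auto".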
See Models for the full list with pricing and recommendations.