Limits

Endpoint                    Rate limit
POST /api/v1/chat           60 req/min
POST /api/v1/search         60 req/min
POST /api/v1/images         30 req/min
POST /api/v1/video          2 req/min
GET /api/v1/video/status    Unlimited (free)

Response headers

Every response includes:
Header                   What it tells you
X-RateLimit-Limit        Max requests in current window
X-RateLimit-Remaining    How many you have left
X-RateLimit-Reset        Seconds until window resets
Retry-After              Seconds to wait (only on 429s)
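
To illustrate how these headers might be read on the client side, here is a minimal sketch. The names RateLimitInfo and parse_rate_limit are hypothetical helpers, not part of the API, and the sketch assumes every header value is a whole number of requests or seconds, as the table describes.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RateLimitInfo:
    limit: Optional[int]          # X-RateLimit-Limit
    remaining: Optional[int]      # X-RateLimit-Remaining
    reset_seconds: Optional[int]  # X-RateLimit-Reset
    retry_after: Optional[int]    # Retry-After (only present on 429s)


def _to_int(value: Optional[str]) -> Optional[int]:
    """Return value as an int, or None if it is missing or not a whole number."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return None


def parse_rate_limit(headers: Mapping[str, str]) -> RateLimitInfo:
    """Collect the rate limit headers from a response into one object.

    Hypothetical helper. Lookups are case-sensitive for a plain dict;
    the headers object on a requests response is case-insensitive.
    """
    return RateLimitInfo(
        limit=_to_int(headers.get("X-RateLimit-Limit")),
        remaining=_to_int(headers.get("X-RateLimit-Remaining")),
        reset_seconds=_to_int(headers.get("X-RateLimit-Reset")),
        retry_after=_to_int(headers.get("Retry-After")),
    )
```

With the requests library, this could be called as parse_rate_limit(r.headers) after any response. A client might then pause for reset_seconds once remaining reaches 0 instead of waiting for a 429.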

When you hit a limit

You get a 429 response:
{ "error": "rate_limit_exceeded", "message": "Rate limit exceeded" }

Retry code

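import time
import requests

def call_with_retry(url, headers, json, max_retries=3):
    """POST to url, retrying on 429. Returns the parsed JSON body on success."""
    for attempt in range(max_retries):
        r = requests.post(url, headers=headers, json=json)
        if r.status_code == 429:
            # Prefer the server's Retry-After (seconds); if it is missing
            # or not a whole number, fall back to exponential backoff.
            try:
                wait = int(r.headers["Retry-After"])
            except (KeyError, ValueError):
                wait = 2 ** attempt
            # No point sleeping after the final attempt.
            if attempt < max_retries - 1:
                time.sleep(wait)
            continue
        # Raise on any other 4xx/5xx; any other status is treated as success.
        r.raise_for_status()
        return r.json()
    raise RuntimeError(f"Still rate limited after {max_retries} attempts")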

Tips

  • Space out batch requests: add a small delay between calls instead of firing them all at once
  • Video polling doesn’t count toward any limit: GET /api/v1/video/status is unlimited, so poll as often as you want (every 10s recommended)
  • Images have a lower limit (30/min) because they’re more expensive to generate
  • Video is 2/min because each generation uses significant compute
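
The first tip, spacing out batch requests, can be sketched as a small pacing helper. This is only an illustration: min_interval and paced are hypothetical names, and the sketch assumes that spreading calls evenly across the minute (for example one call per second at 60 req/min) keeps you under the limit, which depends on how the server defines its window.

```python
import time
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def min_interval(requests_per_minute: int) -> float:
    """Seconds between call starts needed to stay at or below the limit."""
    return 60.0 / requests_per_minute


def paced(
    items: Iterable[T],
    call: Callable[[T], R],
    requests_per_minute: int,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Iterator[R]:
    """Yield call(item) for each item, starting calls at least one interval apart.

    sleep and clock are injectable so the pacing can be tested without waiting.
    """
    interval = min_interval(requests_per_minute)
    last_start = None
    for item in items:
        if last_start is not None:
            # Wait out whatever is left of the interval since the previous call began.
            remaining = interval - (clock() - last_start)
            if remaining > 0:
                sleep(remaining)
        last_start = clock()
        yield call(item)
```

For image generation at 30 req/min, list(paced(prompts, generate_image, 30)) would start one call every 2 seconds, where prompts and generate_image stand in for your own data and request function. Pacing like this complements the retry code rather than replacing it, since other clients sharing the same key can still trigger a 429.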