Errors
HTTP error codes returned by the OwnLLM API and how to handle each.
OwnLLM follows the OpenAI error envelope so any client that handles OpenAI errors handles ours.

    {
      "error": {
        "type": "<type>",
        "code": "<machine-readable-code>",
        "message": "<human-readable explanation>"
      }
    }

The type field is one of invalid_request_error, authentication_error,
permission_error, rate_limit_exceeded, agent_unavailable,
internal_error. The code field is a more specific machine-readable string,
documented below.
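
As a rough sketch, the envelope could be modelled in TypeScript as shown here. The names `OwnLLMErrorType`, `OwnLLMErrorEnvelope`, and `isErrorEnvelope` are illustrative assumptions, not part of any published OwnLLM SDK.

```typescript
// Hypothetical types for the error envelope; all names are illustrative only.
const ERROR_TYPES = [
  "invalid_request_error",
  "authentication_error",
  "permission_error",
  "rate_limit_exceeded",
  "agent_unavailable",
  "internal_error",
] as const;

type OwnLLMErrorType = (typeof ERROR_TYPES)[number];

interface OwnLLMErrorEnvelope {
  error: {
    type: OwnLLMErrorType;
    code: string;
    message: string;
  };
}

// Narrow an unknown parsed JSON body to the envelope shape.
function isErrorEnvelope(body: unknown): body is OwnLLMErrorEnvelope {
  if (typeof body !== "object" || body === null) return false;
  const err = (body as { error?: unknown }).error;
  if (typeof err !== "object" || err === null) return false;
  const { type, code, message } = err as Record<string, unknown>;
  return (
    typeof type === "string" &&
    (ERROR_TYPES as readonly string[]).indexOf(type) !== -1 &&
    typeof code === "string" &&
    typeof message === "string"
  );
}
```

A guard like this is mainly useful when reading raw HTTP responses without an SDK, where the body arrives as untyped JSON.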
400 — invalid_request_error
| Code | When |
|---|---|
| model_required | Request is missing model. |
| messages_required | Request is missing messages or it's empty. |
| model_does_not_support_tools | Request includes tools but the model's capabilities.tools is false. Switch to a tool-capable model. |
| model_does_not_support_vision | Request has image parts but the model isn't vision-capable. |
| context_length_exceeded | Request token count exceeds the model's context_window. Trim the messages. |
| invalid_response_format | response_format.type isn't json_object or json_schema. |
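
Some of these conditions can be caught before a request leaves the client. The sketch below is a hypothetical pre-flight check that follows the table literally. `ChatRequest` and `preflightCode` are made-up names, the order of the checks is an assumption, and the server remains the source of truth.

```typescript
// Hypothetical client-side pre-flight check; not part of the OwnLLM API.
interface ChatRequest {
  model?: string;
  messages?: unknown[];
  response_format?: { type: string };
}

// Returns the code the table above suggests the API would send,
// or null when none of the locally checkable conditions apply.
function preflightCode(req: ChatRequest): string | null {
  if (!req.model) return "model_required";
  if (!req.messages || req.messages.length === 0) return "messages_required";
  const fmt = req.response_format ? req.response_format.type : undefined;
  if (fmt !== undefined && fmt !== "json_object" && fmt !== "json_schema") {
    return "invalid_response_format";
  }
  return null;
}
```

Capability and context-length checks are left out because they depend on model metadata that only the server is known to have.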
401 — authentication_error
| Code | When |
|---|---|
| key_invalid | The API key doesn't exist or has been revoked. |
| key_expired | The API key's TTL elapsed. Generate a new one. |
| key_malformed | The Authorization header doesn't match Bearer sk-ownllm-.... |
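
A client can avoid key_malformed by checking the key prefix before sending anything. This is a minimal sketch: `authorizationHeader` is a hypothetical helper, and it checks only the `sk-ownllm-` prefix because the rest of the key format is not specified here.

```typescript
const KEY_PREFIX = "sk-ownllm-";

// Build the Authorization header value, failing fast when the key
// lacks the documented prefix or has nothing after it.
function authorizationHeader(apiKey: string): string {
  const key = apiKey.trim();
  const hasPrefix = key.slice(0, KEY_PREFIX.length) === KEY_PREFIX;
  if (!hasPrefix || key.length === KEY_PREFIX.length) {
    throw new Error("API key must start with " + KEY_PREFIX);
  }
  return "Bearer " + key;
}
```

Failing locally gives a clearer error than a 401 round trip, but it cannot detect a revoked or expired key.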
403 — permission_error
| Code | When |
|---|---|
| scope_required | The key isn't scoped for the requested model. Edit the key or pick another model. |
| tenant_disabled | The tenant is suspended (billing failure, manual hold). |
| user_deactivated | The user owning the key was deactivated (manual or SCIM). |
429 — rate_limit_exceeded
| Code | When |
|---|---|
| budget_exceeded | The key's monthly budget cap is hit. Bump the budget or wait until next month. |
| concurrency_limit | Too many in-flight requests for this key. Back off. |
| tenant_qps_limit | The tenant's overall QPS limit is hit (rare; only enforced at heavy abuse). |
The standard Retry-After header is set on 429 responses.
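
The standard Retry-After header may carry either a number of seconds or an HTTP date. Which form OwnLLM sends is not stated here, so this sketch handles both. `retryAfterMs` is a hypothetical helper name.

```typescript
// Convert a Retry-After header value into a delay in milliseconds.
// Accepts delta-seconds ("120") or an HTTP-date; returns null when the
// value is missing or cannot be parsed.
function retryAfterMs(
  value: string | null | undefined,
  now: number = Date.now()
): number | null {
  if (!value) return null;
  const trimmed = value.trim();
  if (/^\d+$/.test(trimmed)) return parseInt(trimmed, 10) * 1000;
  const when = Date.parse(trimmed);
  if (isNaN(when)) return null;
  // A date in the past means the caller may retry immediately.
  return Math.max(0, when - now);
}
```

A caller would typically sleep for the returned delay before retrying, and fall back to its own backoff policy when the result is null.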
503 — agent_unavailable
| Code | When |
|---|---|
| agent_offline | The paired GPU machine isn't reachable. The user should check Atlas / ownllm status. |
| tunnel_down | The Cloudflare Tunnel is down even though the agent is reachable. Same fix. |
| model_loading | The model isn't ready yet. Retry after a few seconds. |
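
One way to retry 503 responses is exponential backoff with jitter. This is a sketch under assumptions: `backoffDelayMs` and `retryOn503` are hypothetical helpers, and the base delay, cap, and attempt count are illustrative defaults, not values recommended by OwnLLM.

```typescript
// Exponential backoff with full jitter: the delay is drawn uniformly
// from [0, min(capMs, baseMs * 2^attempt)).
function backoffDelayMs(
  attempt: number,
  baseMs: number = 500,
  capMs: number = 30000,
  random: () => number = Math.random
): number {
  const ceiling = Math.min(capMs, baseMs * Math.pow(2, attempt));
  return Math.floor(random() * ceiling);
}

// Call fn, retrying only when it throws an error whose status is 503,
// for at most maxAttempts calls in total.
async function retryOn503<T>(
  fn: () => Promise<T>,
  maxAttempts: number = 5,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms))
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      const status = (err as { status?: number }).status;
      if (status !== 503 || attempt + 1 >= maxAttempts) throw err;
      await sleep(backoffDelayMs(attempt));
    }
  }
}
```

Retrying helps most with model_loading. For agent_offline and tunnel_down, a capped number of attempts avoids retrying indefinitely against a machine that needs manual attention.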
500 — internal_error
| Code | When |
|---|---|
| internal_error | Generic 500. Sentry has the details on our side. Retry; if it persists, open an issue. |
Handling errors in code

    try {
      // Request parameters are elided.
      const response = await client.chat.completions.create({ ... });
    } catch (err) {
      if (err.status === 400 && err.code === "model_does_not_support_tools") {
        // fall back to a tool-capable model
      } else if (err.status === 503) {
        // agent unavailable (offline, tunnel down, or model loading);
        // retry with exponential backoff
      } else {
        throw err;
      }
    }

The OpenAI SDK exposes err.status (HTTP code), err.code (our
machine code), and err.message.
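
The status and code tables on this page can be folded into a single decision function. The sketch below is one possible mapping, not an official policy: `Action`, `ApiErrorLike`, and `nextAction` are hypothetical names, and the choice of action per status is an assumption drawn from the "When" columns.

```typescript
type Action =
  | "fix_request"
  | "fix_access"
  | "wait_retry_after"
  | "retry_with_backoff"
  | "give_up";

// Minimal shape of the thrown error: HTTP status, machine code, message.
interface ApiErrorLike {
  status?: number;
  code?: string | null;
  message?: string;
}

function nextAction(err: ApiErrorLike): Action {
  switch (err.status) {
    case 400:
      return "fix_request";
    case 401:
    case 403:
      return "fix_access";
    case 429:
      // A spent monthly budget does not recover by retrying within the month.
      return err.code === "budget_exceeded" ? "give_up" : "wait_retry_after";
    case 500:
    case 503:
      return "retry_with_backoff";
    default:
      return "give_up";
  }
}
```

Keeping the decision separate from the retry mechanics makes it straightforward to unit-test without a network call.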
Why custom codes?
Plain OpenAI codes weren't enough to distinguish OwnLLM-specific
failures (an offline agent, a key scope error, a missing model capability).
The code field is additive, so clients that only look at the
HTTP status keep working.