Claude Code

Claude Code accepts an OpenAI-compatible base URL via the OPENAI_BASE_URL and OPENAI_API_KEY environment variables. Set those to your OwnLLM endpoint and you're done.

Setup

export OPENAI_BASE_URL=https://acme-prod.ownllm.app/v1
export OPENAI_API_KEY=sk-ownllm-...

Add those exports to your shell profile (~/.zshrc or ~/.bashrc; for fish, ~/.config/fish/config.fish) so they persist across sessions.

To verify:

claude-code --version
echo "ping" | claude-code chat --model llama-3.3-70b
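If the verification fails, the cause is often a missing or mistyped variable rather than the endpoint itself. The following is a minimal sketch of a pre-flight check. `check_ownllm_env` is a hypothetical helper name, not part of any CLI, and the `/v1` suffix test is an assumption based on the example URL above.

```shell
# Hypothetical helper: confirm both variables are set before launching the client.
check_ownllm_env() {
  if [ -z "${OPENAI_BASE_URL:-}" ]; then
    echo "OPENAI_BASE_URL is not set" >&2
    return 1
  fi
  if [ -z "${OPENAI_API_KEY:-}" ]; then
    echo "OPENAI_API_KEY is not set" >&2
    return 1
  fi
  # Assumption: the base URL ends in /v1, as in the example above.
  case "$OPENAI_BASE_URL" in
    */v1) ;;
    *) echo "OPENAI_BASE_URL does not end in /v1" >&2; return 1 ;;
  esac
  echo "ok"
}
```

Running `check_ownllm_env` in the shell you launch the client from prints `ok` when both variables are present and the URL has the expected suffix; otherwise it reports the first problem on stderr and returns a non-zero status.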

Recommended models

Claude Code's planner relies heavily on tool calls, so pick a tool-capable model:

qwen3:32b — strong reasoning, supports tools, thinking: low recommended for the planner.
llama-3.3-70b — heavier but more capable for complex changes. Needs ~50 GB VRAM/RAM.
deepseek-r1:32b: code-heavy. Make sure it is installed with tools enabled, if your version supports them.

Per-project config

Claude Code reads .claude/config.json per project. To pin a model per project without polluting global env:

{
  "model": "qwen3:32b",
  "providers": {
    "openai": {
      "baseUrl": "https://acme-prod.ownllm.app/v1",
      "apiKey": "${OWNLLM_API_KEY}"
    }
  }
}

The ${OWNLLM_API_KEY} form keeps the secret out of the file.
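Because the point of the placeholder is to keep the key out of version control, it can help to check the file before committing. The sketch below assumes keys start with the `sk-ownllm-` prefix shown in the Setup example. `has_literal_key` is a hypothetical helper name.

```shell
# Hypothetical helper: succeeds (exit 0) if the file contains a literal key.
# Assumption: keys use the sk-ownllm- prefix from the Setup example.
has_literal_key() {
  # $1: path to the config file to scan
  grep -q 'sk-ownllm-' "$1"
}

# Example use from the project root:
#   if has_literal_key .claude/config.json; then
#     echo "literal API key found in .claude/config.json" >&2
#   fi
```

The same check could be wired into a pre-commit hook, provided OWNLLM_API_KEY is exported in the environment the client runs in so the placeholder has a value to resolve to.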

Limitations

Claude Code supports OpenAI tool calling. OwnLLM forwards it faithfully when the model has tools: true. If you see model_does_not_support_tools, switch models.
Some Claude Code features are tuned for Anthropic's prompt-cache pricing. Those features still work — you just don't get the cache discount on OwnLLM.

Troubleshooting

If Claude Code hangs on a long completion, the gateway may be queueing requests. Run ownllm status: if it reports a queue of 2 or more, raise num_parallel for that model with ownllm models config.
