
models

List, install, remove, and configure models on the local agent.

ownllm models list [--json]
ownllm models install <id> [--json]
ownllm models remove <id>
ownllm models config <id>

These commands talk to the local Ollama HTTP API directly (127.0.0.1:11434), so they work whether Ollama is supervised by the OwnLLM agent or runs as a host service (Ollama.app, apt-installed, etc.).
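As a rough illustration of what talking to that endpoint involves, the sketch below queries Ollama's GET /api/tags route, which lists locally installed models. The helper names summarize_tags and fetch_installed are hypothetical, and this is not OwnLLM's actual implementation; only the address comes from the text above.

```python
import json
import urllib.request

OLLAMA_URL = "http://127.0.0.1:11434"  # address given in the docs above


def summarize_tags(payload: dict) -> list[tuple[str, int]]:
    """Return (name, size_in_bytes) pairs from an /api/tags response body."""
    return [(m["name"], m.get("size", 0)) for m in payload.get("models", [])]


def fetch_installed(base_url: str = OLLAMA_URL, timeout: float = 5.0) -> list[tuple[str, int]]:
    """Ask the local Ollama server which models are installed.

    Needs a running Ollama server; raises urllib.error.URLError otherwise.
    """
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        return summarize_tags(json.load(resp))
```

Because the request goes straight to the loopback port, the same call works no matter which process started Ollama.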

list

ownllm models list
ownllm models list --json

Prints installed models with size, status, and per-model options (keep_alive, thinking, num_parallel).

NAME              STATUS  SIZE   KEEP_ALIVE  THINKING
llama3.3:70b      ready   40 GB  5m          default
qwen2.5-coder:32b ready   19 GB  0           disabled

install

ownllm models install llama3.3:70b
ownllm models install qwen2.5-coder:32b --json

Pulls the model via ollama pull. By default the CLI shows a single progress bar that tracks the combined byte count across all layers; --json streams the raw Ollama pull events for scripting.
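The merged progress figure can be approximated from the raw event stream. The sketch below assumes the events are newline-delimited JSON objects and that layer events carry digest, total, and completed fields, as Ollama's pull stream typically does; merged_progress is a hypothetical helper, not part of the CLI.

```python
import json


def merged_progress(event_lines):
    """Yield (completed_bytes, total_bytes) after each pull event.

    Assumption: each line is one JSON object. Events without a
    'digest' and 'total' are treated as status-only and leave the
    running totals unchanged.
    """
    layers = {}  # digest -> (completed, total) for every layer seen so far
    for line in event_lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        digest = event.get("digest")
        if digest and "total" in event:
            # 'completed' may be absent on a layer's first event
            layers[digest] = (event.get("completed", 0), event["total"])
        done = sum(completed for completed, _ in layers.values())
        total = sum(size for _, size in layers.values())
        yield done, total
```

In a script, the lines could come from the standard output of ownllm models install <id> --json. The total grows as new layers are announced, so a percentage computed from it can briefly move backwards.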

The site catalog of recipes (see recipes) defines a curated list of models with hardware requirements and default options. You can also install any Ollama tag by passing its name directly.

remove

ownllm models remove llama3.3:70b

Calls ollama rm and cleans up the OwnLLM-side recipe metadata if the model was installed via a recipe.

config

ownllm models config llama3.3:70b

Opens an interactive prompt to set per-model options:

  • keep_alive: 0 (unload after the request), 5m, 30m, -1 (pin in VRAM). Default 5m.
  • thinking: default, disabled, low, medium, high. Only applies to thinking-capable models (Qwen3, DeepSeek-R1).
  • num_parallel: concurrent requests this model accepts before queuing.

Settings are stored in ~/.config/ownllm/models.toml and applied on every request via Ollama's options API.
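One way to picture how stored settings could reach Ollama is as fields on the request body. The sketch below is an assumption about the mapping, not the agent's real code: build_chat_body and the settings dict shape are hypothetical, keep_alive and think are the request fields Ollama's chat API accepts, and num_parallel is left out on the assumption that queuing is handled before the request is forwarded.

```python
import re


def ollama_keep_alive(value: str):
    """Convert a stored keep_alive string into a value Ollama accepts.

    Bare integers such as "0" or "-1" become ints (seconds, or
    "keep loaded" when negative); durations like "5m" stay strings.
    """
    return int(value) if re.fullmatch(r"-?\d+", value) else value


def build_chat_body(model: str, messages: list, settings: dict) -> dict:
    """Merge per-model settings into a chat request body (hypothetical mapping)."""
    body = {"model": model, "messages": messages}
    if "keep_alive" in settings:
        body["keep_alive"] = ollama_keep_alive(settings["keep_alive"])
    thinking = settings.get("thinking", "default")
    if thinking == "disabled":
        body["think"] = False
    elif thinking in ("low", "medium", "high"):
        body["think"] = thinking
    # "default" omits the field so the model's own behaviour applies
    return body
```

Converting "0" and "-1" to integers matters because a bare number without a unit is not a valid duration string.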

Available capabilities

/v1/models (the public API) annotates each model with its Ollama capabilities: completion, vision, embedding, tools, thinking. If a request includes tools / tool_choice and the model doesn't have the tools capability, the gateway returns model_does_not_support_tools (HTTP 400) rather than letting the raw Ollama error bubble up.
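The capability check described above might look roughly like the following. This is a minimal sketch: check_tools_support is a hypothetical name, and the shape of the error body is an assumption; only the error code and the HTTP 400 status come from the text.

```python
def check_tools_support(request: dict, capabilities: set):
    """Return None if the request may proceed, else (http_status, error_body)."""
    # A request "uses tools" if it lists any tools or sets tool_choice
    wants_tools = bool(request.get("tools")) or "tool_choice" in request
    if wants_tools and "tools" not in capabilities:
        # Error body layout is assumed for illustration
        return 400, {"error": {"code": "model_does_not_support_tools"}}
    return None
```

Rejecting the request before it reaches Ollama gives clients a stable error code to match on, instead of a backend-specific message.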
