External app access

OpenAI-compatible API

Tater can expose a local OpenAI-compatible chat API so other apps can use the active Tater model directly or route requests through Hydra and Verbas.

Configuration

Enable and authenticate

The API is off until you enable it. Every request must carry the configured key in one of three ways: as a bearer token in the Authorization header, in an X-API-Key header, or as an api_key query parameter.

  • Authorization: Bearer YOUR_KEY is the recommended header.
  • X-API-Key: YOUR_KEY is also accepted.
  • api_key=YOUR_KEY as a URL query parameter is also accepted.
  • If the API is disabled, Tater returns 404. If no key is configured, it returns 403. Invalid keys return 401.
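The header options and status codes above can be sketched from the client side as follows. This is a minimal illustration, not Tater code: the helper names `auth_headers` and `explain_auth_status` are hypothetical, and the hint strings simply restate the rules listed above.

```python
from typing import Dict


def auth_headers(api_key: str, use_x_api_key: bool = False) -> Dict[str, str]:
    """Build the header that carries the Tater API key.

    Defaults to the recommended Authorization: Bearer form; set
    use_x_api_key=True to send the X-API-Key header instead.
    """
    if not api_key:
        raise ValueError("api_key must be a non-empty string")
    if use_x_api_key:
        return {"X-API-Key": api_key}
    return {"Authorization": f"Bearer {api_key}"}


# Status codes described in the list above, mapped to their likely cause.
AUTH_STATUS_HINTS = {
    404: "the API is disabled in Tater",
    403: "no API key is configured in Tater",
    401: "the supplied API key is invalid",
}


def explain_auth_status(status: int) -> str:
    """Translate a status code into a hint for the person debugging a client."""
    return AUTH_STATUS_HINTS.get(status, "not one of the documented auth statuses")
```

A client could call `explain_auth_status(response_status)` when a request fails, to tell a disabled API apart from a bad key.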
Modes

Direct or Hydra

The API mode controls whether external calls use the active Base model directly or go through Hydra orchestration.

  • Direct sends the normalized chat messages to the active configured LLM client.
  • Hydra mode hands the latest user message and the conversation history to Hydra orchestration, optionally with Verba tool access enabled.
  • The request model aliases tater/base and tater/direct force Direct mode. tater/hydra forces Hydra mode.
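The alias rules above amount to a small lookup. The sketch below is an illustration only, not Tater's implementation: the function name `resolve_mode` and the mode strings "direct" and "hydra" are hypothetical, and it assumes that any other model value falls back to the mode configured in the API settings.

```python
from typing import Optional

# Aliases named in the documentation above.
DIRECT_ALIASES = {"tater/base", "tater/direct"}
HYDRA_ALIASES = {"tater/hydra"}


def resolve_mode(requested_model: Optional[str], configured_mode: str) -> str:
    """Pick the mode for one request.

    A forcing alias wins; anything else (including a missing model field)
    is assumed to fall back to the configured API mode.
    """
    model = (requested_model or "").strip()
    if model in DIRECT_ALIASES:
        return "direct"
    if model in HYDRA_ALIASES:
        return "hydra"
    return configured_mode
```

For example, `resolve_mode("tater/hydra", "direct")` selects Hydra even though the configured mode is Direct.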
Endpoints

The API follows the OpenAI chat completions request and response shape.

GET /v1/models

List available aliases

Returns tater/base, tater/direct, tater/hydra, and configured local/remote model rows for discovery.

POST /v1/chat/completions

Run a chat completion

Accepts OpenAI-style messages and returns an OpenAI-style response with choices, message.content, and usage fields when available.
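A client-side sketch of this endpoint, using only the Python standard library, might look like the following. The helper names `build_chat_request` and `extract_reply` are hypothetical, and the response layout is assumed to match the OpenAI shape described above (`choices[0].message.content`, with `usage` possibly absent).

```python
import json
import urllib.request
from typing import Any, Dict, List, Optional, Tuple


def build_chat_request(
    base_url: str,
    api_key: str,
    model: str,
    messages: List[Dict[str, Any]],
    **options: Any,
) -> urllib.request.Request:
    """Prepare (but do not send) a POST to /v1/chat/completions."""
    payload = {"model": model, "messages": messages, **options}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def extract_reply(body: Dict[str, Any]) -> Tuple[str, Optional[Dict[str, Any]]]:
    """Return (assistant text, usage); usage is None when the server omits it."""
    choices = body.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    content = (choices[0].get("message") or {}).get("content") or ""
    return content, body.get("usage")
```

To send the request, pass the result to `urllib.request.urlopen(req)`, decode the body with `json.load`, and hand the resulting dictionary to `extract_reply`.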

Model routing

API calls use Tater's configured model layer.

The API advertises provider-qualified model IDs for visibility, but chat completions currently route through the active configured LLM provider selected in Tater settings.

Requested model | Effect | Runtime used
tater/base or tater/direct | Forces direct chat mode. | The active Base provider in Settings -> Models.
tater/hydra | Forces Hydra mode. | The active configured LLM client behind Hydra.
mlx_lm::repo/model | Listed for discovery of configured MLX rows. | Current chat route still uses the active configured provider; MLX rows run through MLX Engine when selected as active.
llama_cpp::repo::file.gguf | Listed for discovery of configured llama.cpp rows. | Current chat route still uses the active configured provider; llama.cpp rows run through llama.cpp when selected as active.
hf_transformers::repo/model | Listed for discovery of configured Transformers rows. | Current chat route still uses the active configured provider; Transformers rows run through Hugging Face Transformers when selected as active.

Request example

Direct chat

curl http://localhost:8501/v1/chat/completions \
  -H "Authorization: Bearer YOUR_TATER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "tater/base",
    "messages": [
      {"role": "system", "content": "You are concise."},
      {"role": "user", "content": "What is Tater?"}
    ],
    "temperature": 0.7,
    "max_tokens": 256
  }'

Request example

Hydra with tools

curl http://localhost:8501/v1/chat/completions \
  -H "Authorization: Bearer YOUR_TATER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "tater/hydra",
    "user": "external-app",
    "messages": [
      {"role": "user", "content": "Turn on the office lights."}
    ]
  }'

Compatibility notes

Useful details for external clients.

Streaming

Streaming shape

When stream is true, Tater returns a server-sent event stream with a single completion chunk followed by [DONE].
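A client can read this stream with a small parser. The sketch below is an assumption-laden illustration: the helper names are hypothetical, and it assumes the chunk follows the OpenAI streaming layout (`choices[].delta.content`), falling back to `choices[].message.content` if the chunk carries a full message instead.

```python
import json
from typing import Any, Dict, Iterable, Iterator


def iter_sse_chunks(lines: Iterable[str]) -> Iterator[Dict[str, Any]]:
    """Yield decoded JSON chunks from server-sent event lines until [DONE]."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            return  # end-of-stream sentinel
        yield json.loads(data)


def collect_text(lines: Iterable[str]) -> str:
    """Concatenate the text carried by every chunk in the stream."""
    parts = []
    for chunk in iter_sse_chunks(lines):
        for choice in chunk.get("choices", []):
            # Assumed field names; adjust if the real chunk differs.
            piece = choice.get("delta") or choice.get("message") or {}
            parts.append(piece.get("content") or "")
    return "".join(parts)
```

Because the stream carries a single completion chunk, `collect_text` returns the whole reply after one iteration, but the same loop also works for servers that send many chunks.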

Messages

Text content

Tater normalizes string message content and OpenAI-style text parts. Non-text multimodal parts are ignored by this endpoint.
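The normalization described above can be approximated as follows. This is a sketch, not Tater's code: the function name `normalize_content` is hypothetical, and joining multiple text parts with a newline is an assumption.

```python
from typing import Any


def normalize_content(content: Any) -> str:
    """Reduce an OpenAI-style message content value to plain text.

    Strings pass through unchanged. For a list of parts, only parts of
    type "text" are kept; image and other non-text parts are dropped.
    """
    if isinstance(content, str):
        return content
    if isinstance(content, list):
        texts = []
        for part in content:
            if isinstance(part, dict) and part.get("type") == "text":
                text = part.get("text")
                if isinstance(text, str):
                    texts.append(text)
        return "\n".join(texts)  # separator is an assumption
    return ""  # None or unsupported types carry no text
```

A message that mixes a text part with an image part therefore reaches the model as text only, so clients should not rely on this endpoint for image input.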

Tools

Hydra tool access

Hydra mode can expose enabled Verbas to external requests when the API setting allows Hydra tools.