POST /v1/chat/completions

The core routing endpoint.

Route a chat-completion request to the optimal LLM tier.

Request body

The request body is fully OpenAI-compatible. The fields HiWay specifically inspects:

| Field | Type | Meaning |
| --- | --- | --- |
| `model` | string | `auto` (smart routing), a tier (`light`, `standard`, `heavy`), or a pinned model (`claude-sonnet`, `gpt-4o-mini`, …) |
| `messages` | array | Standard chat-completions format; `system`, `user`, `assistant`, and `tool` roles. |
| `max_tokens` | int | Upper bound on output tokens. Directly affects the worst-case cost HiWay reserves before forwarding. |
| `stream` | bool | SSE streaming. Works with every supported provider. |
| `tools` | array | OpenAI tool-use format. Forwarded as-is to compatible providers. |
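As a minimal sketch of the fields above, the snippet below assembles a request body and prints it as JSON. The `build_chat_request` helper name is illustrative, not part of any HiWay SDK, and sending it (e.g. with an HTTP client against your HiWay base URL plus an auth header) is left out since those details are deployment-specific.

```python
import json

def build_chat_request(model="auto", messages=None, max_tokens=256, stream=False):
    """Assemble an OpenAI-compatible body for POST /v1/chat/completions.

    Only the fields HiWay inspects are set here; other OpenAI-compatible
    fields can be merged into the returned dict as-is.
    """
    return {
        "model": model,            # "auto", a tier, or a pinned model
        "messages": messages or [],
        "max_tokens": max_tokens,  # bounds the worst-case cost HiWay reserves
        "stream": stream,          # SSE streaming when True
    }

payload = build_chat_request(
    model="auto",
    messages=[{"role": "user", "content": "Summarize this in one line."}],
    max_tokens=128,
)
print(json.dumps(payload, indent=2))
```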

Response headers

  • X-HiWay-Model — provider model actually used.
  • X-HiWay-Tier — routing tier picked.
  • X-HiWay-Cost-USD — cost debited from your balance.
  • X-HiWay-Routing-Ms — time the router took to decide (always under 5 ms).
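A small sketch of consuming these headers from a response, using the header names listed above. The parsing helper and the sample values are hypothetical; `headers` stands in for whatever mapping your HTTP client exposes.

```python
def parse_hiway_headers(headers):
    """Extract the routing metadata HiWay attaches to each response.

    Normalizes header names to lowercase so any client's header
    mapping can be passed in directly.
    """
    h = {k.lower(): v for k, v in headers.items()}
    return {
        "model": h.get("x-hiway-model"),
        "tier": h.get("x-hiway-tier"),
        "cost_usd": float(h["x-hiway-cost-usd"]) if "x-hiway-cost-usd" in h else None,
        "routing_ms": float(h["x-hiway-routing-ms"]) if "x-hiway-routing-ms" in h else None,
    }

# Hypothetical example values:
meta = parse_hiway_headers({
    "X-HiWay-Model": "gpt-4o-mini",
    "X-HiWay-Tier": "light",
    "X-HiWay-Cost-USD": "0.00042",
    "X-HiWay-Routing-Ms": "1.8",
})
print(meta["tier"], meta["cost_usd"])
```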