Response envelope

OpenAI-compatible body + _hiway metadata + X-HiWay-Routed-* headers.

HiWay returns an OpenAI-compatible chat-completions body, enriched with a _hiway metadata object and two custom response headers. Your existing OpenAI-SDK client reads the body fine; the metadata is additive and never breaks standard parsing.

Body example

json
{
  "id":      "chatcmpl-...",
  "object":  "chat.completion",
  "model":   "anthropic/claude-haiku-4-5",
  "choices": [ { "message": { ... } } ],
  "usage":   { "prompt_tokens": 42, "completion_tokens": 18, "total_tokens": 60 },
  "_hiway": {
    "routed_model":    "anthropic/claude-haiku-4-5",
    "routed_tier":     "light",
    "score":           0.18,
    "reason":          "short prompt, no tools, temp=0",
    "fallback":        false,
    "cache_hit":       false,
    "upstream_cost":   0.0004,
    "baseline_cost":   0.0065,
    "savings":         0.0061
  }
}
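As a rough illustration of the additive design, the sketch below pulls the _hiway object out of a raw response body using only the Python standard library. The sample payload is a trimmed copy of the example above, and the helper name extract_hiway is hypothetical, not part of any HiWay SDK. It assumes you have the body as a JSON string; how your HTTP client or SDK exposes extra fields may differ.

```python
import json

# Trimmed sample body, reusing field names and values from the example above.
RAW_BODY = """
{
  "model": "anthropic/claude-haiku-4-5",
  "usage": {"prompt_tokens": 42, "completion_tokens": 18, "total_tokens": 60},
  "_hiway": {
    "routed_model": "anthropic/claude-haiku-4-5",
    "routed_tier": "light",
    "fallback": false,
    "cache_hit": false,
    "upstream_cost": 0.0004,
    "baseline_cost": 0.0065,
    "savings": 0.0061
  }
}
"""


def extract_hiway(raw_body: str) -> dict:
    """Return the _hiway metadata object, or an empty dict if it is absent.

    Hypothetical helper: treating the key as optional keeps the same code
    working against a plain OpenAI-compatible body with no metadata.
    """
    body = json.loads(raw_body)
    meta = body.get("_hiway")
    return meta if isinstance(meta, dict) else {}


meta = extract_hiway(RAW_BODY)
print(meta.get("routed_model"), meta.get("routed_tier"))
```

Reading the metadata with .get() rather than direct indexing means a missing field degrades to None instead of raising.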

Headers

Header                  Meaning
X-HiWay-Routed-Model    Fully-qualified model id that actually answered (e.g. anthropic/claude-haiku-4-5)
X-HiWay-Routed-Tier     Routing tier picked: light, standard, heavy, cache, override
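If you only need the routing decision and not the full body, the two headers can be read directly. The sketch below assumes your HTTP client hands you the response headers as a string-to-string mapping; the function name routed_headers is hypothetical. Header names are compared case-insensitively, since HTTP header names are not case-sensitive and some clients normalize them.

```python
from typing import Mapping, Optional


def routed_headers(headers: Mapping[str, str]) -> dict[str, Optional[str]]:
    """Pick the two X-HiWay-Routed-* values out of a response-header mapping.

    Hypothetical helper: returns None for a header that is not present.
    """
    # Normalize names to lowercase so lookup does not depend on client casing.
    lowered = {name.lower(): value for name, value in headers.items()}
    return {
        "model": lowered.get("x-hiway-routed-model"),
        "tier": lowered.get("x-hiway-routed-tier"),
    }


# Example input using the values documented in the table above.
sample = {
    "Content-Type": "application/json",
    "X-HiWay-Routed-Model": "anthropic/claude-haiku-4-5",
    "x-hiway-routed-tier": "light",
}
print(routed_headers(sample))
```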

Baseline = flagship model

baseline_cost is what the same request would have cost on the current flagship model (Claude Opus 4.7 at the time of writing). savings = baseline_cost - upstream_cost; in the body example above, 0.0065 - 0.0004 = 0.0061. Summed across your workspace, savings drives the figure shown in Dashboard → Usage.
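The arithmetic can be checked client-side. The following is a minimal sketch, assuming you have collected the _hiway objects from several responses; the function names are hypothetical, and the way the dashboard itself aggregates is not specified here. Decimal is used so that small per-request amounts sum without binary floating-point drift.

```python
from decimal import Decimal
from typing import Iterable, Mapping


def request_savings(baseline_cost: float, upstream_cost: float) -> Decimal:
    """savings = baseline_cost - upstream_cost, computed in exact decimals."""
    # Going through str() preserves the decimal digits as written in the JSON.
    return Decimal(str(baseline_cost)) - Decimal(str(upstream_cost))


def total_savings(metas: Iterable[Mapping[str, float]]) -> Decimal:
    """Sum per-request savings over a collection of _hiway objects."""
    return sum(
        (request_savings(m["baseline_cost"], m["upstream_cost"]) for m in metas),
        Decimal("0"),
    )


# Values taken from the body example; the second entry repeats the first
# purely to show the summation.
collected = [
    {"baseline_cost": 0.0065, "upstream_cost": 0.0004},
    {"baseline_cost": 0.0065, "upstream_cost": 0.0004},
]
print(request_savings(0.0065, 0.0004))
print(total_savings(collected))
```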