Glossary
HiWay-specific terms explained.
Routing
Tier - one of light, standard, heavy. Each tier maps to a set of provider models with similar price/quality.
Routing profile - budget, balanced, or quality. Shifts the selection strategy within a tier: budget picks the cheapest input price, quality picks the highest quality score.
Scoring - the deterministic 7-analyzer pipeline that produces a 0.0-1.0 complexity score for each request, used to pick a tier.
Pinning - bypassing the router by setting model to a fully-qualified model id. Useful for A/B testing or when you want deterministic model selection.
Fallback - automatic retry against the next cheapest same-tier model when an upstream call fails. Max 2 retries.
Billing
BYOK - Bring Your Own Key. You plug your own provider keys into HiWay; providers bill you directly at published rates. HiWay is not a reseller.
Markup - the percentage HiWay charges on your monthly API spend. On Scale: 12.5% (< $500/mo), 11% ($500-$5K), 10% ($5K-$20K). Enterprise: 9% ($20K-$50K), custom negotiated above $50K. Automatically adjusted each month to your actual volume tier.
Wallet - the workspace's USD credit balance with HiWay. Funded via Stripe top-up. Debits correspond to the markup on your monthly API consumption. Configurable alerts (20%, 10%, 0%).
Auto-reload - automatic wallet top-up (USD credits) when balance drops below a configurable threshold. Enabled in Dashboard → Billing.
Savings baseline - the flagship model used to compute the savings number. Currently Claude Opus 4.7. Updated when provider flagship lineups change.
Safety
Guardian - the anti-loop / anti-spike safety system. Five rules: dedup, cost spike, context bloat, zombie agent, rate limit.
Budget Control - monthly cap on upstream BYOK cost. Three verdicts: DOWNGRADE (90%), LIGHT_ONLY (95%), BLOCK (100%).
PII masking - regex-based masking of email / phone / card / IBAN / API keys before embedding, caching, and optionally before the provider call.
Performance
Semantic cache - Qdrant-backed cache that replays responses when a new request lands within cosine similarity ≥ 0.92 of a previous one. Included on Scale and Enterprise.
A/B Experiment - parallel benchmarking of 2-5 candidate models on a sample of your traffic. Included on Scale and Enterprise.