Enabling semantic cache
vector-store-backed cache - available on all packs.
Open Dashboard → Cache. Toggle *Semantic cache on* and tune the match strictness (stricter = fewer but safer hits, looser = more hits) and the TTL (default 24h).
What to watch
- Hit rate in Dashboard → Usage - target 15-40% for typical chat apps.
_hiway.cache_hitand_hiway.cache_similarityin your response logs - sanity-check that similar but-different prompts don't collide.- Cache eviction log - when entries expire or are manually purged.
Don't cache personalized responses
Responses that embed user-specific data (account name, balance, personalized recommendation) should not hit cache. Either mark them with cache: false in the request body, or enable PII masking so that the user-specific part doesn't affect similarity.