Caching
Caching can reduce latency and cost, but it must be implemented carefully to avoid leaking sensitive content or serving stale results. This guide outlines safe, practical caching patterns for LLM security workflows.
What to Cache (and What Not To)
Good candidates:
- deterministic scan results for identical inputs
- allowlist and blocklist lookups
- policy configurations that change infrequently
Avoid caching:
- raw prompts that contain personal data or PHI
- full model responses for public or shared systems
- anything tied to a specific user session unless scoped carefully
Cache Key Strategy
Use a stable, privacy-safe key:
- hash input content instead of storing raw content
- include policy version and sensitivity in the key
- include account or environment identifiers
Example key format:
scan:{account}:{policyVersion}:{hash(content)}
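The key strategy above can be sketched in Python. This is a minimal illustration (the `build_scan_key` helper name is hypothetical); the content is run through SHA-256 so the raw input never appears in the key, and the policy version is part of the key so a policy bump naturally misses old entries:

```python
import hashlib

def build_scan_key(account: str, policy_version: str, content: str) -> str:
    """Build a privacy-safe cache key: content is hashed, never stored."""
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
    return f"scan:{account}:{policy_version}:{digest}"
```

Identical inputs produce identical keys (so deterministic scan results can be reused), while any change to content or policy version yields a different key.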
TTL and Invalidation
- use short TTLs for user-generated content
- invalidate caches when policies change
- avoid long-lived caches for security decisions
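These rules can be captured in a small in-memory sketch (all names hypothetical; the clock is injectable so expiry is testable without real waiting). Entries expire after a short TTL, and a policy change clears everything, since security decisions must not outlive the policy that produced them:

```python
import time

class DecisionCache:
    """In-memory cache with per-entry TTLs and whole-cache invalidation."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._store = {}  # key -> (expires_at, value)

    def set(self, key: str, value, ttl_seconds: float) -> None:
        self._store[key] = (self._clock() + ttl_seconds, value)

    def get(self, key: str):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._store[key]  # expired: drop the entry and report a miss
            return None
        return value

    def invalidate_all(self) -> None:
        """Call whenever the policy changes."""
        self._store.clear()
```

A distributed cache replaces this class, but the same contract applies: short TTLs for user-generated content, and a flush on every policy change.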
Redis for Distributed Caching
If you use Redis, keep it private and secured. Configure it as part of your infrastructure:
redis:
  enabled: true
  url: "redis://localhost:6379/0"
Use network restrictions and TLS where supported by your Redis deployment.
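With the redis-py client, the patterns above map directly onto Redis primitives: `SETEX` sets the value and TTL atomically, and the cached value holds only the decision, never the scanned input. This is a sketch under those assumptions (the `store_decision`/`load_decision` helper names are illustrative, and the import is kept lazy so the helpers work against any client with the same `get`/`setex` interface):

```python
import json

def connect(url: str = "redis://localhost:6379/0"):
    """Create a redis-py client (lazy import; requires `pip install redis`)."""
    import redis  # third-party client
    return redis.Redis.from_url(url, decode_responses=True)

def store_decision(client, key: str, decision: dict, ttl_seconds: int = 300) -> None:
    # SETEX writes the value and its expiry in one atomic command.
    client.setex(key, ttl_seconds, json.dumps(decision))

def load_decision(client, key: str):
    raw = client.get(key)
    return json.loads(raw) if raw is not None else None
```

Keeping values as small JSON decisions (rather than full prompts or responses) limits what an attacker could learn from a compromised cache.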
Safety Guidelines
- cache only the minimum data needed for the decision
- never cache secrets or credentials
- do not store PHI or PII in cache values
- monitor cache hit rates and eviction behavior
Testing and Monitoring
- validate that cached decisions match live results
- track cache hit ratio and latency impact
- log cache errors without exposing sensitive input
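Hit-ratio tracking can be as simple as a counting wrapper around any cache that exposes `get` and `set` (a sketch; the `InstrumentedCache` name is hypothetical). The wrapper only ever sees keys, which are already privacy-safe hashes under the key strategy above:

```python
class InstrumentedCache:
    """Wrap a cache exposing get/set and count hits and misses."""

    def __init__(self, inner):
        self._inner = inner
        self.hits = 0
        self.misses = 0

    def get(self, key):
        value = self._inner.get(key)
        if value is None:
            self.misses += 1
        else:
            self.hits += 1
        return value

    def set(self, key, value):
        self._inner.set(key, value)

    @property
    def hit_ratio(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

Exporting `hits`, `misses`, and `hit_ratio` to your metrics pipeline covers the monitoring items above without logging any cached content.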