Redis Cache Architecture: Layered Defense and Optimization

A review of our Redis caching architecture—how we built layered defenses against classic cache failures, and what technical debt remains.

Defenses Already in Place

1. Cache Avalanche

Problem: Mass key expiration causes a traffic spike that overwhelms the database.

Solution (Infrastructure layer): TTL randomization with jitter.

The shared Redis library adds a random offset (e.g., ±10%) to every TTL it sets
This spreads expirations over time, so keys written in the same batch do not all expire at the same moment
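
A minimal sketch of the jitter calculation, assuming a uniform offset and the ±10% figure used as the example above. The function name `jittered_ttl` and its parameters are hypothetical, not the shared library's actual API.

```python
import random


def jittered_ttl(base_ttl_seconds: int, jitter_ratio: float = 0.10) -> int:
    """Return the base TTL shifted by a uniform random offset within ±jitter_ratio."""
    if base_ttl_seconds <= 0:
        raise ValueError("base_ttl_seconds must be positive")
    if not 0 <= jitter_ratio < 1:
        raise ValueError("jitter_ratio must be in [0, 1)")
    offset = random.uniform(-jitter_ratio, jitter_ratio) * base_ttl_seconds
    # Redis rejects non-positive expirations, so never return less than one second.
    return max(1, round(base_ttl_seconds + offset))
```

With redis-py, a caller would pass the result as the expiry, for example `client.set(key, value, ex=jittered_ttl(3600))`, which yields a TTL somewhere between 3240 and 3960 seconds.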

2. Cache Breakdown (Hotspot Invalidation)

Problem: A single hot key expires, and concurrent requests all hit the database simultaneously.

Solution (Business Logic layer): Mutex lock on cache miss.

On cache miss, requests compete for a distributed lock
Lock winner: Queries DB, writes cache, releases lock
Lock losers: Wait or retry until cache is populated
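
The flow above might look like the following sketch. It assumes a redis-py style client created with `decode_responses=True`, a hypothetical `lock:` key prefix, and illustrative timeouts; `get_with_mutex` and `load_from_db` are made-up names. The lock carries a TTL so a crashed winner cannot hold it forever, and it is released with a compare-and-delete script so a slow winner cannot remove a lock that has since passed to another request.

```python
import time
import uuid
from typing import Callable

# Delete the lock only if it still holds our token.
RELEASE_LOCK_SCRIPT = """
if redis.call('GET', KEYS[1]) == ARGV[1] then
    return redis.call('DEL', KEYS[1])
end
return 0
"""


def get_with_mutex(client, key: str, load_from_db: Callable[[], str],
                   ttl_seconds: int = 300, lock_ttl_seconds: int = 10,
                   max_wait_seconds: float = 2.0,
                   retry_delay_seconds: float = 0.05) -> str:
    """Return the cached value, letting only one caller rebuild it on a miss."""
    lock_key = f"lock:{key}"  # hypothetical naming convention
    deadline = time.monotonic() + max_wait_seconds
    while True:
        cached = client.get(key)
        if cached is not None:
            return cached
        token = uuid.uuid4().hex
        # SET with NX and EX: at most one caller acquires the lock, and it expires on its own.
        if client.set(lock_key, token, nx=True, ex=lock_ttl_seconds):
            try:
                # Re-check: a previous winner may have filled the cache in the meantime.
                cached = client.get(key)
                if cached is not None:
                    return cached
                value = load_from_db()
                client.set(key, value, ex=ttl_seconds)
                return value
            finally:
                client.eval(RELEASE_LOCK_SCRIPT, 1, lock_key, token)
        # Lock loser: give up after the deadline, otherwise wait briefly and retry.
        if time.monotonic() >= deadline:
            raise TimeoutError(f"cache for {key!r} was not populated in time")
        time.sleep(retry_delay_seconds)
```

The lock TTL should comfortably exceed the expected database query time; if the query can outlive the lock, a second caller may start a duplicate rebuild.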

3. Cache Penetration

Problem: Queries for non-existent data always miss cache and hit the database. Common attack vector.

Solution (Business Logic layer): Cache null objects.

When DB returns empty, store a sentinel value (e.g., {"empty": true}) in Redis
Use a short TTL (e.g., 60s) to allow eventual data creation
Subsequent requests get the cached “not found” response
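
A sketch of the null-object pattern described above, assuming a redis-py style client created with `decode_responses=True` so values come back as strings. The names `get_with_null_caching` and `load_from_db` and the 300s TTL for real values are illustrative assumptions; the sentinel and the 60s TTL are the examples from the text.

```python
import json
from typing import Callable, Optional

# Serializes to '{"empty": true}', the sentinel used as the example above.
NULL_SENTINEL = json.dumps({"empty": True})


def get_with_null_caching(client, key: str,
                          load_from_db: Callable[[], Optional[str]],
                          ttl_seconds: int = 300,
                          null_ttl_seconds: int = 60) -> Optional[str]:
    """Return the value for key, caching a short-lived sentinel when the DB has no row."""
    cached = client.get(key)
    if cached is not None:
        # A cached sentinel means "known to be missing": answer without touching the DB.
        return None if cached == NULL_SENTINEL else cached
    value = load_from_db()
    if value is None:
        # Short TTL so a record created later becomes visible quickly.
        client.set(key, NULL_SENTINEL, ex=null_ttl_seconds)
        return None
    client.set(key, value, ex=ttl_seconds)
    return value
```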

Limitation: Vulnerable to attacks that use random keys (e.g., UUIDs). Every unique key creates a new {"empty": true} entry, so an attacker can fill Redis with an unbounded number of them, exhausting memory and evicting legitimate hot data.

4. Connection & Transport Security

Connection Pooling:

Reuses TCP connections to avoid handshake overhead

TLS Encryption:

Encrypted transport for cross-network communication

Critical Technical Debt

Blocking Commands: `KEYS` Usage

Problem: The codebase uses KEYS for pattern matching and batch operations.

Risk: Redis executes commands on a single thread. KEYS is O(N) in the total number of keys in the database because it scans the entire keyspace. With millions of keys, a single call can block that thread for seconds.

During this block:

❌ No reads or writes processed
❌ All dependent services (auth, transactions) time out
❌ Effectively a self-inflicted DoS

Fix: Replace all KEYS with SCAN.

Command	Complexity	Blocking
`KEYS pattern`	O(N)	Yes
`SCAN cursor MATCH pattern`	O(1) per call, O(N) for a full iteration	No (each call handles only a small batch)

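SCAN iterates with a cursor: each call returns a small batch of keys plus the cursor for the next call, so other commands can run between calls. It is more complex to use than KEYS. The caller must loop until the cursor returns to 0, a key may be returned more than once, and MATCH is applied after each batch is fetched, so some calls return no keys at all.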
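
As an illustration, a batch delete that previously relied on KEYS could be rewritten along these lines. This is a sketch assuming a redis-py style client; `delete_by_pattern` is a hypothetical helper name and the `count` default is arbitrary.

```python
def delete_by_pattern(client, pattern: str, count: int = 500) -> int:
    """Delete every key matching pattern using cursor-based SCAN instead of KEYS."""
    deleted = 0
    cursor = 0
    while True:
        # COUNT is only a hint for how many keys Redis examines per call.
        cursor, keys = client.scan(cursor=cursor, match=pattern, count=count)
        if keys:
            # UNLINK removes the keys and reclaims their memory in the background.
            deleted += client.unlink(*keys)
        # A returned cursor of 0 means the iteration is complete.
        if cursor == 0:
            break
    return deleted
```

Because SCAN may return a key twice, the loop relies on UNLINK counting only keys that still exist. Callers that collect keys rather than delete them would need to deduplicate on their own.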

Action Items

Remove KEYS: Audit and replace all KEYS usage with SCAN
Consolidate patterns: Evaluate moving mutex lock and null caching logic into the shared library to reduce cognitive load on developers
Bloom Filter for penetration defense: Null object caching is insufficient against random-key attacks. Add a Bloom Filter (or Cuckoo Filter) as the first line of defense. A Bloom filter can report false positives but never false negatives, so any key it reports as absent definitely does not exist and can be rejected before touching Redis
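
To make the Bloom filter item concrete, here is a small in-process sketch using only the standard library. The class and method names are hypothetical, and the sizing uses the standard formulas for a target false-positive rate. A real deployment would also need the filter to be shared across instances and updated whenever a record is created, which this sketch does not cover.

```python
import hashlib
import math


class BloomFilter:
    """Probabilistic set: no false negatives, tunable false-positive rate."""

    def __init__(self, expected_items: int, false_positive_rate: float = 0.01):
        if expected_items <= 0:
            raise ValueError("expected_items must be positive")
        if not 0 < false_positive_rate < 1:
            raise ValueError("false_positive_rate must be in (0, 1)")
        # Standard sizing: m = -n * ln(p) / (ln 2)^2 bits, k = (m / n) * ln 2 hashes.
        self.num_bits = max(8, math.ceil(
            -expected_items * math.log(false_positive_rate) / (math.log(2) ** 2)))
        self.num_hashes = max(1, round(self.num_bits / expected_items * math.log(2)))
        self._bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, item: str):
        # Double hashing: derive all bit positions from two halves of one SHA-256 digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd step
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self._bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        """False means definitely absent; True means possibly present."""
        return all(self._bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))
```

A request handler would call `might_contain(key)` first and return "not found" immediately on `False`; on `True` it falls through to the existing cache and null-object logic, which absorbs the remaining false positives. A plain Bloom filter cannot remove entries, so if records are deleted often, a Cuckoo filter or periodic rebuild is the usual alternative.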