2020-12-14 00:51:25+00:00

In high-throughput logging dashboards, hundreds of client applications read dashboard metrics continuously. To handle this traffic without overloading the database, developers put a caching layer such as Memcache in front of it. However, if every client queries the exact same cache key (e.g., dashboard_summary_data), all of that traffic lands on the one Memcache node that holds the key. This is known as a hot key problem: the node becomes a bottleneck and application requests start to hang.

By partitioning cache keys and using a dynamic probabilistic early expiration model, we can mitigate cache key bottlenecks.


1. Partitioning Keys with Replicas

Instead of mapping a cache value to a single key, we write replicas across multiple keys and load-balance client reads across these replicas:

# cache_balancer.py
import random
from google.appengine.api import memcache

def set_replicated_cache(base_key, value, num_replicas=5, time=300):
    # Write the same value under every replica key in one batched call
    replicas = {f"{base_key}_rep_{i}": value for i in range(num_replicas)}
    memcache.set_multi(replicas, time=time)

def get_replicated_cache(base_key, num_replicas=5):
    # Read from one randomly chosen replica so requests spread across keys.
    # num_replicas must match the value used when the replicas were written.
    selected_replica = random.randrange(num_replicas)
    target_key = f"{base_key}_rep_{selected_replica}"
    # Returns None on a cache miss
    return memcache.get(target_key)
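
Because each replica is an independent cache entry, a single replica can be evicted while the others are still present, and the read function above would then report a miss. The sketch below shows one possible way to handle that: probe the remaining replicas before falling back to the database. It is only an illustration. The dict fake_cache stands in for the Memcache client so the example runs anywhere, and replica_keys, get_with_fallback, and load_from_db are hypothetical names, not part of any library.

```python
import random

# Hypothetical in-memory stand-in for a Memcache client (key -> value).
fake_cache = {}

def replica_keys(base_key, num_replicas=5):
    # Same naming scheme as the replicated writer: <base_key>_rep_<i>
    return [f"{base_key}_rep_{i}" for i in range(num_replicas)]

def get_with_fallback(base_key, load_from_db, num_replicas=5, rng=random):
    keys = replica_keys(base_key, num_replicas)
    first = rng.randrange(num_replicas)
    # Start at a random replica and wrap around, so load still spreads out.
    for offset in range(num_replicas):
        value = fake_cache.get(keys[(first + offset) % num_replicas])
        if value is not None:
            return value
    # Every replica missed: load once and repopulate all replicas.
    value = load_from_db()
    for key in keys:
        fake_cache[key] = value
    return value
```

Probing every replica costs extra round trips on a miss, so a real deployment might cap the number of probes at one or two before going to the database.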

2. Mitigating Cache Stampede

To prevent massive database load spikes when a cache key expires and many clients miss at the same moment, worker threads compute an early expiration time for each entry dynamically by adding random noise, then refresh the entry in the background before it actually expires.