System Design Core Concepts: Caching
Caching is a fundamental technique in system design used to handle high read traffic by storing frequently accessed data in fast, in-memory storage (like Redis) rather than retrieving it from slower disk-based databases. This dramatically reduces latency and database load. The article outlines that while external caching is the most common form, caching exists at multiple layers including CDNs for static content, client-side storage, and in-process caching for extremely hot data.
The most critical caching pattern for interviews is Cache-Aside (Lazy Loading), where the application checks the cache first and only queries the database on a miss. Other patterns include Write-Through (for strong consistency), Write-Behind (for high write throughput with risk of data loss), and Read-Through. Choosing the right pattern depends on the specific consistency and performance requirements of the system.
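The cache-aside flow described above can be sketched in a few lines. This is a minimal illustration, not a production implementation: the plain dict stands in for a cache like Redis, and `db` is any hypothetical object exposing `get`/`put`.

```python
import time

class CacheAside:
    """Cache-aside (lazy loading): check the cache first, fall back to the DB on a miss."""

    def __init__(self, db, ttl_seconds=60):
        self.db = db                    # hypothetical store with get(key) / put(key, value)
        self.ttl = ttl_seconds
        self._store = {}                # key -> (value, expires_at)

    def get(self, key):
        entry = self._store.get(key)
        if entry is not None:
            value, expires_at = entry
            if time.monotonic() < expires_at:
                return value            # cache hit
            del self._store[key]        # expired entry: treat as a miss
        value = self.db.get(key)        # cache miss: read from the database
        self._store[key] = (value, time.monotonic() + self.ttl)
        return value

    def put(self, key, value):
        self.db.put(key, value)         # write to the database...
        self._store.pop(key, None)      # ...and invalidate the stale cached copy
```

Note that `put` invalidates rather than updates the cached entry; deleting on write is the common choice because it avoids racing writers leaving the cache with a stale value.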
Effectively managing a cache involves selecting appropriate eviction policies like LRU (Least Recently Used) and TTL (Time To Live) to balance memory usage and data freshness. The article also highlights common pitfalls such as Cache Stampedes (thundering herds), where expired keys cause traffic spikes to the database, and Hot Keys, which can overload a single cache node. Solutions like request coalescing and local caching are essential strategies to mitigate these risks.
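Request coalescing, mentioned above as a stampede defense, can be sketched with a per-key lock: when many threads miss on the same key at once, only one performs the expensive load and the rest reuse its result. The names below (`loader`, `CoalescingCache`) are illustrative, not from any particular library.

```python
import threading

class CoalescingCache:
    """Request coalescing: on a miss, one thread loads the key;
    concurrent requests for the same key wait and reuse the result."""

    def __init__(self, loader):
        self.loader = loader            # function key -> value (e.g. a DB read)
        self._store = {}
        self._locks = {}
        self._guard = threading.Lock()  # protects the per-key lock table

    def _lock_for(self, key):
        with self._guard:
            return self._locks.setdefault(key, threading.Lock())

    def get(self, key):
        value = self._store.get(key)
        if value is not None:
            return value                # fast path: cache hit
        with self._lock_for(key):       # serialize loads per key
            value = self._store.get(key)    # re-check: another thread may have loaded it
            if value is None:
                value = self.loader(key)    # only the first waiter hits the database
                self._store[key] = value
            return value
```

The double-check inside the lock is the essential step: without it, every waiting thread would still issue its own database read once the lock was released.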
Key Concepts
- Cache-Aside: The application manages the cache directly: it reads from the cache first and, on a miss, reads from the database and populates the cache. This is the standard pattern for most read-heavy workloads.
- Write-Through: The application writes to the cache, which then synchronously writes to the database. This ensures strong consistency but adds latency to write operations.
- Eviction Policies: Strategies like LRU (Least Recently Used) remove old data when memory is full, while TTL (Time To Live) ensures data freshness by expiring entries after a set time.
- Cache Stampede: A failure mode where many requests simultaneously miss the cache for the same key, overwhelming the database. Solved by request coalescing or probabilistic expiration.
- Hot Keys: A single cache key receiving disproportionate traffic that can overload a cache node. Mitigated by using in-process caching or replicating the key across nodes.
- CDN (Content Delivery Network): A network of distributed servers that caches content close to the user, significantly reducing latency for static assets and media.
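Of the concepts above, LRU eviction is the easiest to make concrete. A minimal sketch using Python's `OrderedDict`, whose insertion order doubles as a recency list (the class and method names are illustrative):

```python
from collections import OrderedDict

class LRUCache:
    """LRU eviction: when the cache is full, drop the least recently used entry."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._store = OrderedDict()     # order of keys tracks recency of use

    def get(self, key):
        if key not in self._store:
            return None                 # miss
        self._store.move_to_end(key)    # mark as most recently used
        return self._store[key]

    def put(self, key, value):
        if key in self._store:
            self._store.move_to_end(key)
        self._store[key] = value
        if len(self._store) > self.capacity:
            self._store.popitem(last=False)  # evict the least recently used entry
```

Real caches such as Redis combine a policy like this with TTLs, so an entry can be removed either because memory is full or because it has gone stale.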