Cache Me If You Can: Design Patterns for Performance


In part 3 of our System Design series, we’re tackling caching and load balancing — the unsung heroes of performance. Without them, systems crumble under scale.

We’ll cover:

  1. Caching – App/DB/CDN; write-through/write-back, TTLs
  2. Cache Invalidation – TTLs, versioning, stampede protection
  3. Load Balancing – L4/L7, round-robin, least-connections, hashing



1. Caching

TL;DR: Caching is your first lever for scale. Use it everywhere, but know the trade-offs.

  • App cache: In-memory (Redis, Memcached). Ultra-fast but volatile.
  • DB cache: Query or object cache to offload hot queries.
  • CDN cache: Push static assets near users.

Strategies:

  • Write-through: Write to cache and DB together in one synchronous step (safe, consistent, slower writes)
  • Write-back: Write to cache first, sync to DB later (fast, risky if cache crashes)
  • TTL (Time To Live): Expire stale data automatically
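To make write-through and TTLs concrete, here's a minimal sketch using plain Python dicts as stand-ins for the cache and the database (the `WriteThroughCache` class and its names are illustrative, not a real library):

```python
import time

class WriteThroughCache:
    """Write-through: every write hits both the cache and the backing store,
    so the DB is never behind the cache. Entries expire after a TTL."""
    def __init__(self, db, ttl_seconds=30):
        self.db = db                   # backing store (dict stand-in for a real DB)
        self.ttl = ttl_seconds
        self._cache = {}               # key -> (value, expires_at)

    def set(self, key, value):
        self.db[key] = value                               # durable write first
        self._cache[key] = (value, time.time() + self.ttl)

    def get(self, key):
        entry = self._cache.get(key)
        if entry and entry[1] > time.time():               # fresh cache hit
            return entry[0]
        value = self.db.get(key)                           # miss or expired: read DB
        if value is not None:
            self._cache[key] = (value, time.time() + self.ttl)
        return value

db = {}
cache = WriteThroughCache(db, ttl_seconds=30)
cache.set("top_stories", ["story-1", "story-2"])
print(cache.get("top_stories"))
```

Write-back would flip the order in `set`: update the cache immediately and flush to `db` later, which is faster but loses data if the cache dies before the flush.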

👉 Example: A news homepage caches top stories for 30s — thousands of requests saved.

👉 Interview tie-in: “How would you scale a read-heavy service?” — caching is the first answer.




2. Cache Invalidation

TL;DR: The hardest part of caching isn’t caching — it’s invalidation.

  • TTL: Safe default, but may serve stale data.
  • Versioning: Change cache key when data updates (e.g., user:v2:123)
  • Stampede protection: Use locking or request coalescing so multiple clients don’t hammer the DB when cache expires.
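Here's a minimal in-process sketch of stampede protection via request coalescing: only one caller recomputes an expired key, and the rest reuse the result. (A distributed system would use a Redis lock or similar instead of `threading.Lock`; the `StampedeGuard` name is illustrative.)

```python
import threading

class StampedeGuard:
    """Per-key locking so N concurrent misses trigger one DB load, not N."""
    def __init__(self, loader):
        self.loader = loader           # expensive function, e.g. a DB query
        self._cache = {}
        self._locks = {}
        self._meta_lock = threading.Lock()

    def get(self, key):
        if key in self._cache:                 # fast path: cache hit
            return self._cache[key]
        with self._meta_lock:                  # one lock object per key
            lock = self._locks.setdefault(key, threading.Lock())
        with lock:                             # only one thread loads per key
            if key not in self._cache:         # double-check after acquiring
                self._cache[key] = self.loader(key)
        return self._cache[key]

calls = []
def slow_load(key):
    calls.append(key)                          # record how often the "DB" is hit
    return f"value-for-{key}"

guard = StampedeGuard(slow_load)
threads = [threading.Thread(target=guard.get, args=("user:v2:123",))
           for _ in range(10)]
for t in threads: t.start()
for t in threads: t.join()
print(len(calls))                              # 10 concurrent readers, 1 DB hit
```

Note the versioned key (`user:v2:123`): bumping the version on update means stale entries are simply never read again, sidestepping explicit invalidation.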

👉 Example: If 1M users refresh just as a cached entry expires, every request misses and hits the DB at once — that’s a cache stampede. Use jittered TTLs or async refresh.
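Jittered TTLs are a one-liner: randomize each entry's expiry around the base TTL so entries cached at the same moment don't all expire at the same moment. A sketch (the function name and 20% jitter fraction are illustrative choices):

```python
import random

def jittered_ttl(base_seconds=30, jitter_fraction=0.2):
    """Return the base TTL +/- up to 20% random jitter,
    spreading expirations (and thus DB reloads) over time."""
    jitter = base_seconds * jitter_fraction
    return base_seconds + random.uniform(-jitter, jitter)

ttls = [jittered_ttl() for _ in range(1000)]
print(min(ttls), max(ttls))    # all within 24..36 seconds, but spread out
```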

👉 Interview tie-in: They’ll ask “What’s the hardest part about caching?” — answer: invalidation and consistency.




3. Load Balancing

TL;DR: Load balancers spread requests across servers and hide failures.

  • L4 (Transport): Balances based on IP/port. Simple, fast.
  • L7 (Application): Smarter — routes based on headers, cookies, paths.

Algorithms:

  • Round Robin: Even distribution
  • Least Connections: Send to the server with fewest active requests
  • Hashing: Sticky sessions (e.g., same user → same server)

👉 Example: E-commerce app uses L7 LB to route /images → CDN, /checkout → payment cluster.

👉 Interview tie-in: “How do you handle uneven traffic across servers?” — least-connections or weighted load balancing.




✅ Takeaways

  • Cache where it hurts most: hot queries, static assets, read-heavy endpoints
  • Invalidation is the real challenge; plan strategies upfront
  • Load balancing is critical for fairness, resilience, and routing logic

💡 Practice Question:

“Design the caching strategy for a Twitter timeline. How would you avoid cache stampede during trending events?”


