Infrastructure · 2 min read

Edge caching strategies that actually work at scale

How we reduced origin load by 94% using stale-while-revalidate, surrogate keys, and fine-grained cache invalidation.

David Park
Infrastructure Engineer

We run a multi-tenant SaaS platform serving ~50M requests per month. Our origin servers were drowning. Database CPU sat at 80% during peak hours. API p99 latency hit 2.3 seconds.

The fix was not horizontal scaling. It was smarter caching.

The cache hierarchy

Modern edge platforms give you multiple cache layers:

  1. Browser cache: Cache-Control headers
  2. CDN edge: PoP-level caching
  3. Regional cache: Shared across nearby users
  4. Origin cache: Redis, Memcached, or in-memory

We use all four, with different TTLs and invalidation strategies.
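Layers 1–3 can be steered with standard Cache-Control directives: max-age governs the browser, while s-maxage overrides it for shared caches (edge and regional). A minimal sketch, assuming a helper that builds the header (the function name is illustrative, not part of any CDN API):

```typescript
// Sketch: different TTLs per cache layer via standard Cache-Control directives.
// max-age applies to the browser; s-maxage overrides it for shared caches
// (CDN edge and regional). Origin-side caching (Redis etc.) is separate.
function cacheHeaders(browserTtl: number, cdnTtl: number): Record<string, string> {
  return {
    "Cache-Control": `public, max-age=${browserTtl}, s-maxage=${cdnTtl}`,
  };
}

// Example: browsers keep a copy for 60s, shared caches for 10 minutes.
const headers = cacheHeaders(60, 600);
```

Keeping the browser TTL short while letting shared caches hold content longer means a purge at the CDN takes effect quickly for all users.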

Stale-while-revalidate

The most impactful change was aggressive stale-while-revalidate (SWR):

Cache-Control: public, max-age=60, stale-while-revalidate=86400

This means:

  • First request fetches from origin, caches for 60 seconds
  • Subsequent requests serve from cache instantly
  • After 60 seconds, the next request still serves stale data but triggers a background revalidation
  • Origin has 24 hours to update before cache truly expires

For our product catalog, this reduced origin requests by 89% with no user-visible staleness.
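The four bullets above describe a small state machine: fresh entries are served directly, stale-but-within-window entries are served while a background refresh runs, and fully expired entries block on the origin. A minimal in-process sketch of that logic (a real CDN implements this per edge PoP; the class and names here are hypothetical):

```typescript
// Minimal sketch of the stale-while-revalidate state machine for one cache key.
type Entry<T> = { value: T; storedAt: number };

class SwrCache<T> {
  private entry?: Entry<T>;
  private revalidating = false;

  constructor(
    private fetchFresh: () => Promise<T>,
    private maxAgeMs: number,    // e.g. 60_000 for max-age=60
    private swrWindowMs: number, // e.g. 86_400_000 for stale-while-revalidate=86400
  ) {}

  async get(now = Date.now()): Promise<T> {
    const e = this.entry;
    if (e) {
      const age = now - e.storedAt;
      if (age <= this.maxAgeMs) return e.value; // fresh: serve from cache
      if (age <= this.maxAgeMs + this.swrWindowMs) {
        this.revalidate(now);                   // stale: serve, refresh in background
        return e.value;
      }
    }
    // cold or fully expired: block on the origin
    const value = await this.fetchFresh();
    this.entry = { value, storedAt: now };
    return value;
  }

  private revalidate(now: number): void {
    if (this.revalidating) return; // collapse concurrent revalidations
    this.revalidating = true;
    this.fetchFresh()
      .then((value) => { this.entry = { value, storedAt: now }; })
      .finally(() => { this.revalidating = false; });
  }
}
```

The revalidation guard matters at scale: without it, a burst of requests against a stale entry would all trigger origin fetches, recreating the thundering-herd problem SWR is meant to avoid.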

Surrogate key invalidation

The hard part of caching is invalidation. We tag every cached response with surrogate keys:

  • product:12345
  • category:electronics
  • pricing:tier-pro

When a product updates, we purge all responses tagged with product:12345. This invalidates the product page, category listings, search results, and recommendation carousels simultaneously.
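The mechanism is a reverse index from tag to cached responses. A sketch, assuming a hypothetical in-process store (CDNs like Fastly expose the same idea via a Surrogate-Key response header):

```typescript
// Sketch of surrogate-key purging: each cached response carries a set of tags,
// and purging one tag evicts every response that carries it.
class TaggedCache {
  private responses = new Map<string, { body: string; tags: string[] }>();
  private byTag = new Map<string, Set<string>>();

  set(url: string, body: string, tags: string[]): void {
    this.responses.set(url, { body, tags });
    for (const tag of tags) {
      if (!this.byTag.has(tag)) this.byTag.set(tag, new Set());
      this.byTag.get(tag)!.add(url);
    }
  }

  get(url: string): string | undefined {
    return this.responses.get(url)?.body;
  }

  purge(tag: string): number {
    const urls = this.byTag.get(tag) ?? new Set<string>();
    for (const url of urls) this.responses.delete(url);
    this.byTag.delete(tag);
    return urls.size; // number of responses invalidated
  }
}
```

Because a response can carry many tags (product:12345 and category:electronics at once), a single purge call evicts every page that rendered that product, with no need to enumerate URLs.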

Fine-grained API caching

GraphQL makes CDN caching tricky because every query is a POST to /graphql. We solved this by:

  1. Persisted queries: Every unique query is registered under its SHA-256 hash and served via GET
  2. Automatic query analysis: We extract entity types from the AST and attach surrogate keys
  3. Partial responses: Hot fields (prices, stock) are fetched separately from cold fields (descriptions, images)
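The persisted-query step is the one that makes CDN caching possible at all: identical queries map to identical GET URLs, which edge caches can key on. A sketch of the hashing, in the spirit of Apollo's automatic persisted queries (the URL shape here is illustrative, not a specific CDN's API):

```typescript
import { createHash } from "node:crypto";

// Sketch: turn a GraphQL query into a cacheable GET request by addressing it
// with its SHA-256 hash instead of POSTing the full query body.
function persistedQueryUrl(
  query: string,
  variables: Record<string, unknown> = {},
): string {
  const hash = createHash("sha256").update(query).digest("hex");
  const params = new URLSearchParams({
    id: hash,
    variables: JSON.stringify(variables),
  });
  return `/graphql?${params.toString()}`;
}

// Identical queries always produce the same URL, so the CDN can cache the response.
const url = persistedQueryUrl("query { product(id: 12345) { name price } }");
```

Variables still appear in the URL, so each distinct variable set gets its own cache entry; the hash only deduplicates the query text itself.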

Results

Metric                  Before        After
Origin requests         48M/month     2.9M/month
API p99 latency         2.3s          89ms
Database CPU            78%           12%
CDN cache hit ratio     34%           97%

When not to cache

Caching is not universal. We never cache:

  • User-specific dashboards
  • Real-time financial data
  • Write operations
  • Admin panels

The key is understanding your data's consistency requirements and designing cache boundaries accordingly.

Edge Caching · Vercel · Performance · CDN