Effective caching is fundamental to scaling web applications. This article explores the strategic differences between application-level and CDN-level caching, with practical guidance on what to cache where and why.

Application-Level vs CDN-Level Caching: Strategic Implementation Guide

When scaling web applications, caching is often one of the first optimization strategies implemented. However, determining what to cache and where to cache it is a nuanced decision that can significantly impact application performance, architecture, and even correctness. This guide dives into the differences between application-level and CDN-level caching, with practical recommendations based on real-world experience.

Understanding the Caching Hierarchy

Caching can occur at multiple levels in your application stack:

Browser Cache - The closest to the user
CDN Cache - Edge network, geographically distributed
API Gateway Cache - Front door to your services
Application Cache - In-memory or distributed caches
Database Cache - Query and result caching

For this article, we’ll focus primarily on the distinction between CDN-level caching (level 2) and application-level caching (level 4), as these represent the two major architectural decisions most teams face.

CDN-Level Caching: The Edge Strategy

Content Delivery Networks distribute cached content across global points of presence (PoPs), bringing your data physically closer to users and reducing latency.

Best Use Cases for CDN Caching

Static Assets
- Images, videos, CSS, JavaScript files
- Fonts and media files
- Static HTML pages
Semi-Dynamic Content
- Product listings that update infrequently
- Blog posts and articles
- Public API responses that change on predictable schedules
Geographically Diverse User Base
- Applications serving users across multiple continents
- Services requiring low global latency

Implementing CDN Caching

The most common way to implement CDN caching is through HTTP cache headers. Here’s an example using Cloudflare:

# HTTP Response headers for static assets
Cache-Control: public, max-age=86400, immutable
# One day caching, with signal that content never changes

# HTTP Response headers for semi-dynamic content
Cache-Control: public, max-age=600, stale-while-revalidate=600
# 10 minutes fresh, serve stale for another 10 minutes while revalidating

For more granular control, many CDNs offer programmatic cache management:

Cloudflare Workers Example:

addEventListener('fetch', event => {
  event.respondWith(handleRequest(event.request))
})

async function handleRequest(request) {
  // Custom cache key based on URL but excluding certain query parameters
  const cacheKey = new URL(request.url)
  cacheKey.searchParams.delete('utm_source')
  
  // Check cache first
  const cache = caches.default
  let response = await cache.match(cacheKey)
  
  if (!response) {
    // Cache miss - fetch from origin
    response = await fetch(request)
    
    // Only cache successful responses
    if (response.status === 200) {
      // Clone the response and modify headers if needed
      const newResponse = new Response(response.body, response)
      newResponse.headers.set('Cache-Control', 'public, max-age=300')
      
      // Store in cache
      event.waitUntil(cache.put(cacheKey, newResponse.clone()))
      return newResponse
    }
  }
  
  return response
}

CDN Caching Performance Impact

Here’s a real benchmark showing latency reduction with CDN caching:

Request Origin	Target Server	Without CDN	With CDN	Improvement
US East	Europe (Frankfurt)	220ms	45ms	4.9x faster
Southeast Asia	US West (Oregon)	310ms	60ms	5.2x faster
Australia	Europe (Frankfurt)	340ms	70ms	4.9x faster
Global (Average)	Various Regions	290ms	58ms	5.0x faster

Application-Level Caching: The Service Strategy

Application-level caching happens within your services, typically using an in-memory cache or dedicated caching service.

Best Use Cases for Application Caching

User-Specific Data
- Authenticated user profiles
- Shopping cart contents
- Personalized recommendations
Computational Results
- Expensive database queries
- Complex business logic calculations
- Aggregation operations
Service-to-Service Communication
- Internal API responses
- Configuration data
- Shared reference data

Implementing Application Caching

Here are examples using popular caching solutions:

Redis Caching in Go:

package main

import (
	"context"
	"encoding/json"
	"time"
	
	"github.com/go-redis/redis/v8"
)

type UserProfile struct {
	ID        string    `json:"id"`
	Name      string    `json:"name"`
	Email     string    `json:"email"`
	Preferences map[string]interface{} `json:"preferences"`
	UpdatedAt time.Time `json:"updated_at"`
}

func GetUserProfile(ctx context.Context, redisClient *redis.Client, userID string) (*UserProfile, error) {
	cacheKey := "user:profile:" + userID
	
	// Try to get from cache first
	cachedData, err := redisClient.Get(ctx, cacheKey).Bytes()
	if err == nil {
		// Cache hit
		var profile UserProfile
		if err := json.Unmarshal(cachedData, &profile); err == nil {
			return &profile, nil
		}
	}
	
	// Cache miss or unmarshal error - fetch from database
	profile, err := fetchUserProfileFromDB(ctx, userID)
	if err != nil {
		return nil, err
	}
	
	// Store in cache with expiration
	if profileJSON, err := json.Marshal(profile); err == nil {
		// Cache for 15 minutes
		redisClient.Set(ctx, cacheKey, profileJSON, 15*time.Minute)
	}
	
	return profile, nil
}

func fetchUserProfileFromDB(ctx context.Context, userID string) (*UserProfile, error) {
	// Database logic here
	// ...
}

In-Memory Caching with Expiration in Java:

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class SimpleCache<K, V> {
    private final ConcurrentHashMap<K, CacheEntry<V>> cache = new ConcurrentHashMap<>();
    private final ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(1);

    public void put(K key, V value, long expirationTimeInSeconds) {
        cache.put(key, new CacheEntry<>(value, System.currentTimeMillis() + (expirationTimeInSeconds * 1000)));
    }

    public V get(K key) {
        CacheEntry<V> entry = cache.get(key);
        if (entry == null) {
            return null; // Cache miss
        }
        
        if (entry.isExpired()) {
            cache.remove(key);
            return null; // Expired entry
        }
        
        return entry.getValue();
    }

    public void scheduleCleanup(long initialDelay, long period, TimeUnit unit) {
        scheduler.scheduleAtFixedRate(() -> {
            cache.entrySet().removeIf(entry -> entry.getValue().isExpired());
        }, initialDelay, period, unit);
    }

    private static class CacheEntry<V> {
        private final V value;
        private final long expirationTime;

        public CacheEntry(V value, long expirationTime) {
            this.value = value;
            this.expirationTime = expirationTime;
        }

        public V getValue() {
            return value;
        }

        public boolean isExpired() {
            return System.currentTimeMillis() > expirationTime;
        }
    }
}

Cache Invalidation: The Hard Problem

The complexity of cache invalidation increases with distance from the source of truth. Let’s compare approaches:

Cache Type	Invalidation Mechanism	Complexity	Effectiveness
In-Memory	Direct call/eviction	Low	Immediate
Redis	Explicit DEL/EXPIRE	Low-Medium	Near immediate
API Gateway	Purge API calls	Medium	Seconds of delay
CDN	Cache tags/surrogate keys	High	Minutes of delay
Browser	Cannot force invalidate	Very High	Days of delay

Strategies for Effective Cache Invalidation

Time-Based Expiration
- Set appropriate TTLs based on content volatility
- Example: Product descriptions: 1 day; Product prices: 5 minutes
Event-Based Invalidation
- Trigger cache purges when underlying data changes
- Example: When product price changes → purge product detail cache
Versioned Caching
- Embed version in cache key
- Example: product:1234:v5 instead of just product:1234
Stale-While-Revalidate Pattern
- Continue serving stale content while fetching fresh data
- Example: Cache-Control: max-age=60, stale-while-revalidate=600

Practical Implementation Architecture

Below is a reference architecture for implementing multi-level caching:

┌───────────────┐     ┌───────────────┐     ┌───────────────┐
│  CDN Cache    │◄────┤  API Gateway  │◄────┤  Load Balancer│
│ (Cloudflare)  │     │  (Kong/Envoy) │     │  (NGINX/HAP)  │
└───────┬───────┘     └───────┬───────┘     └───────┬───────┘
        │                     │                     │
        ▼                     ▼                     ▼
┌────────────────────────────────────────────────────────────┐
│                     Application Servers                     │
│                                                            │
│  ┌─────────────┐   ┌─────────────┐   ┌─────────────┐       │
│  │ Local Cache │   │Service Cache│   │Distributed  │       │
│  │ (In-Memory) │◄─►│  (Redis)    │◄─►│Cache (Redis)│       │
│  └─────────────┘   └─────────────┘   └─────────────┘       │
└────────────────────────────┬───────────────────────────────┘
                             │
                             ▼
                 ┌───────────────────────┐
                 │     Database Layer    │
                 │ (PostgreSQL/MongoDB)  │
                 └───────────────────────┘

Real-World Caching Decision Matrix

Here’s a decision framework for determining what to cache where:

Content Type	Change Frequency	Privacy	Recommended Cache Level	TTL
Static assets (JS/CSS)	Release cycle	Public	CDN	1 year + versioned URL
Images	Rarely	Public	CDN	1 month
Product catalog	Daily	Public	CDN + App	1 hour CDN, 5 min App
Product prices	Hourly	Public	Application	5 minutes
User dashboard data	Per interaction	Private	Application	2 minutes
Shopping cart	Per interaction	Private	Application	1 minute
API auth tokens	Per session	Private	Application	Session length
System config	Deployment	Internal	Application	Until changed

Common Caching Pitfalls and Solutions

1. Premature Caching

Problem: Implementing complex caching before understanding performance bottlenecks.

Solution: Measure first. Implement instrumentation to identify actual bottlenecks before adding cache layers.

2. Cache Stampedes

Problem: When many requests simultaneously miss cache and hit your backend.

Solution: Implement request coalescing or the “thundering herd” pattern:

// Example in Go using singleflight
import "golang.org/x/sync/singleflight"

var group singleflight.Group

func GetData(key string) (interface{}, error) {
    // This ensures only one fetch happens for the same key concurrently
    data, err, _ := group.Do(key, func() (interface{}, error) {
        // Expensive operation like database query
        return fetchFromDatabase(key)
    })
    return data, err
}

3. Inconsistent Cache Keys

Problem: Different services using different cache keys for the same data.

Solution: Implement a centralized cache key generation library:

public class CacheKeyGenerator {
    public static String forProduct(long productId) {
        return String.format("product:%d:v1", productId);
    }
    
    public static String forUserProfile(String userId) {
        return String.format("user:profile:%s:v2", userId);
    }
    
    // Add methods for other entity types
}

4. Cache Poisoning

Problem: Invalid data gets cached and distributed.

Solution: Validate data before caching and implement TTLs as a safety mechanism.

Monitoring Your Cache Effectiveness

To ensure your caching strategy works well, monitor these key metrics:

Cache Hit Ratio - Target >90% for optimal performance
Cache Latency - How long cache retrievals take
Origin Requests - Number of requests hitting your backend
Stale Content Served - When outdated content is delivered

Prometheus Metrics Example:

// Java example with Micrometer
@Component
public class CacheMetrics {
    private final MeterRegistry registry;
    private final Counter cacheHits;
    private final Counter cacheMisses;
    private final Timer cacheLatency;

    public CacheMetrics(MeterRegistry registry) {
        this.registry = registry;
        this.cacheHits = registry.counter("cache.hits", "type", "user_profile");
        this.cacheMisses = registry.counter("cache.misses", "type", "user_profile");
        this.cacheLatency = registry.timer("cache.latency", "type", "user_profile");
    }

    public void recordCacheHit() {
        cacheHits.increment();
    }

    public void recordCacheMiss() {
        cacheMisses.increment();
    }

    public <T> T measureCacheLatency(Supplier<T> cacheOperation) {
        return cacheLatency.record(cacheOperation);
    }
}

Conclusion

Effective caching strategy requires thoughtful consideration of data characteristics, access patterns, and invalidation requirements. By applying the right caching solution at the right level, you can achieve significant performance improvements while maintaining data correctness.

Remember these guiding principles:

Public, slow-changing content → CDN caching
Private, dynamic, or user-specific content → Application caching
Always have an invalidation strategy before implementing caching
Monitor cache effectiveness continuously

What caching strategies have you implemented in your infrastructure? Have you encountered challenges with multi-level caching? Share your experiences in the comments below.