Effective caching is fundamental to scaling web applications. This article explores the strategic differences between application-level and CDN-level caching, with practical guidance on what to cache where and why.

Application-Level vs CDN-Level Caching: Strategic Implementation Guide

When scaling web applications, caching is often one of the first optimization strategies implemented. However, determining what to cache and where to cache it is a nuanced decision that can significantly impact application performance, architecture, and even correctness. This guide dives into the differences between application-level and CDN-level caching, with practical recommendations based on real-world experience.

Understanding the Caching Hierarchy

Caching can occur at multiple levels in your application stack:

  1. Browser Cache - The closest to the user
  2. CDN Cache - Edge network, geographically distributed
  3. API Gateway Cache - Front door to your services
  4. Application Cache - In-memory or distributed caches
  5. Database Cache - Query and result caching

For this article, we’ll focus primarily on the distinction between CDN-level caching (level 2) and application-level caching (level 4), as these represent the two major architectural decisions most teams face.

CDN-Level Caching: The Edge Strategy

Content Delivery Networks distribute cached content across global points of presence (PoPs), bringing your data physically closer to users and reducing latency.

Best Use Cases for CDN Caching

  1. Static Assets

    • Images, videos, CSS, JavaScript files
    • Fonts and media files
    • Static HTML pages
  2. Semi-Dynamic Content

    • Product listings that update infrequently
    • Blog posts and articles
    • Public API responses that change on predictable schedules
  3. Geographically Diverse User Base

    • Applications serving users across multiple continents
    • Services requiring low global latency

Implementing CDN Caching

The most common way to implement CDN caching is through HTTP cache headers. Here’s an example using Cloudflare:

# HTTP Response headers for static assets
Cache-Control: public, max-age=86400, immutable
# One day caching, with signal that content never changes

# HTTP Response headers for semi-dynamic content
Cache-Control: public, max-age=600, stale-while-revalidate=600
# 10 minutes fresh, serve stale for another 10 minutes while revalidating

For more granular control, many CDNs offer programmatic cache management:

Cloudflare Workers Example:

addEventListener('fetch', event => {
  event.respondWith(handleRequest(event.request))
})

async function handleRequest(request) {
  // Custom cache key based on URL but excluding certain query parameters
  const cacheKey = new URL(request.url)
  cacheKey.searchParams.delete('utm_source')
  
  // Check cache first
  const cache = caches.default
  let response = await cache.match(cacheKey)
  
  if (!response) {
    // Cache miss - fetch from origin
    response = await fetch(request)
    
    // Only cache successful responses
    if (response.status === 200) {
      // Clone the response and modify headers if needed
      const newResponse = new Response(response.body, response)
      newResponse.headers.set('Cache-Control', 'public, max-age=300')
      
      // Store in cache
      event.waitUntil(cache.put(cacheKey, newResponse.clone()))
      return newResponse
    }
  }
  
  return response
}

CDN Caching Performance Impact

Here’s a real benchmark showing latency reduction with CDN caching:

Request OriginTarget ServerWithout CDNWith CDNImprovement
US EastEurope (Frankfurt)220ms45ms4.9x faster
Southeast AsiaUS West (Oregon)310ms60ms5.2x faster
AustraliaEurope (Frankfurt)340ms70ms4.9x faster
Global (Average)Various Regions290ms58ms5.0x faster

Application-Level Caching: The Service Strategy

Application-level caching happens within your services, typically using an in-memory cache or dedicated caching service.

Best Use Cases for Application Caching

  1. User-Specific Data

    • Authenticated user profiles
    • Shopping cart contents
    • Personalized recommendations
  2. Computational Results

    • Expensive database queries
    • Complex business logic calculations
    • Aggregation operations
  3. Service-to-Service Communication

    • Internal API responses
    • Configuration data
    • Shared reference data

Implementing Application Caching

Here are examples using popular caching solutions:

Redis Caching in Go:

package main

import (
	"context"
	"encoding/json"
	"time"
	
	"github.com/go-redis/redis/v8"
)

type UserProfile struct {
	ID        string    `json:"id"`
	Name      string    `json:"name"`
	Email     string    `json:"email"`
	Preferences map[string]interface{} `json:"preferences"`
	UpdatedAt time.Time `json:"updated_at"`
}

func GetUserProfile(ctx context.Context, redisClient *redis.Client, userID string) (*UserProfile, error) {
	cacheKey := "user:profile:" + userID
	
	// Try to get from cache first
	cachedData, err := redisClient.Get(ctx, cacheKey).Bytes()
	if err == nil {
		// Cache hit
		var profile UserProfile
		if err := json.Unmarshal(cachedData, &profile); err == nil {
			return &profile, nil
		}
	}
	
	// Cache miss or unmarshal error - fetch from database
	profile, err := fetchUserProfileFromDB(ctx, userID)
	if err != nil {
		return nil, err
	}
	
	// Store in cache with expiration
	if profileJSON, err := json.Marshal(profile); err == nil {
		// Cache for 15 minutes
		redisClient.Set(ctx, cacheKey, profileJSON, 15*time.Minute)
	}
	
	return profile, nil
}

func fetchUserProfileFromDB(ctx context.Context, userID string) (*UserProfile, error) {
	// Database logic here
	// ...
}

In-Memory Caching with Expiration in Java:

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class SimpleCache<K, V> {
    private final ConcurrentHashMap<K, CacheEntry<V>> cache = new ConcurrentHashMap<>();
    private final ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(1);

    public void put(K key, V value, long expirationTimeInSeconds) {
        cache.put(key, new CacheEntry<>(value, System.currentTimeMillis() + (expirationTimeInSeconds * 1000)));
    }

    public V get(K key) {
        CacheEntry<V> entry = cache.get(key);
        if (entry == null) {
            return null; // Cache miss
        }
        
        if (entry.isExpired()) {
            cache.remove(key);
            return null; // Expired entry
        }
        
        return entry.getValue();
    }

    public void scheduleCleanup(long initialDelay, long period, TimeUnit unit) {
        scheduler.scheduleAtFixedRate(() -> {
            cache.entrySet().removeIf(entry -> entry.getValue().isExpired());
        }, initialDelay, period, unit);
    }

    private static class CacheEntry<V> {
        private final V value;
        private final long expirationTime;

        public CacheEntry(V value, long expirationTime) {
            this.value = value;
            this.expirationTime = expirationTime;
        }

        public V getValue() {
            return value;
        }

        public boolean isExpired() {
            return System.currentTimeMillis() > expirationTime;
        }
    }
}

Cache Invalidation: The Hard Problem

The complexity of cache invalidation increases with distance from the source of truth. Let’s compare approaches:

Cache TypeInvalidation MechanismComplexityEffectiveness
In-MemoryDirect call/evictionLowImmediate
RedisExplicit DEL/EXPIRELow-MediumNear immediate
API GatewayPurge API callsMediumSeconds of delay
CDNCache tags/surrogate keysHighMinutes of delay
BrowserCannot force invalidateVery HighDays of delay

Strategies for Effective Cache Invalidation

  1. Time-Based Expiration

    • Set appropriate TTLs based on content volatility
    • Example: Product descriptions: 1 day; Product prices: 5 minutes
  2. Event-Based Invalidation

    • Trigger cache purges when underlying data changes
    • Example: When product price changes → purge product detail cache
  3. Versioned Caching

    • Embed version in cache key
    • Example: product:1234:v5 instead of just product:1234
  4. Stale-While-Revalidate Pattern

    • Continue serving stale content while fetching fresh data
    • Example: Cache-Control: max-age=60, stale-while-revalidate=600

Practical Implementation Architecture

Below is a reference architecture for implementing multi-level caching:

┌───────────────┐     ┌───────────────┐     ┌───────────────┐
│  CDN Cache    │◄────┤  API Gateway  │◄────┤  Load Balancer│
│ (Cloudflare)  │     │  (Kong/Envoy) │     │  (NGINX/HAP)  │
└───────┬───────┘     └───────┬───────┘     └───────┬───────┘
        │                     │                     │
        ▼                     ▼                     ▼
┌────────────────────────────────────────────────────────────┐
│                     Application Servers                     │
│                                                            │
│  ┌─────────────┐   ┌─────────────┐   ┌─────────────┐       │
│  │ Local Cache │   │Service Cache│   │Distributed  │       │
│  │ (In-Memory) │◄─►│  (Redis)    │◄─►│Cache (Redis)│       │
│  └─────────────┘   └─────────────┘   └─────────────┘       │
└────────────────────────────┬───────────────────────────────┘
                             │
                             ▼
                 ┌───────────────────────┐
                 │     Database Layer    │
                 │ (PostgreSQL/MongoDB)  │
                 └───────────────────────┘

Real-World Caching Decision Matrix

Here’s a decision framework for determining what to cache where:

Content TypeChange FrequencyPrivacyRecommended Cache LevelTTL
Static assets (JS/CSS)Release cyclePublicCDN1 year + versioned URL
ImagesRarelyPublicCDN1 month
Product catalogDailyPublicCDN + App1 hour CDN, 5 min App
Product pricesHourlyPublicApplication5 minutes
User dashboard dataPer interactionPrivateApplication2 minutes
Shopping cartPer interactionPrivateApplication1 minute
API auth tokensPer sessionPrivateApplicationSession length
System configDeploymentInternalApplicationUntil changed

Common Caching Pitfalls and Solutions

1. Premature Caching

Problem: Implementing complex caching before understanding performance bottlenecks.

Solution: Measure first. Implement instrumentation to identify actual bottlenecks before adding cache layers.

2. Cache Stampedes

Problem: When many requests simultaneously miss cache and hit your backend.

Solution: Implement request coalescing or the “thundering herd” pattern:

// Example in Go using singleflight
import "golang.org/x/sync/singleflight"

var group singleflight.Group

func GetData(key string) (interface{}, error) {
    // This ensures only one fetch happens for the same key concurrently
    data, err, _ := group.Do(key, func() (interface{}, error) {
        // Expensive operation like database query
        return fetchFromDatabase(key)
    })
    return data, err
}

3. Inconsistent Cache Keys

Problem: Different services using different cache keys for the same data.

Solution: Implement a centralized cache key generation library:

public class CacheKeyGenerator {
    public static String forProduct(long productId) {
        return String.format("product:%d:v1", productId);
    }
    
    public static String forUserProfile(String userId) {
        return String.format("user:profile:%s:v2", userId);
    }
    
    // Add methods for other entity types
}

4. Cache Poisoning

Problem: Invalid data gets cached and distributed.

Solution: Validate data before caching and implement TTLs as a safety mechanism.

Monitoring Your Cache Effectiveness

To ensure your caching strategy works well, monitor these key metrics:

  1. Cache Hit Ratio - Target >90% for optimal performance
  2. Cache Latency - How long cache retrievals take
  3. Origin Requests - Number of requests hitting your backend
  4. Stale Content Served - When outdated content is delivered

Prometheus Metrics Example:

// Java example with Micrometer
@Component
public class CacheMetrics {
    private final MeterRegistry registry;
    private final Counter cacheHits;
    private final Counter cacheMisses;
    private final Timer cacheLatency;

    public CacheMetrics(MeterRegistry registry) {
        this.registry = registry;
        this.cacheHits = registry.counter("cache.hits", "type", "user_profile");
        this.cacheMisses = registry.counter("cache.misses", "type", "user_profile");
        this.cacheLatency = registry.timer("cache.latency", "type", "user_profile");
    }

    public void recordCacheHit() {
        cacheHits.increment();
    }

    public void recordCacheMiss() {
        cacheMisses.increment();
    }

    public <T> T measureCacheLatency(Supplier<T> cacheOperation) {
        return cacheLatency.record(cacheOperation);
    }
}

Conclusion

Effective caching strategy requires thoughtful consideration of data characteristics, access patterns, and invalidation requirements. By applying the right caching solution at the right level, you can achieve significant performance improvements while maintaining data correctness.

Remember these guiding principles:

  1. Public, slow-changing content → CDN caching
  2. Private, dynamic, or user-specific content → Application caching
  3. Always have an invalidation strategy before implementing caching
  4. Monitor cache effectiveness continuously

What caching strategies have you implemented in your infrastructure? Have you encountered challenges with multi-level caching? Share your experiences in the comments below.