
CASE STUDY — Part III: Designing a Rate Limiter (Edge + Service + DB)

SCENARIO

Your public API is being abused.

spikes cause DB saturation
attackers rotate IPs
legitimate users get degraded

You need a rate limiter that is:

fair
hard to bypass
observable
cheap to operate

CONSTRAINTS

multiple regions
retries are at-least-once, so the same request can arrive more than once
traffic is bursty

DESIGN (MULTI-LAYER)

Use layers:

Edge/CDN/WAF: coarse IP-based rules (cheap)
API Gateway: per API key / per user
Service-level: per tenant, per route, per expensive operation

Rule:

One limiter is never enough. You need the right limiter at the right boundary.
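
As a rough model of the layering above, the sketch below runs a request through an ordered list of checks and reports the first layer that rejects it. In a real deployment each layer runs in a different place (edge, gateway, service); collapsing them into one function is only for illustration. The names `Request`, `check_layers`, and the layer labels are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class Request:
    ip: str
    api_key: str
    tenant_id: str
    route: str


# Each layer is a (name, predicate) pair; the predicate returns True to allow.
Layer = Tuple[str, Callable[[Request], bool]]


def check_layers(request: Request, layers: List[Layer]) -> Optional[str]:
    """Return the name of the first layer that rejects, or None if all allow."""
    for name, allow in layers:
        if not allow(request):
            return name
    return None


# Illustrative wiring: coarse and cheap checks first, fine-grained ones last.
blocked_ips = {"203.0.113.9"}
throttled_keys = {"key-abusive"}
expensive_routes_over_budget = {("tenant-a", "POST /reports")}

layers: List[Layer] = [
    ("edge", lambda r: r.ip not in blocked_ips),
    ("gateway", lambda r: r.api_key not in throttled_keys),
    ("service", lambda r: (r.tenant_id, r.route) not in expensive_routes_over_budget),
]
```

Ordering the cheap checks first means abusive traffic is dropped before it reaches the more expensive per-tenant logic.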

ALGORITHM CHOICE

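token bucket for bursts: allows a short burst up to the bucket capacity while enforcing an average refill rate
sliding window for fairness: caps requests over any trailing interval, avoiding the boundary spikes of fixed windows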
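
To make the two algorithms concrete, here is a minimal in-process sketch of each. It is single-threaded, and the caller passes in the current time so the behavior is easy to test. The class names and parameters are illustrative, not taken from any particular library.

```python
from collections import deque
from typing import Deque


class TokenBucket:
    """Allows bursts up to `capacity`, refilled at `refill_per_sec` tokens per second."""

    def __init__(self, capacity: float, refill_per_sec: float, now: float = 0.0):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.tokens = capacity  # start full so an initial burst is allowed
        self.last = now

    def allow(self, now: float, cost: float = 1.0) -> bool:
        # Add the tokens earned since the last call, capped at capacity.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


class SlidingWindowLog:
    """Allows at most `limit` requests in any trailing `window_sec` interval."""

    def __init__(self, limit: int, window_sec: float):
        self.limit = limit
        self.window_sec = window_sec
        self.hits: Deque[float] = deque()

    def allow(self, now: float) -> bool:
        # Drop timestamps that have slid out of the window.
        cutoff = now - self.window_sec
        while self.hits and self.hits[0] <= cutoff:
            self.hits.popleft()
        if len(self.hits) < self.limit:
            self.hits.append(now)
            return True
        return False
```

The sliding-window log stores one timestamp per accepted request, so it costs more memory than a token bucket; that trade-off is one reason to reserve it for scopes where fairness matters most.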

Implementation detail:

centralized store (Redis) for counters/tokens
key schema includes: tenantId + route + time bucket
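
One way to read the key schema above is as a fixed-window counter keyed by tenant, route, and time bucket. The sketch below uses a plain dict as a stand-in for the central store; with Redis, the increment would typically be an `INCR` on the key plus an expiry so old buckets disappear. The `rl:` prefix and the helper names are assumptions made for illustration.

```python
from typing import Dict


def rate_limit_key(tenant_id: str, route: str, now: float, window_sec: int) -> str:
    """Build a counter key from tenantId + route + time bucket."""
    bucket = int(now // window_sec)  # all timestamps in one window share a bucket
    return f"rl:{tenant_id}:{route}:{bucket}"


class FixedWindowCounter:
    """Counts requests per key; the dict stands in for a shared store."""

    def __init__(self, limit: int, window_sec: int):
        self.limit = limit
        self.window_sec = window_sec
        self.store: Dict[str, int] = {}  # a real store would expire old keys

    def allow(self, tenant_id: str, route: str, now: float) -> bool:
        key = rate_limit_key(tenant_id, route, now, self.window_sec)
        count = self.store.get(key, 0) + 1  # in Redis: an atomic increment
        self.store[key] = count
        return count <= self.limit
```

Because the bucket is part of the key, a new window starts with a fresh counter and no explicit reset is needed.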

FAILURE MODES

Redis down → fail open or fail closed?
- for auth endpoints: fail closed
- for low-risk reads: fail open with fallback limits (for example, a conservative in-process limiter)
stampede on shared (hot) keys → shard the keys and cache decisions locally
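
The failure policy above can be sketched as a small wrapper around the central check. `StoreUnavailable`, `decide`, `sharded_key`, and `per_shard_limit` are hypothetical names; the sharding helper assumes the total limit is split evenly across shards, which is an approximation when traffic does not spread evenly.

```python
import random
from typing import Callable


class StoreUnavailable(Exception):
    """Raised by the (hypothetical) central check when the store cannot be reached."""


def decide(check_central: Callable[[], bool],
           check_local: Callable[[], bool],
           fail_closed: bool) -> bool:
    """Return True to allow the request."""
    try:
        return check_central()
    except StoreUnavailable:
        if fail_closed:
            return False        # e.g. auth endpoints: reject when unsure
        return check_local()    # e.g. low-risk reads: fall back to local limits


def sharded_key(base_key: str, shards: int,
                rng: Callable[[int], int] = random.randrange) -> str:
    """Spread writes for a hot key across `shards` sub-keys."""
    return f"{base_key}:s{rng(shards)}"


def per_shard_limit(total_limit: int, shards: int) -> int:
    """Each shard enforces an even share of the total limit (at least 1)."""
    return max(1, total_limit // shards)
```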

CONTRACT

Return:

HTTP 429 (Too Many Requests) with the typed error code RATE_LIMITED
a Retry-After header telling the client how long to wait before retrying
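
A minimal sketch of the response contract, independent of any web framework. The JSON body shape (`error.code`, `error.message`, `error.retryAfterSeconds`) is an assumption; only the 429 status, the RATE_LIMITED code, and the Retry-After header come from the contract above.

```python
import json
import math
from typing import Dict, Tuple


def rate_limited_response(retry_after_sec: float) -> Tuple[int, Dict[str, str], str]:
    """Build (status, headers, body) for a rejected request."""
    # Retry-After in its delay form is a whole number of seconds; round up
    # so clients never retry too early.
    seconds = max(1, math.ceil(retry_after_sec))
    headers = {
        "Content-Type": "application/json",
        "Retry-After": str(seconds),
    }
    body = json.dumps({
        "error": {
            "code": "RATE_LIMITED",
            "message": "Too many requests. Please retry later.",
            "retryAfterSeconds": seconds,  # hypothetical convenience field
        }
    })
    return 429, headers, body
```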

UI behavior:

show a clear message
back off and retry
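
On the client side, one common approach is exponential backoff with jitter that never retries sooner than the server's Retry-After value. The sketch below only computes delays; the defaults (`base`, `cap`) are illustrative, and `parse_retry_after` handles only the delay-in-seconds form of the header, not the HTTP-date form.

```python
import random
from typing import Callable, Optional


def parse_retry_after(value: Optional[str]) -> Optional[float]:
    """Parse the delay-seconds form of Retry-After; return None otherwise."""
    if value is not None and value.strip().isdigit():
        return float(value.strip())
    return None


def next_delay(attempt: int,
               retry_after: Optional[float] = None,
               base: float = 0.5,
               cap: float = 30.0,
               rng: Callable[[], float] = random.random) -> float:
    """Seconds to wait before retry number `attempt` (0-based)."""
    backoff = min(cap, base * (2 ** attempt))  # exponential growth, capped
    jittered = rng() * backoff                 # jitter spreads out retries
    if retry_after is not None:
        return max(retry_after, jittered)      # respect the server's hint
    return jittered
```

Jitter matters here because many clients limited at the same moment would otherwise all retry at the same moment.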

OBSERVABILITY

rate-limited counts by key dimension (tenant, route)
top offenders
false positives (surfaced through support signals such as tickets and complaints)
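
The first two signals can be sketched with an in-memory counter keyed by (tenant, route). In production these counts would usually be emitted to a metrics system rather than held in process; `LimiterMetrics` and its methods are hypothetical names.

```python
from collections import Counter
from typing import List, Tuple


class LimiterMetrics:
    """Tracks rate-limited request counts by tenant and route."""

    def __init__(self) -> None:
        self.rejections: Counter = Counter()

    def record_rejection(self, tenant_id: str, route: str) -> None:
        self.rejections[(tenant_id, route)] += 1

    def count(self, tenant_id: str, route: str) -> int:
        return self.rejections[(tenant_id, route)]

    def top_offenders(self, n: int) -> List[Tuple[str, int]]:
        """Tenants with the most rejections, summed across routes."""
        by_tenant: Counter = Counter()
        for (tenant_id, _route), rejected in self.rejections.items():
            by_tenant[tenant_id] += rejected
        return by_tenant.most_common(n)
```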

EXERCISE

Design your limiter policy table:

endpoint
limit
scope (user/tenant/ip)
fail open/closed
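
If it helps to start from a concrete shape, the policy table can be expressed as data with the four columns above. Every endpoint and number below is a placeholder to be replaced with your own; only the fail-closed choice for auth and the fail-open choice for low-risk reads follow the guidance in FAILURE MODES.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Scope(Enum):
    USER = "user"
    TENANT = "tenant"
    IP = "ip"


class OnStoreFailure(Enum):
    OPEN = "open"
    CLOSED = "closed"


@dataclass(frozen=True)
class Policy:
    endpoint: str
    limit_per_minute: int
    scope: Scope
    on_store_failure: OnStoreFailure


# Placeholder rows: illustrative values only.
POLICIES: List[Policy] = [
    Policy("POST /login", 10, Scope.IP, OnStoreFailure.CLOSED),
    Policy("GET /items", 600, Scope.USER, OnStoreFailure.OPEN),
    Policy("POST /reports", 30, Scope.TENANT, OnStoreFailure.CLOSED),
]


def policy_for(endpoint: str) -> Optional[Policy]:
    """Look up the policy row for an endpoint, or None if it has no entry."""
    for policy in POLICIES:
        if policy.endpoint == endpoint:
            return policy
    return None
```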

🏁 END — PART III CASE STUDY