SECTION 1 — WHAT SYSTEM DESIGN REALLY MEANS

Most people think system design is:

  • drawing boxes and arrows

  • adding caches

  • picking databases

  • scaling APIs

Wrong.

System design is the art of understanding and controlling complexity across time.

Real system design is:

  • Modeling reality

  • Understanding constraints

  • Designing boundaries

  • Managing failure

  • Predicting load

  • Handling scale

  • Ensuring correctness

  • Maintaining evolution paths

  • Enforcing invariants

  • Reducing entropy

System design = the language of Staff & Principal Engineers.


SECTION 2 — THE 4 ELEMENTS OF SYSTEM DESIGN

Every system, regardless of size, can be understood through these four elements:


Element 1 — Data

  • What data exists?

  • How is it stored?

  • What is the structure?

  • What are the relationships?

  • What are the integrity constraints?

  • How does data change over time?


Element 2 — Flow

  • How does data move?

  • Which services call each other?

  • What are the request/response patterns?

  • What is synchronous vs asynchronous?


Element 3 — State

  • What states does the system have?

  • Where is state stored?

  • Who owns the state?

  • How is state mutated?

  • What state transitions exist?


Element 4 — Boundaries

  • What belongs to frontend vs backend?

  • What belongs in the domain layer?

  • What belongs in a service?

  • What is internal vs external API?

  • What must remain invariant?


When you master these four elements, you can design ANY system.


SECTION 3 — SYSTEM DESIGN MENTAL MODELS

Here are the mental models top engineers use before they draw a single box.


Model 1 — The Request Lifecycle

A typical web request passes through most of these stages (not every system has all of them):

Client → Network → CDN → Load Balancer → Reverse Proxy/API Gateway
→ Authentication Layer → Routing → Service → Cache
→ Database → Downstream Services → Response

A top engineer visualizes:

  • latency sources

  • failure points

  • concurrency risks

  • throughput limits

This gives you architectural intuition.
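
As a rough illustration of what "visualizing" the chain can mean in numbers, the sketch below adds up per-hop latency and multiplies per-hop availability for a serial request path. The hop names follow the chain above; every latency and availability figure is a made-up assumption, not a measurement.

```python
import math

# (hop name, latency in ms, availability) -- all numbers are hypothetical.
HOPS = [
    ("cdn", 5, 0.9999),
    ("load_balancer", 1, 0.9999),
    ("api_gateway", 3, 0.9995),
    ("service", 20, 0.999),
    ("database", 15, 0.999),
]


def summarize(hops):
    """Return (total latency, end-to-end availability, slowest hop) for a serial chain."""
    total_latency_ms = sum(latency for _, latency, _ in hops)
    # In a serial chain every hop must succeed, so availabilities multiply.
    availability = math.prod(avail for _, _, avail in hops)
    slowest_hop = max(hops, key=lambda hop: hop[1])[0]
    return total_latency_ms, availability, slowest_hop


total_latency_ms, availability, slowest_hop = summarize(HOPS)
```

The point of the arithmetic is that the end-to-end availability of a serial chain is always lower than that of its weakest hop, which is why each added hop deserves a reason.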


Model 2 — The “Hot Path” Identification

Before optimizing ANYTHING, answer:

“What are the hot paths of this system?”

Hot paths = the high-traffic, latency-sensitive flows.

Examples:

  • login

  • payment processing

  • video call signaling

  • real-time messaging

  • inventory reservation

Hot paths get:

  • caching

  • load balancing

  • replication

  • precomputation

  • careful data modeling

Non-hot paths get simpler designs.

Large platforms such as Stripe, Uber, and Shopify are often cited as examples of this kind of prioritization.


Model 3 — The “Bounded Context” Principle (From DDD)

This is the MOST important boundary model.

A Bounded Context is:

  • a domain with clear responsibilities

  • its own consistent language

  • its own data ownership

  • isolated business rules

Example (simplified for clarity):

Payments       |   User Profiles       |   Notifications
---------------------------------------------------------
Owns charges   |   Owns profile data   |   Owns messages

Never mix rules across contexts — this is what makes systems rot.


Model 4 — The “Read vs Write Path Split”

Reads and writes have VERY different constraints.

Read path:

  • optimized for speed

  • caching

  • denormalization

  • replicas

Write path:

  • correctness

  • ordering

  • validation

  • idempotency

  • transactional integrity

Top engineers ALWAYS design reads and writes separately.
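
A minimal sketch of the split, assuming a toy in-memory store: the write path validates and deduplicates by idempotency key, while the read path is a single lookup against a denormalized view. The class and method names (`PaymentStore`, `charge`, `total_charged`) are hypothetical, and a real system would use a transactional database rather than dictionaries.

```python
class PaymentStore:
    """Toy in-memory store separating a careful write path from a fast read path."""

    def __init__(self) -> None:
        self._charges = {}   # idempotency key -> amount already applied
        self._totals = {}    # account -> precomputed total (denormalized read view)

    def charge(self, idempotency_key: str, account: str, amount_cents: int) -> int:
        """Write path: validate, deduplicate, then apply the charge once."""
        if amount_cents <= 0:
            raise ValueError("amount_cents must be positive")
        if idempotency_key in self._charges:
            # A retried request returns the original result without charging again.
            return self._charges[idempotency_key]
        self._charges[idempotency_key] = amount_cents
        # Keep the read view up to date as part of the same write.
        self._totals[account] = self._totals.get(account, 0) + amount_cents
        return amount_cents

    def total_charged(self, account: str) -> int:
        """Read path: one lookup, no aggregation at request time."""
        return self._totals.get(account, 0)
```

Calling `charge` twice with the same key leaves the total unchanged, which is the property that makes client retries safe.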


Model 5 — The “State Machine First” Model

Almost any system can be modeled as a state machine:

Example: Video call

idle → calling → ringing → connecting → in_call → ended

Example: Onboarding flow

start → input → validate → review → sign → completed

By modeling your system as a state machine, you eliminate:

  • ambiguity

  • impossible transitions

  • inconsistent states

  • unexpected behavior

This is EXACTLY why your Digital Enrollment, Veterinary Call Flow, and SDK flows become clean when modeled this way.
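
The video-call example can be sketched as an explicit transition table. The linear chain comes from the example above; the extra edges that jump to `ended` before the call connects (a hang-up or timeout) are an assumption added for illustration, and the names `TRANSITIONS`, `CallSession`, and `InvalidTransition` are hypothetical.

```python
# Allowed transitions for the video-call example.
# The early "ended" edges are an assumption, not part of the original chain.
TRANSITIONS = {
    "idle": {"calling"},
    "calling": {"ringing", "ended"},
    "ringing": {"connecting", "ended"},
    "connecting": {"in_call", "ended"},
    "in_call": {"ended"},
    "ended": set(),
}


class InvalidTransition(Exception):
    """Raised when a requested transition is not in the table."""


class CallSession:
    def __init__(self) -> None:
        self.state = "idle"

    def transition(self, target: str) -> None:
        """Move to target only if the table allows it; otherwise leave state untouched."""
        if target not in TRANSITIONS.get(self.state, set()):
            raise InvalidTransition(f"{self.state} -> {target} is not allowed")
        self.state = target
```

Because every legal move is listed in one place, an impossible jump such as `idle` to `in_call` fails loudly instead of producing an inconsistent state.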


Model 6 — The Failure-Propagation Model

Failures cascade like this:

DB slow → queue backlog → API latency → retries → traffic spike → meltdown

Top engineers preemptively ask:

  • What happens if the DB slows down 10x?

  • What happens if the cache is cold?

  • What if an external API fails?

  • What if the network partitions?

This is the foundation of fault-tolerant design.
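
The "retries → traffic spike" step in the cascade is often softened by bounding the number of retries and spacing them out. The sketch below shows one common approach, capped exponential backoff with full jitter; the function name and default values are assumptions for illustration, and it is one mitigation among several (timeouts, circuit breakers, and load shedding are others).

```python
import random
import time


def call_with_retries(operation, max_attempts=4, base=0.1, cap=2.0,
                      sleep=time.sleep, rng=None):
    """Call operation(); on failure, retry with capped exponential backoff and jitter."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    rng = rng or random.Random()
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # retry budget exhausted: surface the failure to the caller
            # The delay ceiling doubles each attempt but never exceeds `cap`.
            ceiling = min(cap, base * (2 ** attempt))
            # Full jitter: a random delay below the ceiling avoids synchronized retries.
            sleep(rng.uniform(0, ceiling))
```

The `sleep` and `rng` parameters are injected so the behavior can be tested without waiting on a real clock.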


Model 7 — The “Scalability Triggers” Model

Systems need to scale when they hit these breaking points:

1. Too many reads → Add caching

2. Too many writes → Shard or queue

3. Too much compute → Add workers

4. Data too large → Partition or archive

5. Too many dependencies → Introduce pub/sub

6. Hot key → Spread the key across shards (for example, by salting or hashing it)

7. Global scale → Add CDNs + geo-replication

Knowing these triggers makes scaling predictable.
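
As one possible reading of trigger 6, a hot key can be split into several salted sub-keys so that writes land on different shards and reads sum the parts. This is a sketch under that assumption; `SaltedCounter`, `shard_for`, and the `key#n` naming scheme are hypothetical, and the shards are plain dictionaries standing in for real storage nodes.

```python
import hashlib
import random


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index with a stable hash (not Python's per-process hash())."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class SaltedCounter:
    """Counter that spreads each logical key over `fanout` salted sub-keys."""

    def __init__(self, num_shards: int, fanout: int, rng=None) -> None:
        self.num_shards = num_shards
        self.fanout = fanout
        self.rng = rng or random.Random()
        self.shards = [{} for _ in range(num_shards)]

    def increment(self, key: str) -> None:
        # Write path: pick one sub-key at random so writes spread out.
        salted = f"{key}#{self.rng.randrange(self.fanout)}"
        shard = self.shards[shard_for(salted, self.num_shards)]
        shard[salted] = shard.get(salted, 0) + 1

    def read(self, key: str) -> int:
        # Read path: gather every sub-key and sum the partial counts.
        total = 0
        for i in range(self.fanout):
            salted = f"{key}#{i}"
            total += self.shards[shard_for(salted, self.num_shards)].get(salted, 0)
        return total
```

The tradeoff is visible in the code: writes get cheaper to spread, while each read now touches `fanout` sub-keys.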


SECTION 4 — TYPES OF SYSTEM DESIGN

There are four categories of system design you must master.


Type 1 — High-Level System Architecture

This includes:

  • microservices

  • monoliths

  • event-driven systems

  • streaming architectures

  • CQRS

  • serverless flows

You learn how components interconnect.


Type 2 — API & Domain Design

This includes:

  • REST

  • GraphQL

  • gRPC

  • event contracts

  • request/response shapes

  • naming conventions

  • pagination

  • versioning

  • breaking change rules

This affects all system boundaries.
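
To make the pagination and response-shape items concrete, here is a small sketch of cursor-based pagination over an in-memory list. It assumes items are sorted by a unique, increasing `id`; the function name `paginate` and the `data` / `next_cursor` field names are illustrative choices, not a standard.

```python
def paginate(items, limit, cursor=None):
    """Return one page of items plus the cursor for the next page.

    `items` must be sorted by a unique ascending "id".
    `cursor` is the id of the last item the client has already seen.
    """
    if limit < 1:
        raise ValueError("limit must be at least 1")
    start = 0
    if cursor is not None:
        # Skip everything up to and including the cursor id.
        start = next(
            (i for i, item in enumerate(items) if item["id"] > cursor),
            len(items),
        )
    page = items[start:start + limit]
    has_more = start + limit < len(items)
    next_cursor = page[-1]["id"] if page and has_more else None
    return {"data": page, "next_cursor": next_cursor}
```

A cursor based on the last seen id stays stable when new rows are inserted, which is one reason it is often preferred over offset-based paging for frequently changing data.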


Type 3 — Distributed Systems

This includes:

  • consensus

  • replication

  • sharding

  • leader election

  • partition tolerance

  • retries & backoff

  • idempotency

  • eventual consistency

Distributed systems thinking is Staff-level thinking.


Type 4 — Reliability & Observability

This includes:

  • SLOs

  • SLIs

  • Error budgets

  • Logging

  • Metrics

  • Traces

  • Alert tuning

  • On-call readiness

Observability = the difference between a stable system and a chaotic one.
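
The relationship between SLOs, SLIs, and error budgets is simple arithmetic, sketched below for a request-based availability SLO. The function name and the returned field names are hypothetical; the formula itself (allowed failures = (1 − target) × total requests) is the usual definition.

```python
def error_budget(slo_target: float, total_requests: int, failed_requests: int) -> dict:
    """Compute the SLI and remaining error budget for a request-based SLO."""
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be between 0 and 1, e.g. 0.999")
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    sli = 1 - failed_requests / total_requests           # measured success ratio
    allowed_failures = (1 - slo_target) * total_requests  # the error budget
    return {
        "sli": sli,
        "allowed_failures": allowed_failures,
        "remaining_failures": allowed_failures - failed_requests,
        "budget_consumed": failed_requests / allowed_failures,
        "slo_met": sli >= slo_target,
    }
```

With a 99.9% target and 1,000,000 requests, roughly 1,000 failures are allowed; 250 failures would consume about a quarter of the budget.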


SECTION 5 — THE 6 UNIVERSAL SYSTEM COMPONENTS

Every large system is composed of these building blocks:


Component 1 — API Layer

Handles:

  • routing

  • authentication

  • authorization

  • validation

  • rate limits
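
Of the responsibilities above, rate limiting is the easiest to show in a few lines. The sketch below is a token bucket, one common rate-limiting algorithm; the class name and parameters are assumptions for illustration, and a production limiter would typically keep one bucket per client in shared storage.

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity`, refilling at `rate_per_sec` tokens per second."""

    def __init__(self, rate_per_sec: float, capacity: int, clock=time.monotonic) -> None:
        self.rate = rate_per_sec
        self.capacity = capacity
        self.clock = clock              # injectable so tests do not need to sleep
        self.tokens = float(capacity)   # start with a full bucket
        self.last = clock()

    def allow(self) -> bool:
        """Return True and spend one token if the request may proceed."""
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A rejected request would normally map to an HTTP 429 response at this layer, before any business logic runs.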


Component 2 — Business Logic Layer

Handles:

  • rules

  • workflows

  • orchestration

  • invariants


Component 3 — Storage Layer

Options:

  • SQL

  • NoSQL

  • KV stores

  • caches

  • blob storage

Design here must handle:

  • consistency

  • indexing

  • query patterns


Component 4 — Caching Layer

Used for:

  • performance

  • load reduction

  • response acceleration

Caches include:

  • CDN

  • Redis

  • Memcached

  • in-memory
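
For the in-memory case, a cache-aside lookup with a time-to-live can be sketched as follows. The `TTLCache` name and the `loader` callback are hypothetical; the same pattern applies when the dictionary is replaced by a Redis or Memcached client.

```python
import time


class TTLCache:
    """Cache-aside lookup: serve from memory, fall back to `loader` on miss or expiry."""

    def __init__(self, loader, ttl_seconds: float, clock=time.monotonic) -> None:
        self._loader = loader        # function that fetches from the source of truth
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so tests can control time
        self._entries = {}           # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]          # fresh hit: no load on the backing store
        value = self._loader(key)    # miss or expired: go to the backing store
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key) -> None:
        """Drop a key after a write so the next read reloads it."""
        self._entries.pop(key, None)
```

The TTL bounds how stale a value can get, and `invalidate` covers the case where a write should be visible immediately.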


Component 5 — Messaging Layer

Queues & streams:

  • Kafka

  • RabbitMQ

  • SQS

  • Pub/Sub

Used for:

  • decoupling

  • reliability

  • retries

  • async processing
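
The retry behavior a queue provides can be sketched with an in-process `deque`: failed messages are re-queued up to a limit and then moved to a dead-letter list. This is a toy model under stated assumptions, not how Kafka, RabbitMQ, or SQS are implemented; the `drain` function and the `(message, attempts)` tuple shape are hypothetical.

```python
from collections import deque


def drain(queue, handler, max_attempts=3):
    """Process (message, attempts) pairs until the queue is empty.

    Returns the messages that failed `max_attempts` times (the dead letters).
    """
    dead_letters = []
    while queue:
        message, attempts = queue.popleft()
        try:
            handler(message)
        except Exception:
            if attempts + 1 >= max_attempts:
                dead_letters.append(message)            # give up and park it for inspection
            else:
                queue.append((message, attempts + 1))   # retry later, behind other work
    return dead_letters
```

Because a message may be delivered more than once in this model, the handler should be idempotent.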


Component 6 — Compute Layer

Workers or serverless:

  • job processing

  • data pipelines

  • background tasks


Understanding these deeply allows you to build any architecture.


SECTION 6 — BUILDING ARCHITECTURAL INTUITION

This is where you develop true architectural taste — the thing that makes companies treat you as Staff level.


Taste Builder 1 — Choosing the Right Complexity

Every feature has:

  • a simple solution

  • a scalable solution

  • an over-engineered solution

Top engineers ALWAYS choose:

  • the simplest solution that can scale

  • not the “fancy” one


Taste Builder 2 — Designing for Predictability

A predictable system is:

  • easy to debug

  • easy to reason about

  • easy to evolve

  • cheaper to maintain

Predictability > flexibility.


Taste Builder 3 — Making the Right Tradeoffs

Architecture is balancing:

  • consistency

  • performance

  • cost

  • latency

  • reliability

There is no perfect design, only a design optimized for the correct constraints.

This is Staff-level decision-making.