Skip to main content

SECTION 1 — THE PRIME DIRECTIVE OF DATA DESIGN

“Data outlives code.”

This is one of the first rules you learn at companies like Google and Stripe.

  • Code can be rewritten.

  • Services can be replaced.

  • Frameworks go out of style.

  • APIs change.

But data structures are permanent.

A bad schema haunts a company for years.

This is why Staff Engineers focus on data modeling FIRST, not API design or microservices.


SECTION 2 — SQL vs NoSQL: THINK IN TERMS OF DATA SHAPE

Most engineers choose databases based on trends or "what the company uses."

Top engineers choose based on data shape.


Use SQL when data is:

  • relational

  • structured

  • transactional

  • highly consistent

  • joins are natural

  • strong constraints exist (foreign keys, unique keys, checks)

Examples:

  • fintech

  • orders/payments

  • healthcare

  • authentication

  • booking flows


Use NoSQL when data is:

  • semi-structured

  • nested

  • document-oriented

  • variable in shape

  • primarily key-value access

  • high-volume reads

  • low-latency

Examples:

  • session storage

  • activity logs

  • user preferences

  • chat messages

  • product catalog

  • caching layer


RULE OF STAFF ENGINEERS:

If relationships drive your logic → SQL is almost always better.
If documents drive your logic → NoSQL is often better.


SECTION 3 — DATA MODELING MENTAL MODELS

Here are the models used by architects at Stripe, Shopify, Airbnb.


Model 1 — Entities & Relationships

Every system has:

  • Entities (User, Order, Message, Pet, CallSession)

  • Relationships (1–1, 1–many, many–many)

Understanding these defines your tables/collections.


Model 2 — Access Patterns First

This is critical:

Design for how the data will be READ and WRITTEN.
Not how you think the data “should look.”

Examples:

  • If you often need user + profile + settings → denormalize into one table or add a view.

  • If you frequently need order + items → model it together and index foreign keys.

Access patterns determine:

  • indexing

  • denormalization

  • table layout

  • caching strategy


Model 3 — State Machines → Data Models

Modeling flows as state machines gives you:

state: ENUM
updated_at
transition_reason
previous_state

This makes debugging and analytics easy.

Examples:

  • Payment: pending → authorized → captured → refunded

  • Enrollment: draft → submitted → review → approved → completed

  • Video Call: idle → ringing → connected → ended

State-first modeling removes ambiguity.


Model 4 — Invariants

The most important part of data modeling.

Examples:

  • A user cannot have two active carts.

  • A payment cannot be modified after capture.

  • A pet must belong to exactly one owner.

  • A call session must always have a unique channel ID.

Invariants → constraints → schema design.


Model 5 — Data Ownership

Identify which service owns which data.

Example:

  • Auth owns User Credentials

  • Profile service owns display info

  • Billing owns balance + payments

This prevents:

  • duplication

  • inconsistency

  • cross-service entanglement


SECTION 4 — INDEXING MASTERY

This is where Senior → Staff transitions happen.

Indexing can turn a 5s query → 5ms.

Understanding indexing is Staff Engineer territory.


Index Type 1 — Single Column Index

Example:

INDEX (email)

Use for:

  • unique lookup

  • high cardinality fields

  • login queries


Index Type 2 — Composite Index

Order matters.

INDEX (user_id, created_at)

Query must match prefix ordering.

Perfect for:

  • filtering + sorting

  • pagination

  • dashboard queries


Index Type 3 — Covering Index

Index contains all needed fields → query never touches the table.

Huge performance boost.


Index Type 4 — Partial Index

Example:

WHERE status = 'active'

Used for:

  • hot-path queries

  • sparse data

  • filtering heavy workloads


Index Type 5 — Unique Index

Used to enforce invariants.

Example:

UNIQUE (user_id, date)

Meaning:

  • a user can have only one record per date.

Schema enforces business logic.

This is Staff-level thinking.


SECTION 5 — SHARDING (THE REAL VERSION)

Most people misunderstand sharding.

Sharding solves write scaling, NOT read scaling.

Reads scale via:

  • caching

  • replicas

  • materialized views

Sharding is for:

  • massive write throughput

  • extremely large datasets

  • distributing ownership

  • reducing hot rows


Shard Key Rules

A good shard key:

  • evenly distributes load

  • minimizes cross-shard queries

  • never requires rebalancing

Examples:

  • user_id (great for social apps)

  • order_id (perfect for e-commerce)

  • hashed values (even distribution)

  • geo-based keys (location apps)

A BAD shard key:

  • timestamp

  • random UUID

  • boolean flags

Staff Engineers ALWAYS choose shard keys based on access patterns.


SECTION 6 — CONSISTENCY MODELS (PRACTICAL VERSION)

Forget abstract theory — here’s how consistency works in real systems.


1. Strong Consistency

Guarantees:

  • reads reflect latest writes

  • no stale data

Used for:

  • payments

  • authentication

  • state transitions

Examples:

  • PostgreSQL

  • MySQL

  • Spanner


2. Eventual Consistency

Guarantees:

  • system converges to correctness

  • temporary stale reads allowed

Used for:

  • feeds

  • analytics

  • notifications

  • search indexing

Examples:

  • DynamoDB

  • Cassandra

  • ElasticSearch


3. Causal Consistency

Reads follow the order of writes as seen by the client.

Used for:

  • chat apps

  • collaborative tools

  • messaging


SECTION 7 — DENORMALIZATION (THE STAFF ENGINEER APPROACH)

Average engineers say:

“Normalize everything.”

Staff engineers say:

“Normalize for correctness, denormalize for performance.”

Rules of safe denormalization:

  • Only denormalize immutable facts

  • Store computed fields (e.g., total_amount)

  • Use triggers/jobs to keep data consistent

  • Denormalize when read patterns demand it

Examples:

  • storing user_name inside order table

  • storing monthly summary inside analytics tables

  • storing last_message inside conversation table


SECTION 8 — EVENT SOURCING (DEEP EXPLANATION)

Event sourcing stores changes, not current state.

Example event stream:

Created
ItemAdded
ItemAdded
ItemRemoved
CheckedOut
Paid

Replaying events → reconstruct current state.

Benefits:

  • perfect audit log

  • time-travel debugging

  • strong consistency

  • ideal for financial systems

Costs:

  • complexity

  • replay overhead

  • strong schema discipline required

Not every system needs event sourcing — but knowing it is Staff-level knowledge.


SECTION 9 — CAP THEOREM (REAL WORLD VERSION)

CAP says you must choose 2 out of 3 during a network partition:

  • Consistency

  • Availability

  • Partition Tolerance

BUT in the real world:

  • Distributed systems must always tolerate partitions

  • So the real choice is:

    Consistency vs Availability

CP systems:

  • payments

  • banking

  • auth

  • state machines

AP systems:

  • messaging

  • notifications

  • search

  • feeds

Knowing the correct category is a Staff-level skill.