
SECTION 0 — WHY THIS CHAPTER EXISTS (Senior Fullstack Reality)

Senior fullstack engineers don’t just “build screens” or “ship endpoints.”

You build end-to-end systems where failures, latency, correctness bugs, and security issues cross boundaries:

  • UI state ↔ API contracts

  • API correctness ↔ data invariants

  • Storage ↔ performance and cost

  • Async pipelines ↔ reliability

  • Observability ↔ speed of recovery

This chapter gives you a reusable fullstack design loop and one deep case study you can pattern-match onto almost every product feature.


SECTION 1 — THE FULLSTACK DESIGN LOOP (Staff-ish Thinking, Senior Execution)

Use this loop before you write code:

1) Define the UX + user promises
2) Define invariants (what must never be violated)
3) Define contracts (UI↔API, service↔DB, async boundaries)
4) Design failure modes first
5) Allocate latency + reliability budgets
6) Make it observable
7) Choose rollout + rollback strategy

If you do this consistently, you stop shipping “features” and start shipping systems that survive production.


SECTION 2 — CASE STUDY: FILE UPLOAD PIPELINE (The Senior Fullstack Trapdoor)

File upload is deceptively simple.

It is one of the most common places teams accidentally ship:

  • security vulnerabilities

  • massive infra costs

  • poor UX (stuck uploads, missing progress, phantom success)

  • unreliable async processing (lost jobs, duplicate processing)

  • hard-to-debug failures

We will design:

UI Upload → Storage → Validation/Scan → Processing → Delivery


SECTION 3 — REQUIREMENTS (FUNCTIONAL + NON-FUNCTIONAL)

Functional requirements

  • Users can upload files from web UI.

  • Show upload progress and completion state.

  • Support retry on failure.

  • Uploaded files become available as an attachment on an entity (e.g., ticket, message, profile).

  • Optional: generate derived artifacts (thumbnails, previews, transcoding).

Non-functional requirements (this is where seniors win)

  • Security: prevent arbitrary upload abuse, malware, and unauthorized access.

  • Correctness: no “phantom uploaded” states; avoid duplicates on retry.

  • Scalability: handle spikes (batch uploads, large files).

  • Cost: avoid routing bytes through app servers.

  • Observability: explain failures quickly.

  • UX: consistent states (pending/uploading/processing/ready/failed).


SECTION 4 — KEY INVARIANTS (THE RULES YOU MUST PROTECT)

These invariants prevent the common disasters:

  1. An attachment is either not present, or points to a verified stored object.

  2. A file is never marked “ready” until validation/scan completes successfully.

  3. Uploads are idempotent from the UI perspective (retry doesn’t create multiple attachments).

  4. Authorization is enforced at access time, not “best effort” via obscurity.
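
Invariants 2 and 4 can be enforced at a single choke point on the read path. The following is a minimal sketch, not a definitive implementation: the `Upload` record and the `can_download` helper are hypothetical names, and the ownership check is a simplified stand-in for whatever permission model the entity actually uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Upload:
    # Hypothetical record shape, for illustration only.
    id: str
    owner_user_id: str
    state: str  # e.g. "SCANNING", "READY", "FAILED"


def can_download(upload: Upload, requester_id: str) -> bool:
    """Decide on every access whether this requester may fetch the file."""
    # Invariant 4: authorization is checked at access time.
    # Assumption: only the owner may read; real systems would check
    # permissions on the attached entity instead.
    if upload.owner_user_id != requester_id:
        return False
    # Invariant 2: only files that passed validation/scan are served.
    return upload.state == "READY"
```

Because the check runs per request, revoking access or failing a scan takes effect immediately, regardless of which URLs were handed out earlier.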


SECTION 5 — ARCHITECTURE (DON’T SEND BYTES THROUGH YOUR API)

High-level components

  • Frontend: upload UI + state machine

  • API: issues upload intents, tracks state, enforces auth

  • Object storage: S3/GCS/R2 (or equivalent)

  • Scanner/validator: malware scan + file type validation

  • Processor: thumbnails/transcode/etc.

  • CDN: serves finalized files

Golden rule

App servers should handle metadata, not file bytes.

(1) UI -> API: POST /uploads/intents
(2) API -> UI: uploadId + presignedUrl(s) + constraints
(3) UI -> Storage: PUT bytes (direct)
(4) Storage -> Event: object-created
(5) Worker: scan/validate -> update upload state
(6) Worker: process artifacts -> update state to READY
(7) UI polls or subscribes -> shows READY + download URL
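
Steps (2) and (3) rely on a URL that carries its own expiry and proof of authorization. The sketch below illustrates that idea with a toy HMAC scheme. It is not the signing scheme of S3, GCS, or R2 (use the provider's SDK for those), and `SECRET`, `presign_put`, `verify_put`, and the `storage.example.com` host are all assumptions made for illustration.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode, urlparse

SECRET = b"demo-secret"  # assumption: known only to the API and the storage edge


def _signature(storage_key: str, expires_at: int) -> str:
    message = f"PUT\n{storage_key}\n{expires_at}".encode()
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()


def presign_put(storage_key: str, now: int, ttl_seconds: int = 300) -> str:
    """Step 2: the API hands the UI a short-lived URL for a direct PUT."""
    expires_at = now + ttl_seconds
    query = urlencode({"expires": expires_at, "sig": _signature(storage_key, expires_at)})
    return f"https://storage.example.com/{storage_key}?{query}"


def verify_put(url: str, now: int) -> bool:
    """Step 3, storage side: accept the PUT only if the URL is intact and unexpired."""
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        expires_at = int(params["expires"][0])
        sig = params["sig"][0]
    except (KeyError, ValueError):
        return False
    storage_key = parsed.path.lstrip("/")
    expected = _signature(storage_key, expires_at)
    # Constant-time comparison avoids leaking signature bytes via timing.
    return now <= expires_at and hmac.compare_digest(sig.encode(), expected.encode())
```

The app server only signs a few hundred bytes of metadata; the file bytes travel straight from the browser to storage.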

SECTION 6 — DATA MODEL (MAKE STATES EXPLICIT)

A senior-friendly model is explicit and queryable:

  • uploads

    • id

    • owner_user_id

    • entity_type, entity_id (what this attaches to)

    • state: INITIATED | UPLOADING | UPLOADED | SCANNING | PROCESSING | READY | FAILED

    • content_type, size_bytes, checksum (optional, but it lets you verify integrity and detect duplicate content)

    • storage_key

    • failure_code, failure_message

    • created_at, updated_at

Why this matters:

  • UI can render truthfully.

  • Support + on-call can see what happened.

  • You can build dashboards and alerts off state transitions.
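
The model above can be sketched as a typed record. This is one possible shape, assuming Python dataclasses; the field names follow the list above, while the defaults and the `_now` helper are assumptions.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import Optional


class UploadState(str, Enum):
    INITIATED = "INITIATED"
    UPLOADING = "UPLOADING"
    UPLOADED = "UPLOADED"
    SCANNING = "SCANNING"
    PROCESSING = "PROCESSING"
    READY = "READY"
    FAILED = "FAILED"


def _now() -> datetime:
    return datetime.now(timezone.utc)


@dataclass
class Upload:
    id: str
    owner_user_id: str
    entity_type: str  # what this attaches to, e.g. "ticket"
    entity_id: str
    storage_key: str
    state: UploadState = UploadState.INITIATED
    content_type: Optional[str] = None
    size_bytes: Optional[int] = None
    checksum: Optional[str] = None
    failure_code: Optional[str] = None
    failure_message: Optional[str] = None
    created_at: datetime = field(default_factory=_now)
    updated_at: datetime = field(default_factory=_now)
```

Keeping `state` as a closed enum means an unknown value fails loudly at the boundary instead of silently rendering as a blank status in the UI.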


SECTION 7 — API CONTRACT (UI ↔ BACKEND)

Create upload intent

  • POST /uploads/intents

    • body: { entityType, entityId, filename, contentType, sizeBytes }

    • returns:

      • uploadId

      • presignedUrl (or multipart URLs)

      • constraints (max size, allowed types)

      • expiresAt

Report completion (optional)

Some teams add:

  • POST /uploads/{uploadId}/complete

This is useful when storage events are delayed/unavailable.

Query status

  • GET /uploads/{uploadId}

    • returns: { state, failureCode, downloadUrl? }

Contract rules:

  • State is authoritative.

  • Errors are typed (e.g., FILE_TOO_LARGE, UNSUPPORTED_TYPE, MALWARE_DETECTED, AUTH_DENIED).

  • READY implies safe to serve.
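
The contract rules can be sketched as two small functions: one that rejects a bad intent with a typed error, and one that only exposes a download URL once the state is READY. The limits, the allowed types, and the names `validate_intent` and `status_body` are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional

MAX_SIZE_BYTES = 25 * 1024 * 1024  # assumption: 25 MiB limit
ALLOWED_TYPES = {"image/png", "image/jpeg", "application/pdf"}  # assumption


@dataclass(frozen=True)
class IntentRequest:
    entity_type: str
    entity_id: str
    filename: str
    content_type: str
    size_bytes: int


def validate_intent(req: IntentRequest) -> Optional[str]:
    """Return a typed error code, or None when the intent may be issued."""
    if req.size_bytes > MAX_SIZE_BYTES:
        return "FILE_TOO_LARGE"
    if req.content_type not in ALLOWED_TYPES:
        return "UNSUPPORTED_TYPE"
    return None


def status_body(state: str, failure_code: Optional[str], download_url: str) -> Dict[str, Optional[str]]:
    """Build the GET /uploads/{uploadId} response body."""
    body: Dict[str, Optional[str]] = {"state": state, "failureCode": failure_code}
    # READY implies safe to serve, so the URL is withheld in every other state.
    if state == "READY":
        body["downloadUrl"] = download_url
    return body
```

The declared `contentType` is only a hint from the client; the real check happens after upload, against the stored bytes.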


SECTION 8 — FRONTEND STATE MACHINE (STOP LYING TO USERS)

A robust UI has explicit states:

IDLE
-> SELECTED
-> UPLOADING (progress)
-> UPLOADED (bytes done)
-> PROCESSING (server-side)
-> READY
any in-flight state -> FAILED (retry)

Senior-level UX behaviors:

  • Preserve progress on transient failures when possible.

  • Retry with the same uploadId (idempotency).

  • Show “uploaded, processing…” separately from “uploading…”

  • Handle navigation/reload: the UI should rehydrate from GET /uploads/{id}.
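
The states and behaviors above can be sketched as a transition table plus a rehydration mapping. The sketch is language-agnostic in spirit and written in Python for brevity. The allowed transitions and the choice to show an interrupted transfer as FAILED after a reload are assumptions, not the only reasonable design.

```python
from enum import Enum


class UiState(Enum):
    IDLE = "IDLE"
    SELECTED = "SELECTED"
    UPLOADING = "UPLOADING"
    UPLOADED = "UPLOADED"
    PROCESSING = "PROCESSING"
    READY = "READY"
    FAILED = "FAILED"


ALLOWED = {
    UiState.IDLE: {UiState.SELECTED},
    UiState.SELECTED: {UiState.UPLOADING, UiState.FAILED},
    UiState.UPLOADING: {UiState.UPLOADED, UiState.FAILED},
    UiState.UPLOADED: {UiState.PROCESSING, UiState.FAILED},
    UiState.PROCESSING: {UiState.READY, UiState.FAILED},
    UiState.READY: set(),
    UiState.FAILED: {UiState.UPLOADING},  # retry reuses the same uploadId
}


def transition(current: UiState, target: UiState) -> UiState:
    """Move to target, or raise if the UI would be skipping a step."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition {current.name} -> {target.name}")
    return target


# Server state (from GET /uploads/{id}) mapped to what the UI shows after a reload.
# Assumption: the in-browser transfer died with the page, so an unfinished
# upload is shown as FAILED with a retry option rather than as still uploading.
SERVER_TO_UI = {
    "INITIATED": UiState.FAILED,
    "UPLOADING": UiState.FAILED,
    "UPLOADED": UiState.UPLOADED,
    "SCANNING": UiState.PROCESSING,
    "PROCESSING": UiState.PROCESSING,
    "READY": UiState.READY,
    "FAILED": UiState.FAILED,
}


def rehydrate(server_state: str) -> UiState:
    return SERVER_TO_UI[server_state]
```

Rejecting illegal transitions in one place is what prevents "phantom success": no code path can jump from UPLOADING to READY without the server-side steps.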


SECTION 9 — FAILURE MODES (THE REAL INTERVIEW)

Map failure modes across boundaries:

UI / network failures

  • user closes tab mid-upload

  • flaky wifi

  • duplicate clicks

Mitigations:

  • resumable/multipart uploads for large files

  • idempotent intent creation (client-generated key)
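
Idempotent intent creation can be sketched as a lookup keyed by owner and a client-generated key, so a double click or a retried request returns the same uploadId. The `IntentStore` class is a hypothetical in-memory stand-in; in a real database the same effect would typically come from a unique index on the key.

```python
import uuid
from typing import Dict, Tuple


class IntentStore:
    """In-memory stand-in for a table with a unique (owner, idempotency_key) index."""

    def __init__(self) -> None:
        self._by_key: Dict[Tuple[str, str], str] = {}

    def create_intent(self, owner_user_id: str, idempotency_key: str) -> str:
        key = (owner_user_id, idempotency_key)
        # setdefault keeps the first uploadId and returns it on every repeat.
        return self._by_key.setdefault(key, str(uuid.uuid4()))
```

The client generates the key once per selected file (for example, a UUID created when the file is picked) and sends it with every attempt.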

Storage failures

  • presigned URL expired

  • partial multipart upload

Mitigations:

  • short expiry + refresh path

  • server-side cleanup of abandoned uploads
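
Server-side cleanup can be sketched as a periodic sweep that finds uploads stuck in an early state. The 24-hour threshold, the tuple shape, and the `find_abandoned` name are assumptions; the caller would then mark these rows FAILED and delete any partial objects. Some object stores also offer lifecycle rules that expire incomplete multipart uploads, which can complement a sweep like this.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable, List, Tuple

ABANDON_AFTER = timedelta(hours=24)  # assumption: tune to your URL expiry
IN_FLIGHT = {"INITIATED", "UPLOADING"}

# (upload_id, state, updated_at)
Row = Tuple[str, str, datetime]


def find_abandoned(rows: Iterable[Row], now: datetime) -> List[str]:
    """Return ids of uploads that never finished and have gone quiet."""
    return [
        upload_id
        for upload_id, state, updated_at in rows
        if state in IN_FLIGHT and now - updated_at > ABANDON_AFTER
    ]
```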

Async pipeline failures

  • event not delivered

  • worker crash mid-processing

  • duplicates (at-least-once)

Mitigations:

  • idempotent workers

  • state transitions with compare-and-swap semantics

  • DLQ + replay
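
Compare-and-swap transitions are what make a worker safe under at-least-once delivery. A minimal sketch using SQLite from the standard library: the update only applies if the row is still in the expected state, so a duplicate event becomes a no-op. The table layout and the `make_db` and `transition` helpers are assumptions for illustration.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory table with one upload waiting to be scanned."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE uploads (id TEXT PRIMARY KEY, state TEXT NOT NULL)")
    conn.execute("INSERT INTO uploads (id, state) VALUES ('u1', 'UPLOADED')")
    conn.commit()
    return conn


def transition(conn: sqlite3.Connection, upload_id: str, expected: str, new: str) -> bool:
    """Move the upload from expected to new; return False if someone got there first."""
    cur = conn.execute(
        "UPDATE uploads SET state = ? WHERE id = ? AND state = ?",
        (new, upload_id, expected),
    )
    conn.commit()
    # rowcount is 1 only when the row was still in the expected state.
    return cur.rowcount == 1
```

A worker that receives the same object-created event twice calls `transition(conn, "u1", "UPLOADED", "SCANNING")` both times; only the first call returns True, and only that worker proceeds to scan.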

Security failures

  • uploading executable disguised as image

  • public access to private files

  • SSRF via URL-based upload, where the server fetches a user-supplied URL (avoid)

Mitigations:

  • validate by magic bytes, not filename

  • serve via authenticated download endpoint or signed CDN URLs

  • scan before READY
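
Validation by magic bytes can be sketched as a prefix check on the first bytes of the stored object. The signature table below covers only a few well-known formats, and the function names are hypothetical. A matching signature is necessary but not sufficient, which is why the scan still runs before READY.

```python
from typing import Optional

# Leading byte signatures for a few common formats.
SIGNATURES = (
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"%PDF-", "application/pdf"),
)


def sniff_content_type(head: bytes) -> Optional[str]:
    """Guess the type from the file's first bytes, ignoring its name."""
    for magic, content_type in SIGNATURES:
        if head.startswith(magic):
            return content_type
    return None


def validate_upload(head: bytes, declared_type: str) -> Optional[str]:
    """Return a typed error code when the bytes do not match the declared type."""
    sniffed = sniff_content_type(head)
    if sniffed is None or sniffed != declared_type:
        return "UNSUPPORTED_TYPE"
    return None
```

An executable renamed to `photo.png` starts with `MZ`, not the PNG signature, so it is rejected no matter what the filename or the client-declared type says.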


SECTION 10 — OBSERVABILITY (MAKE IT DEBUGGABLE IN 5 MINUTES)

Minimum viable telemetry:

  • Metrics:

    • intent created count

    • upload READY count

    • failure count by failure_code

    • time-to-ready histogram

  • Logs:

    • include uploadId, entityId, storageKey

  • Traces:

    • intent creation

    • status queries

    • worker processing spans
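
The metrics above can be sketched with plain counters and a bucketed histogram. In production these would be emitted through whatever metrics library the team already uses; the `UploadMetrics` class and the bucket boundaries here are assumptions.

```python
import bisect
from collections import Counter
from typing import List

# Upper bounds in seconds; the final implicit bucket catches everything slower.
BUCKETS_SECONDS = [1, 5, 15, 60, 300]  # assumption: pick bounds from your latency budget


class UploadMetrics:
    def __init__(self) -> None:
        self.counts: Counter = Counter()
        self.failures: Counter = Counter()
        self.time_to_ready: List[int] = [0] * (len(BUCKETS_SECONDS) + 1)

    def intent_created(self) -> None:
        self.counts["intent_created"] += 1

    def ready(self, seconds_since_intent: float) -> None:
        self.counts["ready"] += 1
        # bisect_left finds the first bucket whose bound is >= the observation.
        index = bisect.bisect_left(BUCKETS_SECONDS, seconds_since_intent)
        self.time_to_ready[index] += 1

    def failed(self, failure_code: str) -> None:
        self.failures[failure_code] += 1
```

A growing gap between `intent_created` and `ready` plus `failures` is the signal for stuck uploads, which is the question on-call needs answered first.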

Senior rule:

If you can’t answer “why is this stuck?” quickly, you didn’t finish the feature.


SECTION 11 — ROLLOUT PLAN (SHIP SAFELY)

  • Feature flag the new upload flow.

  • Shadow-write upload records first (observe states).

  • Gradually increase percentage.

  • Keep the rollback path (the old flow) until error rates stay within budget.
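
Gradual percentage rollout is commonly done with stable hashing, so a given user stays on the same side of the flag as the percentage grows. A minimal sketch, assuming a hypothetical `in_rollout` helper and flag name; a feature-flag service would normally provide this.

```python
import hashlib


def in_rollout(user_id: str, percentage: int, flag: str = "new-upload-flow") -> bool:
    """Return True if this user falls inside the first `percentage` buckets."""
    digest = hashlib.sha256(f"{flag}:{user_id}".encode()).digest()
    # Map the user to a stable bucket in [0, 100).
    bucket = int.from_bytes(digest[:4], "big") % 100
    return bucket < percentage
```

Raising the percentage only adds users, it never removes anyone already enrolled. Setting it to 0 is the rollback switch.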


SECTION 12 — EXERCISES (PRACTICE LIKE A SENIOR)

  1. Draw the full flow and label all boundaries (UI/API/storage/worker/CDN).

  2. Define a complete error taxonomy for uploads.

  3. Write a one-page “failure mode map” and list mitigations.

  4. Propose a latency budget: time-to-first-progress, time-to-bytes-complete, time-to-ready.

  5. Decide: do you need multipart/resumable uploads? What size threshold triggers it?


🏁 END — FULLSTACK SYSTEM DESIGN CASE STUDY 1