WHY THIS CHAPTER EXISTS (Senior Fullstack Reality)

Senior fullstack engineers don’t just “build screens” or “ship endpoints.”

You build end-to-end systems where failures, latency, correctness bugs, and security issues cross boundaries:

  • UI state ↔ API contracts

  • API correctness ↔ data invariants

  • Storage ↔ performance and cost

  • Async pipelines ↔ reliability

  • Observability ↔ speed of recovery

This chapter gives you a reusable fullstack design loop and one deep case study you can pattern-match onto almost every product feature.


THE FULLSTACK DESIGN LOOP (Staff-ish Thinking, Senior Execution)

Use this loop before you write code:

1) Define the UX + user promises
2) Define invariants (what must never be violated)
3) Define contracts (UI↔API, service↔DB, async boundaries)
4) Design failure modes first
5) Allocate latency + reliability budgets
6) Make it observable
7) Choose rollout + rollback strategy

If you do this consistently, you stop shipping “features” and start shipping systems that survive production.


CASE STUDY: FILE UPLOAD PIPELINE (The Senior Fullstack Trapdoor)

File upload is deceptively simple.

It is one of the most common places teams accidentally ship:

  • security vulnerabilities

  • massive infra costs

  • poor UX (stuck uploads, missing progress, phantom success)

  • unreliable async processing (lost jobs, duplicate processing)

  • hard-to-debug failures

We will design:

UI Upload → Storage → Validation/Scan → Processing → Delivery


REQUIREMENTS (FUNCTIONAL + NON-FUNCTIONAL)

Functional requirements

  • Users can upload files from web UI.

  • Show upload progress and completion state.

  • Support retry on failure.

  • Uploaded files become available as an attachment on an entity (e.g., ticket, message, profile).

  • Optional: generate derived artifacts (thumbnails, previews, transcoding).

Non-functional requirements (this is where seniors win)

  • Security: prevent arbitrary upload abuse, malware, and unauthorized access.

  • Correctness: no “phantom uploaded” states; avoid duplicates on retry.

  • Scalability: handle spikes (batch uploads, large files).

  • Cost: avoid routing bytes through app servers.

  • Observability: explain failures quickly.

  • UX: consistent states (pending/uploading/processing/ready/failed).


KEY INVARIANTS (THE RULES YOU MUST PROTECT)

These invariants prevent the common disasters:

  1. An attachment is either not present, or points to a verified stored object.

  2. A file is never marked “ready” until validation/scan completes successfully.

  3. Uploads are idempotent from the UI perspective (retry doesn’t create multiple attachments).

  4. Authorization is enforced at access time, not delegated to unguessable URLs or other “best effort” obscurity.


ARCHITECTURE (DON’T SEND BYTES THROUGH YOUR API)

High-level components

  • Frontend: upload UI + state machine

  • API: issues upload intents, tracks state, enforces auth

  • Object storage: S3/GCS/R2 (or equivalent)

  • Scanner/validator: malware scan + file type validation

  • Processor: thumbnails/transcode/etc.

  • CDN: serves finalized files

Golden rule

App servers should handle metadata, not file bytes.


DATA MODEL (MAKE STATES EXPLICIT)

A senior-friendly model is explicit and queryable:

  • uploads

    • id

    • owner_user_id

    • entity_type, entity_id (what this attaches to)

    • state: INITIATED | UPLOADING | UPLOADED | SCANNING | PROCESSING | READY | FAILED

    • content_type, size_bytes, checksum (optional but powerful)

    • storage_key

    • failure_code, failure_message

    • created_at, updated_at

Why this matters:

  • UI can render truthfully.

  • Support + on-call can see what happened.

  • You can build dashboards and alerts off state transitions.
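
The record above can be sketched as a typed structure. This is a minimal illustration, not a prescribed schema: the field names follow the list above, while the defaults, the timestamp helper, and the is_servable method are assumptions added for the example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import Optional


class UploadState(str, Enum):
    INITIATED = "INITIATED"
    UPLOADING = "UPLOADING"
    UPLOADED = "UPLOADED"
    SCANNING = "SCANNING"
    PROCESSING = "PROCESSING"
    READY = "READY"
    FAILED = "FAILED"


def _now() -> datetime:
    # Timezone-aware timestamps avoid ambiguity in dashboards and alerts.
    return datetime.now(timezone.utc)


@dataclass
class Upload:
    id: str
    owner_user_id: str
    entity_type: str
    entity_id: str
    storage_key: str
    state: UploadState = UploadState.INITIATED
    content_type: Optional[str] = None
    size_bytes: Optional[int] = None
    checksum: Optional[str] = None
    failure_code: Optional[str] = None
    failure_message: Optional[str] = None
    created_at: datetime = field(default_factory=_now)
    updated_at: datetime = field(default_factory=_now)

    def is_servable(self) -> bool:
        # Hypothetical helper: only READY uploads may be served.
        return self.state is UploadState.READY
```

Keeping the state as a closed enum (rather than free-form strings) is what makes "count uploads stuck in SCANNING" a one-line query.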


API CONTRACT (UI ↔ BACKEND)

Create upload intent

  • POST /uploads/intents

    • body: { entityType, entityId, filename, contentType, sizeBytes }

    • returns:

      • uploadId

      • presignedUrl (or multipart URLs)

      • constraints (max size, allowed types)

      • expiresAt
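
A minimal sketch of the intent handler, under stated assumptions: the size limit, type allowlist, and URL lifetime are invented for illustration, and sign_url is a hypothetical callable standing in for whatever presigning call your storage SDK provides.

```python
import uuid
from datetime import datetime, timedelta, timezone
from typing import Callable, Optional

MAX_SIZE_BYTES = 25 * 1024 * 1024                                # assumed limit
ALLOWED_TYPES = {"image/png", "image/jpeg", "application/pdf"}   # assumed allowlist
URL_TTL = timedelta(minutes=15)                                  # assumed expiry


class IntentError(Exception):
    """Carries a typed error code the UI can branch on."""

    def __init__(self, code: str):
        super().__init__(code)
        self.code = code


def create_upload_intent(
    body: dict,
    sign_url: Callable[[str, timedelta], str],  # hypothetical presigner
    now: Optional[datetime] = None,
) -> dict:
    now = now or datetime.now(timezone.utc)
    # Reject early, before any bytes move.
    if body["sizeBytes"] > MAX_SIZE_BYTES:
        raise IntentError("FILE_TOO_LARGE")
    if body["contentType"] not in ALLOWED_TYPES:
        raise IntentError("UNSUPPORTED_TYPE")

    upload_id = str(uuid.uuid4())
    # The server chooses the key; the client-supplied filename never
    # becomes part of the storage path.
    storage_key = f"uploads/{upload_id}"
    return {
        "uploadId": upload_id,
        "presignedUrl": sign_url(storage_key, URL_TTL),
        "constraints": {
            "maxSizeBytes": MAX_SIZE_BYTES,
            "allowedTypes": sorted(ALLOWED_TYPES),
        },
        "expiresAt": (now + URL_TTL).isoformat(),
    }
```

The declared size and type are only claims at this point; they still need to be re-checked against the stored object before the upload can become READY.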

Report completion (optional)

Some teams add:

  • POST /uploads/{uploadId}/complete

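This is useful when storage events are delayed or unavailable. Treat the call as a hint, not proof: the server should still confirm the object exists in storage before advancing the state.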

Query status

  • GET /uploads/{uploadId}

    • returns: { state, failureCode, downloadUrl? }

Contract rules:

  • State is authoritative.

  • Errors are typed (e.g., FILE_TOO_LARGE, UNSUPPORTED_TYPE, MALWARE_DETECTED, AUTH_DENIED).

  • READY implies safe to serve.
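
One way to make these rules hard to break is to build the status response in a single function. The sketch below is an assumption about shape, not a fixed API: it exposes downloadUrl only in READY and failureCode only in FAILED, using the error codes listed above.

```python
from typing import Optional

# Typed failure codes from the contract rules above.
FAILURE_CODES = {"FILE_TOO_LARGE", "UNSUPPORTED_TYPE", "MALWARE_DETECTED", "AUTH_DENIED"}


def status_response(
    state: str,
    failure_code: Optional[str] = None,
    download_url: Optional[str] = None,
) -> dict:
    body = {"state": state, "failureCode": None}
    if state == "FAILED":
        # Refuse to emit free-form error strings.
        if failure_code not in FAILURE_CODES:
            raise ValueError(f"untyped failure code: {failure_code!r}")
        body["failureCode"] = failure_code
    elif state == "READY":
        if not download_url:
            raise ValueError("READY upload must have a download URL")
        body["downloadUrl"] = download_url
    # In every other state the URL is withheld, even if one was passed in.
    return body
```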


FRONTEND STATE MACHINE (STOP LYING TO USERS)

A robust UI has explicit states: pending → uploading → processing → ready, with failed reachable from any in-flight state.

Senior-level UX behaviors:

  • Preserve progress on transient failures when possible.

  • Retry with the same uploadId (idempotency).

  • Show “uploaded, processing…” separately from “uploading…”

  • Handle navigation/reload: the UI should rehydrate from GET /uploads/{id}.
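
The behaviors above reduce to a small transition table. This sketch is written in Python only to keep it compact; the same shape ports directly to a frontend reducer. The event names and the server-to-UI mapping are assumptions made for illustration.

```python
# (current_state, event) -> next_state. Event names are hypothetical.
TRANSITIONS = {
    ("pending", "START"): "uploading",
    ("uploading", "BYTES_DONE"): "processing",
    ("uploading", "ERROR"): "failed",
    ("processing", "SERVER_READY"): "ready",
    ("processing", "ERROR"): "failed",
    ("failed", "RETRY"): "uploading",  # retry reuses the same uploadId
}

# Assumed mapping from backend states to what the UI shows.
SERVER_TO_UI = {
    "INITIATED": "pending",
    "UPLOADING": "uploading",
    "UPLOADED": "processing",
    "SCANNING": "processing",
    "PROCESSING": "processing",
    "READY": "ready",
    "FAILED": "failed",
}


def next_state(state: str, event: str) -> str:
    # Unknown (state, event) pairs are ignored, so a duplicate click or a
    # late event cannot move the UI into an impossible state.
    return TRANSITIONS.get((state, event), state)


def rehydrate(server_state: str) -> str:
    # After a reload, the UI state is derived from GET /uploads/{id}.
    return SERVER_TO_UI[server_state]
```

One gap this sketch leaves open: a record still in UPLOADING after a reload has no live transfer behind it, so a real UI would need to prompt for a resume or retry rather than show a progress bar.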


FAILURE MODES (THE REAL INTERVIEW)

Map failure modes across boundaries:

UI / network failures

  • user closes tab mid-upload

  • flaky wifi

  • duplicate clicks

Mitigations:

  • resumable/multipart uploads for large files

  • idempotent intent creation (client-generated key)
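
Idempotent intent creation can be sketched as a lookup keyed by the owner plus a client-generated key. The in-memory dict below is a stand-in; in a real database the same guarantee would typically come from a unique constraint on that pair. The class and method names are hypothetical.

```python
import uuid
from typing import Dict, Tuple


class IntentStore:
    """In-memory stand-in for a table with a unique (owner, key) constraint."""

    def __init__(self) -> None:
        self._by_key: Dict[Tuple[str, str], str] = {}

    def create_or_get(self, owner_user_id: str, idempotency_key: str) -> Tuple[str, bool]:
        """Return (upload_id, created). A repeated key returns the first upload."""
        k = (owner_user_id, idempotency_key)
        if k in self._by_key:
            return self._by_key[k], False
        upload_id = str(uuid.uuid4())
        self._by_key[k] = upload_id
        return upload_id, True
```

The client generates the key once per file selection and reuses it on every retry, so a double click or a retried request resolves to the same uploadId instead of a second attachment.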

Storage failures

  • presigned URL expired

  • partial multipart upload

Mitigations:

  • short expiry + a refresh path (re-issue a URL for the same uploadId)

  • server-side cleanup of abandoned uploads
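
Server-side cleanup starts with finding the stale records. A minimal sketch, assuming a 24-hour cutoff and uploads represented as dicts with id, state, and updated_at; both the threshold and the record shape are illustrative choices.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable, List

ABANDON_AFTER = timedelta(hours=24)      # assumed threshold
IN_FLIGHT = {"INITIATED", "UPLOADING"}   # states that can be abandoned


def find_abandoned(uploads: Iterable[dict], now: datetime) -> List[str]:
    """Return ids of uploads that never finished and have gone quiet."""
    cutoff = now - ABANDON_AFTER
    return [
        u["id"]
        for u in uploads
        if u["state"] in IN_FLIGHT and u["updated_at"] < cutoff
    ]
```

A periodic job would mark these FAILED and delete any partial objects. Some object stores, S3 among them, can also abort incomplete multipart uploads through lifecycle rules, which covers the storage side without custom code.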

Async pipeline failures

  • event not delivered

  • worker crash mid-processing

  • duplicates (at-least-once)

Mitigations:

  • idempotent workers

  • state transitions with compare-and-swap semantics

  • DLQ + replay
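
Compare-and-swap on the state column is what makes duplicate deliveries harmless. The sketch below uses SQLite from the standard library purely for illustration; the two-column table is a simplified assumption, and the same conditional UPDATE works in other SQL databases.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    # Simplified stand-in for the uploads table.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE uploads (id TEXT PRIMARY KEY, state TEXT NOT NULL)")
    return conn


def transition(conn: sqlite3.Connection, upload_id: str, expected: str, new: str) -> bool:
    """Move an upload from `expected` to `new`; return False if someone else got there first."""
    cur = conn.execute(
        "UPDATE uploads SET state = ? WHERE id = ? AND state = ?",
        (new, upload_id, expected),
    )
    conn.commit()
    # Exactly one row changes only when the row was still in `expected`.
    return cur.rowcount == 1
```

A worker that receives the same event twice calls transition(conn, id, "UPLOADED", "SCANNING") both times; only the first call returns True, and the second delivery can be dropped without side effects.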

Security failures

  • uploading executable disguised as image

  • public access to private files

  • SSRF via “upload from URL” features, where the server fetches a user-supplied URL (avoid)

Mitigations:

  • validate by magic bytes, not filename

  • serve via authenticated download endpoint or signed CDN URLs

  • scan before READY
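
Validating by magic bytes means reading the first bytes of the stored object and comparing them with known signatures. A minimal sketch with a small, assumed allowlist; the function names are hypothetical, and the failure code reuses UNSUPPORTED_TYPE from the API contract.

```python
from typing import Optional

# Leading-byte signatures for a small, assumed allowlist of formats.
MAGIC = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"%PDF-", "application/pdf"),
]


def sniff_type(head: bytes) -> Optional[str]:
    """Guess the content type from the first bytes of the file."""
    for signature, mime in MAGIC:
        if head.startswith(signature):
            return mime
    return None


def validate(head: bytes, declared_type: str) -> Optional[str]:
    """Return a failure code, or None when the bytes match the declared type."""
    actual = sniff_type(head)
    if actual is None or actual != declared_type:
        return "UNSUPPORTED_TYPE"
    return None
```

This catches a renamed executable, but a matching header does not prove the rest of the file is benign, which is why the malware scan still has to pass before READY.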


OBSERVABILITY (MAKE IT DEBUGGABLE IN 5 MINUTES)

Minimum viable telemetry:

  • Metrics:

    • intent created count

    • upload READY count

    • failure count by failure_code

    • time-to-ready histogram

  • Logs:

    • include uploadId, entityId, storageKey

  • Traces:

    • intent creation

    • status queries

    • worker processing spans
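
The metrics and log fields above can all be emitted from one place: the state transition. The sketch below uses in-memory counters as a stand-in for a real metrics client; the class name, counter names, and log shape are assumptions.

```python
import json
from collections import Counter
from typing import List, Optional


class UploadTelemetry:
    """In-memory stand-in for a metrics client plus a structured logger."""

    def __init__(self) -> None:
        self.counters: Counter = Counter()
        self.time_to_ready_s: List[float] = []  # raw samples for a histogram

    def record_transition(
        self,
        upload_id: str,
        entity_id: str,
        storage_key: str,
        state: str,
        failure_code: Optional[str] = None,
        seconds_since_intent: Optional[float] = None,
    ) -> str:
        if state == "INITIATED":
            self.counters["intent_created"] += 1
        elif state == "READY":
            self.counters["ready"] += 1
            if seconds_since_intent is not None:
                self.time_to_ready_s.append(seconds_since_intent)
        elif state == "FAILED":
            self.counters[f"failed.{failure_code or 'UNKNOWN'}"] += 1
        # One structured line per transition, always carrying the join keys.
        return json.dumps(
            {
                "event": "upload_transition",
                "uploadId": upload_id,
                "entityId": entity_id,
                "storageKey": storage_key,
                "state": state,
                "failureCode": failure_code,
            },
            sort_keys=True,
        )
```

With every transition logged under the same uploadId, "why is this stuck?" becomes a search for the last line with that id.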

Senior rule:

If you can’t answer “why is this stuck?” quickly, you didn’t finish the feature.


ROLLOUT PLAN (SHIP SAFELY)

  • Feature flag the new upload flow.

  • Shadow-write upload records first (observe states).

  • Gradually increase percentage.

  • Keep rollback path (old flow) until error budgets are stable.
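
Gradual percentage rollout is usually done by hashing a stable identifier into a bucket. A minimal sketch, assuming the user id is the rollout unit; the flag name is hypothetical.

```python
import hashlib


def in_rollout(user_id: str, percent: int, flag: str = "new-upload-flow") -> bool:
    """Deterministically place a user in one of 100 buckets for this flag."""
    digest = hashlib.sha256(f"{flag}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    # Users in buckets below the threshold get the new flow.
    return bucket < percent
```

Because the bucket depends only on the flag and the user id, a user who is in at 10% stays in at 50%, and dropping the percentage back to 0 is the rollback.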


EXERCISES (PRACTICE LIKE A SENIOR)

  1. Draw the full flow and label all boundaries (UI/API/storage/worker/CDN).

  2. Define a complete error taxonomy for uploads.

  3. Write a one-page “failure mode map” and list mitigations.

  4. Propose a latency budget: time-to-first-progress, time-to-bytes-complete, time-to-ready.

  5. Decide: do you need multipart/resumable uploads? What size threshold triggers it?


🏁 END — FULLSTACK SYSTEM DESIGN CASE STUDY 1