SECTION 0 — WHY THIS CHAPTER EXISTS (Senior Fullstack Reality)
Senior fullstack engineers don’t just “build screens” or “ship endpoints.”
You build end-to-end systems where failures, latency, correctness bugs, and security issues cross boundaries:
- UI state ↔ API contracts
- API correctness ↔ data invariants
- Storage ↔ performance and cost
- Async pipelines ↔ reliability
- Observability ↔ speed of recovery
This chapter gives you a reusable fullstack design loop and one deep case study you can pattern-match onto almost every product feature.
SECTION 1 — THE FULLSTACK DESIGN LOOP (Staff-ish Thinking, Senior Execution)
Use this loop before you write code:
1) Define the UX + user promises
2) Define invariants (what must never be violated)
3) Define contracts (UI↔API, service↔DB, async boundaries)
4) Design failure modes first
5) Allocate latency + reliability budgets
6) Make it observable
7) Choose rollout + rollback strategy
If you do this consistently, you stop shipping “features” and start shipping systems that survive production.
SECTION 2 — CASE STUDY: FILE UPLOAD PIPELINE (The Senior Fullstack Trapdoor)
File upload is deceptively simple.
It is one of the most common places teams accidentally ship:
- security vulnerabilities
- massive infra costs
- poor UX (stuck uploads, missing progress, phantom success)
- unreliable async processing (lost jobs, duplicate processing)
- hard-to-debug failures
We will design:
UI Upload → Storage → Validation/Scan → Processing → Delivery
SECTION 3 — REQUIREMENTS (FUNCTIONAL + NON-FUNCTIONAL)
Functional requirements
- Users can upload files from web UI.
- Show upload progress and completion state.
- Support retry on failure.
- Uploaded files become available as an attachment on an entity (e.g., ticket, message, profile).
- Optional: generate derived artifacts (thumbnails, previews, transcoding).
Non-functional requirements (this is where seniors win)
- Security: prevent arbitrary upload abuse, malware, and unauthorized access.
- Correctness: no “phantom uploaded” states; avoid duplicates on retry.
- Scalability: handle spikes (batch uploads, large files).
- Cost: avoid routing bytes through app servers.
- Observability: explain failures quickly.
- UX: consistent states (pending/uploading/processing/ready/failed).
SECTION 4 — KEY INVARIANTS (THE RULES YOU MUST PROTECT)
These invariants prevent the common disasters:
- An attachment is either not present, or points to a verified stored object.
- A file is never marked “ready” until validation/scan completes successfully.
- Uploads are idempotent from the UI perspective (retry doesn’t create multiple attachments).
- Authorization is enforced at access time, not “best effort” via obscurity.
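One way to honor the last invariant is to issue short-lived signed download tokens only after an authorization check, and to verify them on every access. The sketch below is a generic HMAC scheme built on the Python standard library, not any particular CDN's signing format. `SECRET`, `sign_download`, and `verify_download` are hypothetical names, and it assumes storage keys and user ids never contain the `|` separator.

```python
import hashlib
import hmac
import time
from typing import Optional

# Assumption: in a real system this comes from a secret store and is rotated.
SECRET = b"demo-secret-rotate-me"


def sign_download(storage_key: str, user_id: str, expires_at: int) -> str:
    """Return a hex signature binding the object, the user, and the expiry."""
    message = f"{storage_key}|{user_id}|{expires_at}".encode()
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()


def verify_download(storage_key: str, user_id: str, expires_at: int,
                    signature: str, now: Optional[int] = None) -> bool:
    """Reject expired or tampered tokens; compare signatures in constant time."""
    now = int(time.time()) if now is None else now
    if now >= expires_at:
        return False
    expected = sign_download(storage_key, user_id, expires_at)
    return hmac.compare_digest(expected, signature)
```

Because the user id is part of the signed message, a leaked link does not verify for a different user, and the expiry bounds how long any leak matters.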
SECTION 5 — ARCHITECTURE (DON’T SEND BYTES THROUGH YOUR API)
High-level components
- Frontend: upload UI + state machine
- API: issues upload intents, tracks state, enforces auth
- Object storage: S3/GCS/R2 (or equivalent)
- Scanner/validator: malware scan + file type validation
- Processor: thumbnails/transcode/etc.
- CDN: serves finalized files
Golden rule
App servers should handle metadata, not file bytes.
Recommended flow (presigned upload)
(1) UI -> API: POST /uploads/intents
(2) API -> UI: uploadId + presignedUrl(s) + constraints
(3) UI -> Storage: PUT bytes (direct)
(4) Storage -> Event: object-created
(5) Worker: scan/validate -> update upload state
(6) Worker: process artifacts -> update state to READY
(7) UI polls or subscribes -> shows READY + download URL
SECTION 6 — DATA MODEL (MAKE STATES EXPLICIT)
A senior-friendly model is explicit and queryable:
- uploads
  - id
  - owner_user_id
  - entity_type, entity_id (what this attaches to)
  - state: INITIATED | UPLOADING | UPLOADED | SCANNING | PROCESSING | READY | FAILED
  - content_type, size_bytes, checksum (optional but powerful)
  - storage_key
  - failure_code, failure_message
  - created_at, updated_at
Why this matters:
- UI can render truthfully.
- Support + on-call can see what happened.
- You can build dashboards and alerts off state transitions.
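The model above can be sketched as a typed record. This is a minimal illustration, not a schema recommendation: the field names follow the list above, while the `Optional` defaults and the `is_servable` helper are assumptions added for the example.

```python
import time
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class UploadState(str, Enum):
    INITIATED = "INITIATED"
    UPLOADING = "UPLOADING"
    UPLOADED = "UPLOADED"
    SCANNING = "SCANNING"
    PROCESSING = "PROCESSING"
    READY = "READY"
    FAILED = "FAILED"


@dataclass
class Upload:
    id: str
    owner_user_id: str
    entity_type: str
    entity_id: str
    state: UploadState = UploadState.INITIATED
    content_type: Optional[str] = None
    size_bytes: Optional[int] = None
    checksum: Optional[str] = None
    storage_key: Optional[str] = None
    failure_code: Optional[str] = None
    failure_message: Optional[str] = None
    created_at: float = field(default_factory=time.time)
    updated_at: float = field(default_factory=time.time)


def is_servable(upload: Upload) -> bool:
    """Hypothetical helper: only READY uploads with a stored object may be served."""
    return upload.state is UploadState.READY and upload.storage_key is not None
```

Keeping the state as an enum rather than a set of booleans means "ready but also failed" cannot be represented at all.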
SECTION 7 — API CONTRACT (UI ↔ BACKEND)
Create upload intent
- POST /uploads/intents
  - body: { entityType, entityId, filename, contentType, sizeBytes }
  - returns:
    - uploadId
    - presignedUrl (or multipart URLs)
    - constraints (max size, allowed types)
    - expiresAt
Report completion (optional)
Some teams add:
POST /uploads/{uploadId}/complete
This is useful when storage events are delayed/unavailable.
Query status
GET /uploads/{uploadId} → { state, failureCode, downloadUrl? }
Contract rules:
- State is authoritative.
- Errors are typed (e.g., FILE_TOO_LARGE, UNSUPPORTED_TYPE, MALWARE_DETECTED, AUTH_DENIED).
- READY implies safe to serve.
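As a rough illustration of typed errors at intent creation, the sketch below checks a request against the constraints before any presigned URL is issued. The size limit, the allowed-type set, and the `caller_can_attach` flag are assumptions for the example. The declared content type comes from the client, so this check only filters obvious mistakes early; the scan remains the real gate.

```python
from dataclasses import dataclass
from typing import Optional

# Assumptions for illustration only; real limits are product decisions.
MAX_SIZE_BYTES = 25 * 1024 * 1024
ALLOWED_TYPES = {"image/png", "image/jpeg", "application/pdf"}


@dataclass(frozen=True)
class IntentRequest:
    entity_type: str
    entity_id: str
    filename: str
    content_type: str
    size_bytes: int


def validate_intent(req: IntentRequest, caller_can_attach: bool) -> Optional[str]:
    """Return a typed error code, or None when an intent may be issued."""
    if not caller_can_attach:
        return "AUTH_DENIED"
    if req.size_bytes > MAX_SIZE_BYTES:
        return "FILE_TOO_LARGE"
    if req.content_type not in ALLOWED_TYPES:
        return "UNSUPPORTED_TYPE"
    return None
```

Returning a code instead of a free-text message lets the UI map each failure to a specific, actionable state.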
SECTION 8 — FRONTEND STATE MACHINE (STOP LYING TO USERS)
A robust UI has explicit states:
IDLE
-> SELECTED
-> UPLOADING (progress)
-> UPLOADED (bytes done)
-> PROCESSING (server-side)
-> READY
-> FAILED (retry)
Senior-level UX behaviors:
- Preserve progress on transient failures when possible.
- Retry with the same uploadId (idempotency).
- Show “uploaded, processing…” separately from “uploading…”.
- Handle navigation/reload: the UI should rehydrate from GET /uploads/{id}.
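The states above can be captured as an explicit transition table so that illegal jumps (for example IDLE straight to READY) fail loudly. The sketch is written in Python for brevity; the same table ports directly to whatever language the UI uses. The cancel edge (SELECTED back to IDLE) and the retry edge (FAILED back to UPLOADING) are assumptions about how the diagram is meant to be read.

```python
# Allowed UI transitions. Edges marked "assumption" are not spelled out in the
# diagram above and are included only to make retry and cancel expressible.
ALLOWED_TRANSITIONS = {
    "IDLE": {"SELECTED"},
    "SELECTED": {"UPLOADING", "IDLE"},   # assumption: user may cancel a selection
    "UPLOADING": {"UPLOADED", "FAILED"},
    "UPLOADED": {"PROCESSING", "FAILED"},
    "PROCESSING": {"READY", "FAILED"},
    "READY": set(),
    "FAILED": {"UPLOADING"},             # assumption: retry restarts the upload
}


def next_state(current: str, target: str) -> str:
    """Return the target state, or raise if the transition is not allowed."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"illegal transition {current} -> {target}")
    return target
```

Routing every state change through one function gives a single place to log transitions and to catch the "phantom success" bug where the UI shows READY before the server does.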
SECTION 9 — FAILURE MODES (THE REAL INTERVIEW)
Map failure modes across boundaries:
UI / network failures
- user closes tab mid-upload
- flaky wifi
- duplicate clicks
Mitigations:
- resumable/multipart uploads for large files
- idempotent intent creation (client-generated key)
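Idempotent intent creation can be sketched as a lookup keyed by the caller and a client-generated key, so duplicate clicks and retries resolve to the same upload. `IntentStore` is a hypothetical in-memory stand-in; in a real database the same guarantee would typically come from a unique constraint on the (owner, key) pair rather than from a check-then-insert.

```python
import uuid
from typing import Dict, Tuple


class IntentStore:
    """In-memory stand-in for the uploads table, for illustration only."""

    def __init__(self) -> None:
        self._by_key: Dict[Tuple[str, str], str] = {}

    def create_intent(self, user_id: str, idempotency_key: str) -> str:
        """Return the existing uploadId for this key, or mint a new one."""
        key = (user_id, idempotency_key)
        if key not in self._by_key:
            self._by_key[key] = str(uuid.uuid4())
        return self._by_key[key]
```

The client generates the key once per user action (for example when the file is selected) and reuses it on every retry of that action.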
Storage failures
- presigned URL expired
- partial multipart upload
Mitigations:
- short expiry + refresh path
- server-side cleanup of abandoned uploads
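Server-side cleanup usually starts with finding uploads that never finished. A minimal sketch, assuming upload records are available as dictionaries with `id`, `state`, and `updated_at` fields and that a periodic job calls this function; the one-hour threshold in the usage note is an arbitrary example. Some object stores can also expire incomplete multipart uploads through lifecycle rules, which complements this kind of sweep.

```python
from typing import Dict, Iterable, List

# States in which an upload is still waiting on the client to send bytes.
STALE_STATES = {"INITIATED", "UPLOADING"}


def find_abandoned(uploads: Iterable[Dict], now: float,
                   max_age_seconds: float) -> List[str]:
    """Return ids of uploads stuck in a client-side state for too long."""
    return [
        u["id"]
        for u in uploads
        if u["state"] in STALE_STATES and now - u["updated_at"] > max_age_seconds
    ]
```

A sweep job might call `find_abandoned(rows, time.time(), 3600)`, mark each result FAILED with a specific failure code, and delete any partial object.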
Async pipeline failures
- event not delivered
- worker crash mid-processing
- duplicates (at-least-once)
Mitigations:
- idempotent workers
- state transitions with compare-and-swap semantics
- DLQ + replay
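Compare-and-swap on the state column is what makes duplicate events harmless: the update succeeds only if the row is still in the state the worker expected. The sketch below uses the standard-library `sqlite3` module against an in-memory table; the table layout and the helper names `make_demo_db` and `try_transition` are assumptions for illustration.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory uploads table with one row in UPLOADED state."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE uploads (id TEXT PRIMARY KEY, state TEXT NOT NULL)")
    conn.execute("INSERT INTO uploads (id, state) VALUES (?, ?)", ("u1", "UPLOADED"))
    conn.commit()
    return conn


def try_transition(conn: sqlite3.Connection, upload_id: str,
                   expected: str, new: str) -> bool:
    """Move the upload to `new` only if it is still in `expected`."""
    cur = conn.execute(
        "UPDATE uploads SET state = ? WHERE id = ? AND state = ?",
        (new, upload_id, expected),
    )
    conn.commit()
    # Exactly one row changes when this caller wins; zero when it lost the race.
    return cur.rowcount == 1
```

A worker that receives the same object-created event twice calls `try_transition(conn, "u1", "UPLOADED", "SCANNING")` both times; only the first call returns True, so only one scan starts.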
Security failures
- uploading executable disguised as image
- public access to private files
- SSRF-like patterns via URL-based upload (avoid)
Mitigations:
- validate by magic bytes, not filename
- serve via authenticated download endpoint or signed CDN URLs
- scan before READY
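Validating by magic bytes means reading the first bytes of the stored object and comparing them to known signatures, ignoring the filename and the client-declared type. The sketch below covers only a handful of well-known formats; the function names are hypothetical, and a real validator would use a maintained detection library with a much larger table. A matching prefix is necessary but not sufficient, which is why the malware scan still runs before READY.

```python
from typing import Optional

# Well-known file signatures (prefix bytes) and the type they indicate.
MAGIC_PREFIXES = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"%PDF-", "application/pdf"),
]


def sniff_content_type(head: bytes) -> Optional[str]:
    """Return the detected type for the leading bytes, or None if unknown."""
    for prefix, content_type in MAGIC_PREFIXES:
        if head.startswith(prefix):
            return content_type
    return None


def matches_declared_type(head: bytes, declared: str) -> bool:
    """True only when the sniffed type equals what the client declared."""
    return sniff_content_type(head) == declared
```

An executable renamed to photo.png starts with `MZ`, not the PNG signature, so `matches_declared_type` returns False and the upload can be failed with UNSUPPORTED_TYPE.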
SECTION 10 — OBSERVABILITY (MAKE IT DEBUGGABLE IN 5 MINUTES)
Minimum viable telemetry:
- Metrics:
  - intent created count
  - upload READY count
  - failure count by failure_code
  - time-to-ready histogram
- Logs:
  - include uploadId, entityId, storageKey
- Traces:
  - intent creation
  - status queries
  - worker processing spans
Senior rule:
If you can’t answer “why is this stuck?” quickly, you didn’t finish the feature.
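A minimal sketch of the log and histogram pieces, using only the standard library. Every log line carries the three identifiers listed above, and time-to-ready values are assigned to histogram buckets. The event names, bucket bounds, and function names are assumptions; a real service would emit these through its existing logging and metrics libraries.

```python
import bisect
import json
import time

# Assumption: bucket upper bounds in seconds; tune to the real latency budget.
BUCKET_BOUNDS = (1, 5, 15, 60, 300)


def log_event(event: str, upload_id: str, entity_id: str,
              storage_key: str, **fields) -> str:
    """Emit one JSON log line that always includes the correlation ids."""
    record = {"event": event, "uploadId": upload_id, "entityId": entity_id,
              "storageKey": storage_key, "ts": time.time(), **fields}
    line = json.dumps(record, sort_keys=True)
    print(line)
    return line


def bucket_for(seconds: float) -> str:
    """Return the label of the smallest bucket whose bound is >= seconds."""
    index = bisect.bisect_left(BUCKET_BOUNDS, seconds)
    if index == len(BUCKET_BOUNDS):
        return "le_inf"
    return f"le_{BUCKET_BOUNDS[index]}"
```

With consistent ids in every line, answering "why is this stuck?" becomes a single search for the uploadId across API and worker logs.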
SECTION 11 — ROLLOUT PLAN (SHIP SAFELY)
- Feature flag the new upload flow.
- Shadow-write upload records first (observe states).
- Gradually increase percentage.
- Keep rollback path (old flow) until error budgets are stable.
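Gradual percentage rollout works best when each user lands in a stable bucket, so raising the percentage only adds users and never flips someone back and forth between flows. A minimal sketch, assuming a hash-based bucket per user; the flag name and function names are hypothetical, and most teams would use their existing feature-flag service instead.

```python
import hashlib


def rollout_bucket(user_id: str, flag: str = "new-upload-flow") -> int:
    """Map a user to a stable bucket in [0, 100) for the given flag."""
    digest = hashlib.sha256(f"{flag}:{user_id}".encode()).digest()
    return int.from_bytes(digest[:4], "big") % 100


def use_new_upload_flow(user_id: str, percentage: int) -> bool:
    """True when the user's bucket falls under the current rollout percentage."""
    return rollout_bucket(user_id) < percentage
```

Setting the percentage to 0 is the rollback path: every user immediately returns to the old flow without a deploy.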
SECTION 12 — EXERCISES (PRACTICE LIKE A SENIOR)
- Draw the full flow and label all boundaries (UI/API/storage/worker/CDN).
- Define a complete error taxonomy for uploads.
- Write a one-page “failure mode map” and list mitigations.
- Propose a latency budget: time-to-first-progress, time-to-bytes-complete, time-to-ready.
- Decide: do you need multipart/resumable uploads? What size threshold triggers it?