PART V (i) — Data Correctness Across UI + DB (Idempotency, Concurrency, Reconciliation)
CORRECTNESS IS USER TRUST
Many "bugs" are correctness failures:
-
double submit
-
stale reads
-
phantom success
-
lost updates
Senior fullstack correctness means you design invariants and enforce them at boundaries.
Why correctness matters at the boundary: The UI and the DB are separated by a network. The network drops packets, times out, and duplicates requests. Users double-click, refresh mid-request, and switch tabs. Your job is to ensure that regardless of these failures, the system converges to a correct state and the user sees the truth.
EXACTLY-ONCE IS MOSTLY A LIE
Distributed systems theory: In a system with networks, retries, and multiple nodes, true exactly-once delivery is impossible without consensus protocols (e.g., two-phase commit, Paxos). Message queues, webhooks, and HTTP retries all introduce at-least-once semantics. The CAP theorem and network partitions make strong consistency expensive; most systems choose availability and eventual consistency.
Why at-least-once is the default: When a client sends a request and gets no response (timeout, network drop), it cannot know if the server processed it. Retrying is safe for reads (idempotent) but dangerous for writes. Hence: design writes to be idempotent.
| Semantics | Behavior | Use Case |
|---|---|---|
| At-most-once | Fire and forget; no retries. May lose messages. | Metrics, non-critical logs |
| At-least-once | Retry until ack; duplicates possible. | Most APIs, webhooks, queues |
| Effectively-once | At-least-once + idempotent processing. User sees one outcome. | Payments, orders, mutations |
Senior rule:
You cannot eliminate duplicates at the transport layer. You must make your handlers idempotent so duplicates produce the same result.
So you need:
-
idempotency keys
-
dedupe tables
-
idempotent consumers
Dedupe table design: Store (user_id, idempotency_key) → (request_hash, response, created_at). Add TTL or periodic cleanup for keys older than 24–72 hours. Use unique constraint to prevent race conditions on first insert.
IDEMPOTENCY KEYS (WHERE USERS RETRY)
Use idempotency for:
-
checkout
-
uploads
-
"create" operations from flaky networks
Rules:
-
scope keys by user/tenant
-
store request hash
-
return the original result for repeats
Flow diagram:
┌─────────┐ POST /orders ┌─────────────┐ Idempotency-Key: uuid-123
│ Client │ ──────────────────► │ Server │
└─────────┘ └──────┬──────┘
│ │
│ ▼
│ ┌───────────────────┐
│ │ Key exists in DB? │
│ └─────────┬─────────┘
│ │
│ ┌────────────────┼────────────────┐
│ │ YES │ NO │
│ ▼ ▼ │
│ ┌──────────────┐ ┌──────────────┐ │
│ │ Return cached │ │ Hash match? │ │
│ │ 201 + body │ └──────┬───────┘ │
│ └──────────────┘ │ │
│ ┌─────────┼─────────┐ │
│ │ YES │ NO │ │
│ ▼ ▼ │ │
│ ┌──────────┐ ┌──────────┐ │ │
│ │ Return │ │ 409 │ │ │
│ │ cached │ │ Conflict │ │ │
│ └──────────┘ └──────────┘ │ │
│ │ │ │
│ ▼ │ │
│ ┌──────────────────┐ │
│ │ Execute mutation │ │
│ │ Store key+result │ │
│ │ Return 201 │ │
│ └──────────────────┘ │
◄──────────────────────────────────────────────┘
Server-side idempotency check (pseudocode):
def handle_create_order(request):
key = request.headers["Idempotency-Key"]
user_id = request.user.id
body_hash = hash(request.body)
existing = db.get_idempotent_record(user_id, key)
if existing:
if existing.body_hash != body_hash:
return 409, "Idempotency key reused with different body"
return existing.status_code, existing.response_body # Return original result
with db.transaction():
result = execute_order_creation(request)
db.store_idempotent_record(user_id, key, body_hash, 201, result)
return 201, result
Key generation strategies:
| Strategy | Pros | Cons |
|---|---|---|
| Client-generated UUID | Simple, no collision risk | Client must persist across retries |
| Hash-based (method + path + body) | Deterministic, no client state | Same body retry = same key (good) |
| Composite (user + action + timestamp bucket) | Good for high-volume idempotent reads | Time-window collisions possible |
Senior rule:
Never let the client reuse an idempotency key with a different request body. Return 409 and force them to fix it.
Key retention: Keep idempotency records for at least 24 hours. Stripe recommends 24h; financial systems may need 72h+. After that, safe to purge; retries with old keys will get 404 and can retry with a new key.
Idempotency for GET: GET is naturally idempotent. No key needed. For POST/PUT/PATCH/DELETE that create or mutate state, always consider idempotency.
Partial failure handling: If your mutation spans multiple steps (e.g., create order + charge card + send email), make each step idempotent. If step 2 fails after step 1 succeeds, retrying the whole flow should not double-charge or double-email.
Saga pattern for multi-step: For distributed transactions, use sagas: each step has a compensating action. Order created → charge fails → compensate by canceling order. Design compensations before you need them.
OPTIMISTIC CONCURRENCY (STOP LOST UPDATES)
Pattern:
-
rows have
versionorupdated_at -
client sends
If-Match/ version -
server rejects stale writes (409)
Conflict flow (sequence diagram):
User A Client A Server DB Client B User B
│ │ │ │ │ │
│ Edit title │ │ │ │ │
│──────────────► │ PUT /doc/1 │ │ │ │
│ │ If-Match: v2 │ │ │ │
│ │───────────────►│ UPDATE │ │ │
│ │ │ WHERE v=2 │ │ │
│ │ │─────────────►│ │ │
│ │ │ OK (1 row) │ │ │
│ │ 200 + v3 │◄─────────────│ │ │
│ │◄───────────────│ │ │ │
│ │ │ │ Edit body │ │
│ │ │ │◄───────────────│◄─────────────│
│ │ │ PUT /doc/1 │ │ │
│ │ │ If-Match: v2 │ (stale!) │ │
│ │ │◄─────────────│ │ │
│ │ │ UPDATE │ │ │
│ │ │ WHERE v=2 │ │ │
│ │ │─────────────►│ 0 rows │ │
│ │ │ 409 Conflict│ │ │
│ │ │─────────────►│ │ │
│ │ │ │ 409 + current │ │
│ │ │ │ doc (v3) │ │
│ │ │ │───────────────►│─────────────►│
│ │ │ │ [Merge/refresh UI] │
ETag example:
GET /doc/123
ETag: "abc123"
PUT /doc/123
If-Match: "abc123"
{"title": "Updated"}
→ 200 OK, ETag: "def456"
PUT /doc/123
If-Match: "abc123" ← Stale
{"title": "Another update"}
→ 412 Precondition Failed
Pessimistic vs optimistic tradeoffs:
| Approach | Best for | Pros | Cons |
|---|---|---|---|
| Optimistic | Low conflict rate, read-heavy | No locks, scales well | 409 handling in UI |
| Pessimistic | High conflict rate, critical sections | No lost updates, simple | Locks, deadlocks, lower throughput |
UI handles:
-
refresh and retry
-
conflict resolution where needed
Implementation detail: Use SELECT ... FOR UPDATE only when you truly need pessimistic locking (e.g., inventory decrement). For most CRUD, optimistic locking with version/ETag is sufficient and avoids deadlocks.
AUDIT TRAILS (WHEN IT MUST BE PROVABLE)
If money, permissions, or critical data is involved:
-
append-only audit events
-
who/what/when
-
correlation IDs
Schema example for audit events:
CREATE TABLE audit_events (
id BIGSERIAL PRIMARY KEY,
event_id UUID UNIQUE NOT NULL,
timestamp TIMESTAMPTZ NOT NULL DEFAULT now(),
actor_id VARCHAR(255),
actor_type VARCHAR(50), -- user, service, system
action VARCHAR(100), -- order.created, permission.revoked
resource_type VARCHAR(100),
resource_id VARCHAR(255),
old_value JSONB,
new_value JSONB,
correlation_id VARCHAR(255),
ip_address INET,
user_agent TEXT
);
CREATE INDEX idx_audit_resource ON audit_events(resource_type, resource_id);
CREATE INDEX idx_audit_timestamp ON audit_events(timestamp);
CREATE INDEX idx_audit_correlation ON audit_events(correlation_id);
Event sourcing connection: Audit trails are a form of append-only event log. Full event sourcing uses the log as the source of truth; audit trails are typically additive to a primary DB.
Compliance requirements:
| Regulation | Audit requirement |
|---|---|
| GDPR | Log access to personal data; support right to erasure (anonymize in audit) |
| SOC2 | Access controls, change management, evidence of monitoring |
| PCI-DSS | Access to cardholder data, all admin actions |
| HIPAA | Access to PHI, modifications, disclosures |
Senior rule:
Never update or delete audit records. Append only. Tampering destroys legal value.
Retention and archival: Define retention (e.g., 7 years for financial). Archive old events to cold storage; keep indexes for search. Anonymize PII in audit logs when required by GDPR (replace with hash or "REDACTED").
USER-VISIBLE CORRECTNESS PATTERNS
-
disable submit buttons on action
-
show processing vs success explicitly
-
rehydrate state from server after reload
-
avoid optimistic UI for irreversible actions unless you have compensations
Additional patterns:
-
Skeleton loaders: Show layout placeholders while fetching. Avoids layout shift and "flash of wrong content" when data arrives.
-
Confirm dialogs for destructive actions: "Delete account?", "Cancel subscription?" — force explicit confirmation. Reduces accidental double-clicks and fat-finger errors.
-
Realtime sync indicators: "Saved" / "Syncing..." / "Offline" — users trust the UI when they see state. Use WebSocket or polling to show server truth.
-
Optimistic updates with rollback: For low-risk edits (e.g., todo toggle), update UI immediately, then sync. On failure, revert and show toast: "Couldn't save. Retry?"
Debouncing rapid mutations: User types in a search box; each keystroke triggers API. Debounce 300ms so you send one request per "pause." Same for auto-save: debounce so you don't flood the server on rapid edits.
Loading states: Distinguish "loading" (spinner) from "empty" (no data yet) from "error" (failed to load). Show retry button on error. Never show "empty" when you meant "loading."
Senior rule:
Phantom success (UI says "done" but server failed) destroys trust. Prefer explicit failure over silent lies.
RECONCILIATION JOBS (THE SAFETY NET)
If external systems exist (payments, email providers, storage):
-
periodically reconcile internal vs external truth
-
auto-heal safe mismatches
-
alert on unsafe mismatches
Concrete reconciliation job example: Stripe payments vs internal orders
def reconcile_stripe_orders():
"""Run hourly. Compare Stripe charges vs our orders table."""
stripe_charges = stripe.Charge.list(created={"gte": last_run})
our_orders = Order.where(created_at__gte=last_run, status="paid")
for charge in stripe_charges:
order = our_orders.get(payment_id=charge.id)
if not order:
alert("ORPHAN_STRIPE_CHARGE", charge.id) # We charged but have no order
elif order.amount_cents != charge.amount:
alert("AMOUNT_MISMATCH", order.id, charge.id)
elif order.status != "paid":
auto_heal(order, status="paid") # Stripe says paid; we missed webhook
for order in our_orders:
if order.status == "paid" and not stripe_charge_exists(order.payment_id):
alert("ORPHAN_ORDER_PAID", order.id) # We think paid, Stripe disagrees
Scheduling and alerting patterns:
- Cron: Run daily or hourly. Use idempotent jobs; overlap is OK.
- Dead letter queue: Failed reconciliation rows → DLQ → alert + manual review.
- Slack/PagerDuty: Alert on mismatch count > threshold. Include sample IDs for debugging.
- Dashboard: Reconciliation status, last run, mismatch count. Makes drift visible.
Reconciliation job invariants: Define what "correct" means. Example: "For every Stripe charge with status=paid, we have exactly one order with status=paid and matching amount." Document invariants in the job; fail loudly when violated.
Senior rule:
Reconciliation is your safety net when webhooks fail, queues drop messages, or external APIs lie. Don't ship payments without it.
EVENTUAL CONSISTENCY UI PATTERNS
When data propagates asynchronously (replicas, caches, multi-region), the UI must handle stale data gracefully.
When eventual consistency matters: Read replicas lag behind primary. CDN edge caches serve stale content. Multi-region writes propagate asynchronously. The UI cannot assume immediate consistency; design for it.
Stale-while-revalidate (SWR):
- Show cached/stale data immediately.
- Fetch fresh data in background.
- Update UI when fresh data arrives.
- User sees something fast; correctness arrives shortly after.
// Conceptual SWR pattern
const { data, isValidating } = useSWR("/api/balance", fetcher, {
revalidateOnFocus: true,
dedupingInterval: 5000,
});
// data may be stale; isValidating indicates background refresh
Optimistic updates with rollback:
- Update UI immediately.
- Send mutation to server.
- On success: keep UI state.
- On failure: revert UI, show error, optionally retry.
Conflict resolution UX:
- When 409 or merge conflict: show both versions.
- "Your version" vs "Current version" with merge/discard/overwrite options.
- For collaborative docs: operational transform or CRDTs. For simple forms: last-write-wins with warning.
Version vectors in UI: When showing "current" vs "yours," include timestamps or version numbers. "Someone else updated this 2 minutes ago. Overwrite?" helps users make informed choices.
Senior rule:
Never show "wrong" data without a path to "right." Stale is OK if you're fetching. Phantom success is not.
TESTING CORRECTNESS
Property-based testing for idempotency:
- Generate random request bodies and idempotency keys.
- Send the same (key, body) N times.
- Assert: all responses identical; DB state identical after 1 or N calls.
# Hypothesis-style property test
@given(st.builds(CreateOrderRequest))
def test_create_order_idempotent(req):
key = str(uuid4())
results = [api.create_order(req, key) for _ in range(5)]
assert all(r.status == 201 for r in results)
assert len(set(r.body["order_id"] for r in results)) == 1
Concurrency test harnesses:
- Run N clients updating the same resource concurrently.
- Assert: final state is valid; no lost updates (version/ETag checks).
- Use threads or async; simulate network delay.
Chaos testing for data integrity:
- Kill DB mid-transaction; restart; assert no partial writes.
- Drop webhook delivery; run reconciliation; assert drift detected.
- Simulate clock skew; assert audit timestamps still ordered.
Integration test for full flow: End-to-end test: create order with idempotency key, retry 3x, assert single order. Add concurrent updates from 2 clients, assert one gets 409 and final state is valid.
Test data isolation: Use unique IDs per test (UUID, timestamp). Avoid shared state. Clean up after test or use transactional rollback. Flaky tests from shared DB state waste time.
Senior rule:
Correctness tests must simulate failure. Happy path tests miss the bugs that matter.
EXERCISES
-
Pick one create flow and define idempotency behavior.
-
Add an optimistic concurrency contract to one update endpoint.
-
Propose a reconciliation job for an external dependency.
-
Implement a property-based test for your idempotency key handling.
-
Design conflict resolution UX for a shared document edit flow.
APPENDIX — QUICK REFERENCE
| Problem | Solution |
|---|---|
| Double submit | Idempotency keys |
| Stale read | Revalidate; SWR |
| Lost update | Optimistic concurrency (version/ETag) |
| Phantom success | Disable button; show explicit success/failure |
| Webhook duplicate | Idempotent handler; dedupe by delivery ID |
| External system drift | Reconciliation job |
| Audit requirement | Append-only events; who/what/when |