
PART V (i) — Data Correctness Across UI + DB (Idempotency, Concurrency, Reconciliation)

CORRECTNESS IS USER TRUST

Many "bugs" are correctness failures:

  • double submit

  • stale reads

  • phantom success

  • lost updates

Senior fullstack correctness means you design invariants and enforce them at boundaries.

Why correctness matters at the boundary: The UI and the DB are separated by a network. The network drops packets, times out, and duplicates requests. Users double-click, refresh mid-request, and switch tabs. Your job is to ensure that regardless of these failures, the system converges to a correct state and the user sees the truth.


EXACTLY-ONCE IS MOSTLY A LIE

Distributed systems theory: In a system with networks, retries, and multiple nodes, true exactly-once delivery is impossible — a receiver that crashes between processing a message and acknowledging it is indistinguishable from a lost message. What coordination protocols (two-phase commit, consensus such as Paxos/Raft) can buy is exactly-once *processing*, at significant cost. Message queues, webhooks, and HTTP retries all default to at-least-once semantics, and under network partitions most systems choose availability and eventual consistency over strong consistency (the CAP trade-off).

Why at-least-once is the default: When a client sends a request and gets no response (timeout, network drop), it cannot know if the server processed it. Retrying is safe for reads (idempotent) but dangerous for writes. Hence: design writes to be idempotent.

Semantics          Behavior                                                        Use case
At-most-once       Fire and forget; no retries. May lose messages.                 Metrics, non-critical logs
At-least-once      Retry until ack; duplicates possible.                           Most APIs, webhooks, queues
Effectively-once   At-least-once + idempotent processing. User sees one outcome.   Payments, orders, mutations

Senior rule:

You cannot eliminate duplicates at the transport layer. You must make your handlers idempotent so duplicates produce the same result.

So you need:

  • idempotency keys

  • dedupe tables

  • idempotent consumers

Dedupe table design: Store (user_id, idempotency_key) → (request_hash, response, created_at). Add a TTL or periodic cleanup for keys older than 24–72 hours, and put a unique constraint on (user_id, idempotency_key) so two concurrent first inserts cannot race.
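
A minimal sketch of that dedupe table using SQLite (table and column names are illustrative): the unique constraint makes the first insert win, so two concurrent retries cannot both claim the same key.

```python
# Sketch of a dedupe table; the unique constraint is what prevents races.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE idempotency_records (
        user_id         TEXT NOT NULL,
        idempotency_key TEXT NOT NULL,
        request_hash    TEXT NOT NULL,
        response        TEXT,
        created_at      TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
        UNIQUE (user_id, idempotency_key)  -- first insert wins; races lose cleanly
    )
""")

def claim_key(user_id, key, request_hash):
    """Return True if this request claimed the key, False if it already exists."""
    try:
        conn.execute(
            "INSERT INTO idempotency_records (user_id, idempotency_key, request_hash) "
            "VALUES (?, ?, ?)",
            (user_id, key, request_hash),
        )
        return True
    except sqlite3.IntegrityError:  # unique constraint hit: duplicate request
        return False

assert claim_key("u1", "k1", "h1") is True   # first request claims the key
assert claim_key("u1", "k1", "h1") is False  # retry is detected as a duplicate
```

A real handler would, on `False`, load the stored response (or the stored request_hash, to detect key reuse with a different body) instead of re-executing the mutation.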


IDEMPOTENCY KEYS (WHERE USERS RETRY)

Use idempotency for:

  • checkout

  • uploads

  • "create" operations from flaky networks

Rules:

  • scope keys by user/tenant

  • store request hash

  • return the original result for repeats

Flow diagram:

┌─────────┐  POST /orders                   ┌────────┐
│ Client  │ ──────────────────────────────► │ Server │
└─────────┘  Idempotency-Key: uuid-123      └────────┘

Server logic:

  Key exists in DB?
  ├── NO  → execute mutation; store (key, hash, result); return 201
  └── YES → stored request hash matches this body's hash?
            ├── YES → return cached 201 + original body
            └── NO  → return 409 Conflict (key reused with a different body)

Server-side idempotency check (pseudocode):

def handle_create_order(request):
    key = request.headers["Idempotency-Key"]
    user_id = request.user.id
    body_hash = hash(request.body)

    existing = db.get_idempotent_record(user_id, key)
    if existing:
        if existing.body_hash != body_hash:
            return 409, "Idempotency key reused with different body"
        return existing.status_code, existing.response_body  # Return original result

    with db.transaction():
        result = execute_order_creation(request)
        db.store_idempotent_record(user_id, key, body_hash, 201, result)
    return 201, result

Key generation strategies:

Strategy                                   Pros                                      Cons
Client-generated UUID                      Simple, no collision risk                 Client must persist it across retries
Hash-based (method + path + body)          Deterministic, no client state            Same body retry = same key (good)
Composite (user + action + time bucket)    No client state, works at high volume     Time-window collisions possible

Senior rule:

Never let the client reuse an idempotency key with a different request body. Return 409 and force them to fix it.

Key retention: Keep idempotency records for at least 24 hours. Stripe prunes keys after 24h; financial systems may need 72h or more. After a key is purged, a retry with that key is treated as a brand-new request, so make retention longer than your longest plausible retry window.

Idempotency for GET: GET is naturally idempotent. No key needed. For POST/PUT/PATCH/DELETE that create or mutate state, always consider idempotency.

Partial failure handling: If your mutation spans multiple steps (e.g., create order + charge card + send email), make each step idempotent. If step 2 fails after step 1 succeeds, retrying the whole flow should not double-charge or double-email.

Saga pattern for multi-step: For distributed transactions, use sagas: each step has a compensating action. Order created → charge fails → compensate by canceling order. Design compensations before you need them.
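
A minimal saga sketch in Python, assuming an in-memory state dict and illustrative step names: each step pairs an action with a compensation, and a failure unwinds the completed steps in reverse.

```python
# Saga sketch: each step is (action, compensation). On failure, compensations
# for the steps that already succeeded run in reverse order.

def run_saga(steps, state):
    done = []
    try:
        for action, compensate in steps:
            action(state)
            done.append(compensate)
    except Exception:
        for compensate in reversed(done):
            compensate(state)  # undo in reverse order
        raise

state = {"orders": [], "charges": []}

def create_order(s): s["orders"].append("o1")
def cancel_order(s): s["orders"].remove("o1")
def charge_card(s):  raise RuntimeError("card declined")
def refund_card(s):  s["charges"].clear()

try:
    run_saga([(create_order, cancel_order), (charge_card, refund_card)], state)
except RuntimeError:
    pass

assert state["orders"] == []  # compensation canceled the order
```

Note that the failing step's own compensation never runs — only steps that completed are compensated, which is why each compensation must be safe to run exactly once per completed action.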


OPTIMISTIC CONCURRENCY (STOP LOST UPDATES)

Pattern:

  • rows have version or updated_at

  • client sends If-Match / version

  • server rejects stale writes (409 Conflict, or 412 Precondition Failed when using If-Match)

Conflict flow:

  1. User A loads /doc/1 at version 2 and edits the title.
  2. Client A sends PUT /doc/1 with If-Match: v2.
  3. Server runs UPDATE ... WHERE version = 2 → 1 row updated.
  4. Server returns 200 with the new version, v3.
  5. User B, still holding v2, edits the body.
  6. Client B sends PUT /doc/1 with If-Match: v2 (stale!).
  7. Server runs UPDATE ... WHERE version = 2 → 0 rows updated.
  8. Server returns 409 Conflict plus the current doc (v3).
  9. Client B merges or refreshes the UI before retrying.

ETag example:

GET /doc/123
ETag: "abc123"

PUT /doc/123
If-Match: "abc123"
{"title": "Updated"}

→ 200 OK, ETag: "def456"

PUT /doc/123
If-Match: "abc123" ← Stale
{"title": "Another update"}

→ 412 Precondition Failed
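
A sketch of how a server might derive and check a strong ETag; the hash choice, helper names, and in-place document update are assumptions, not a prescribed scheme.

```python
# ETag sketch: derive the tag from a canonical serialization of the document,
# and reject a PUT whose If-Match no longer matches the stored state.
import hashlib
import json

def etag_for(doc: dict) -> str:
    canonical = json.dumps(doc, sort_keys=True).encode()
    return '"' + hashlib.sha256(canonical).hexdigest()[:12] + '"'

def handle_put(stored_doc, if_match, new_fields):
    if if_match != etag_for(stored_doc):
        return 412, stored_doc            # Precondition Failed: client is stale
    stored_doc.update(new_fields)
    return 200, stored_doc

doc = {"title": "Draft"}
tag = etag_for(doc)

status, _ = handle_put(doc, tag, {"title": "Updated"})
assert status == 200                       # first write with a fresh tag succeeds

status, _ = handle_put(doc, tag, {"title": "Another"})
assert status == 412                       # same tag is now stale
```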

Pessimistic vs optimistic tradeoffs:

Approach      Best for                               Pros                      Cons
Optimistic    Low conflict rate, read-heavy          No locks, scales well     409 handling in UI
Pessimistic   High conflict rate, critical sections  No lost updates, simple   Locks, deadlocks, lower throughput

UI handles:

  • refresh and retry

  • conflict resolution where needed

Implementation detail: Use SELECT ... FOR UPDATE only when you truly need pessimistic locking (e.g., inventory decrement). For most CRUD, optimistic locking with version/ETag is sufficient and avoids deadlocks.
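
The version-guarded UPDATE can be sketched with SQLite (table and names are illustrative): the write succeeds only when the version the client read is still current, and zero affected rows is the signal to surface a 409.

```python
# Optimistic locking sketch: the UPDATE's WHERE clause doubles as the version check.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, title TEXT, version INTEGER)")
conn.execute("INSERT INTO docs VALUES (1, 'Draft', 2)")

def update_doc(doc_id, expected_version, title):
    cur = conn.execute(
        "UPDATE docs SET title = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (title, doc_id, expected_version),
    )
    return cur.rowcount == 1  # 0 rows means a stale write -> return 409 to the client

assert update_doc(1, 2, "A's edit") is True   # first writer wins: v2 -> v3
assert update_doc(1, 2, "B's edit") is False  # stale version; caller returns 409
```

The check and the write are one atomic statement, so no separate SELECT or lock is needed.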


AUDIT TRAILS (WHEN IT MUST BE PROVABLE)

If money, permissions, or critical data is involved:

  • append-only audit events

  • who/what/when

  • correlation IDs

Schema example for audit events:

CREATE TABLE audit_events (
    id             BIGSERIAL PRIMARY KEY,
    event_id       UUID UNIQUE NOT NULL,
    timestamp      TIMESTAMPTZ NOT NULL DEFAULT now(),
    actor_id       VARCHAR(255),
    actor_type     VARCHAR(50),   -- user, service, system
    action         VARCHAR(100),  -- order.created, permission.revoked
    resource_type  VARCHAR(100),
    resource_id    VARCHAR(255),
    old_value      JSONB,
    new_value      JSONB,
    correlation_id VARCHAR(255),
    ip_address     INET,
    user_agent     TEXT
);

CREATE INDEX idx_audit_resource    ON audit_events (resource_type, resource_id);
CREATE INDEX idx_audit_timestamp   ON audit_events (timestamp);
CREATE INDEX idx_audit_correlation ON audit_events (correlation_id);

Event sourcing connection: Audit trails are a form of append-only event log. Full event sourcing uses the log as the source of truth; audit trails are typically additive to a primary DB.

Compliance requirements:

Regulation   Audit requirement
GDPR         Log access to personal data; support right to erasure (anonymize in audit)
SOC2         Access controls, change management, evidence of monitoring
PCI-DSS      Access to cardholder data, all admin actions
HIPAA        Access to PHI, modifications, disclosures

Senior rule:

Never update or delete audit records. Append only. Tampering destroys legal value.

Retention and archival: Define retention (e.g., 7 years for financial). Archive old events to cold storage; keep indexes for search. Anonymize PII in audit logs when required by GDPR (replace with hash or "REDACTED").
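
Anonymization can be sketched with a keyed hash (HMAC) so redacted identifiers remain correlatable across events without exposing the raw value; the key handling and field names here are illustrative assumptions.

```python
# PII anonymization sketch: replace an identifier with a keyed hash. The same
# input always maps to the same token, so audit events stay joinable.
import hashlib
import hmac

ANONYMIZATION_KEY = b"rotate-me-and-store-in-a-secret-manager"  # illustrative

def anonymize(value: str) -> str:
    digest = hmac.new(ANONYMIZATION_KEY, value.encode(), hashlib.sha256)
    return "anon:" + digest.hexdigest()[:16]

event = {"actor_id": "user@example.com", "action": "order.created"}
event["actor_id"] = anonymize(event["actor_id"])

assert event["actor_id"].startswith("anon:")
assert anonymize("user@example.com") == event["actor_id"]  # deterministic
```

Destroying the key later is effectively erasure: the tokens can no longer be linked back to any person, while the audit trail itself stays append-only.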


USER-VISIBLE CORRECTNESS PATTERNS

  • disable submit buttons on action

  • show processing vs success explicitly

  • rehydrate state from server after reload

  • avoid optimistic UI for irreversible actions unless you have compensations

Additional patterns:

  • Skeleton loaders: Show layout placeholders while fetching. Avoids layout shift and "flash of wrong content" when data arrives.

  • Confirm dialogs for destructive actions: "Delete account?", "Cancel subscription?" — force explicit confirmation. Reduces accidental double-clicks and fat-finger errors.

  • Realtime sync indicators: "Saved" / "Syncing..." / "Offline" — users trust the UI when they see state. Use WebSocket or polling to show server truth.

  • Optimistic updates with rollback: For low-risk edits (e.g., todo toggle), update UI immediately, then sync. On failure, revert and show toast: "Couldn't save. Retry?"

Debouncing rapid mutations: A user types in a search box and each keystroke triggers an API call. Debounce by ~300ms so you send one request per "pause." Same for auto-save: debounce so rapid edits don't flood the server.
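
A trailing-edge debounce can be sketched in Python with threading.Timer (the interval and names are illustrative; a browser UI would use a client-side equivalent such as lodash.debounce):

```python
# Debounce sketch: each new call cancels the pending one, so only the last
# call in a burst actually fires, after `wait_seconds` of quiet.
import threading
import time

def debounce(wait_seconds):
    def decorator(fn):
        timer = None
        def wrapper(*args, **kwargs):
            nonlocal timer
            if timer is not None:
                timer.cancel()  # drop the earlier pending call
            timer = threading.Timer(wait_seconds, fn, args, kwargs)
            timer.start()
        return wrapper
    return decorator

calls = []

@debounce(0.05)
def autosave(text):
    calls.append(text)

for draft in ["h", "he", "hel", "hell", "hello"]:  # rapid keystrokes
    autosave(draft)

time.sleep(0.5)              # wait past the debounce window
assert calls == ["hello"]    # only the final keystroke triggered a save
```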

Loading states: Distinguish "loading" (spinner) from "empty" (no data yet) from "error" (failed to load). Show retry button on error. Never show "empty" when you meant "loading."

Senior rule:

Phantom success (UI says "done" but server failed) destroys trust. Prefer explicit failure over silent lies.


RECONCILIATION JOBS (THE SAFETY NET)

If external systems exist (payments, email providers, storage):

  • periodically reconcile internal vs external truth

  • auto-heal safe mismatches

  • alert on unsafe mismatches

Concrete reconciliation job example: Stripe payments vs internal orders

def reconcile_stripe_orders():
    """Run hourly. Compare Stripe charges against our orders table."""
    stripe_charges = stripe.Charge.list(created={"gte": last_run})
    our_orders = Order.where(created_at__gte=last_run)  # all recent orders, any status

    for charge in stripe_charges:
        order = our_orders.get(payment_id=charge.id)
        if not order:
            alert("ORPHAN_STRIPE_CHARGE", charge.id)    # We charged but have no order
        elif order.amount_cents != charge.amount:
            alert("AMOUNT_MISMATCH", order.id, charge.id)
        elif order.status != "paid":
            auto_heal(order, status="paid")             # Stripe says paid; we missed the webhook

    for order in our_orders:
        if order.status == "paid" and not stripe_charge_exists(order.payment_id):
            alert("ORPHAN_ORDER_PAID", order.id)        # We think paid; Stripe disagrees

Scheduling and alerting patterns:

  • Cron: Run daily or hourly. Use idempotent jobs; overlap is OK.
  • Dead letter queue: Failed reconciliation rows → DLQ → alert + manual review.
  • Slack/PagerDuty: Alert on mismatch count > threshold. Include sample IDs for debugging.
  • Dashboard: Reconciliation status, last run, mismatch count. Makes drift visible.

Reconciliation job invariants: Define what "correct" means. Example: "For every Stripe charge with status=paid, we have exactly one order with status=paid and matching amount." Document invariants in the job; fail loudly when violated.

Senior rule:

Reconciliation is your safety net when webhooks fail, queues drop messages, or external APIs lie. Don't ship payments without it.


EVENTUAL CONSISTENCY UI PATTERNS

When data propagates asynchronously (replicas, caches, multi-region), the UI must handle stale data gracefully.

When eventual consistency matters: Read replicas lag behind primary. CDN edge caches serve stale content. Multi-region writes propagate asynchronously. The UI cannot assume immediate consistency; design for it.

Stale-while-revalidate (SWR):

  • Show cached/stale data immediately.
  • Fetch fresh data in background.
  • Update UI when fresh data arrives.
  • User sees something fast; correctness arrives shortly after.

// Conceptual SWR pattern (React SWR-style hook)
const { data, isValidating } = useSWR("/api/balance", fetcher, {
  revalidateOnFocus: true,
  dedupingInterval: 5000,
});
// data may be stale; isValidating indicates a background refresh

Optimistic updates with rollback:

  • Update UI immediately.
  • Send mutation to server.
  • On success: keep UI state.
  • On failure: revert UI, show error, optionally retry.

Conflict resolution UX:

  • When 409 or merge conflict: show both versions.
  • "Your version" vs "Current version" with merge/discard/overwrite options.
  • For collaborative docs: operational transform or CRDTs. For simple forms: last-write-wins with warning.

Version vectors in UI: When showing "current" vs "yours," include timestamps or version numbers. "Someone else updated this 2 minutes ago. Overwrite?" helps users make informed choices.

Senior rule:

Never show "wrong" data without a path to "right." Stale is OK if you're fetching. Phantom success is not.


TESTING CORRECTNESS

Property-based testing for idempotency:

  • Generate random request bodies and idempotency keys.
  • Send the same (key, body) N times.
  • Assert: all responses identical; DB state identical after 1 or N calls.

# Hypothesis-style property test
@given(st.builds(CreateOrderRequest))
def test_create_order_idempotent(req):
    key = str(uuid4())
    results = [api.create_order(req, key) for _ in range(5)]
    assert all(r.status == 201 for r in results)
    assert len({r.body["order_id"] for r in results}) == 1

Concurrency test harnesses:

  • Run N clients updating the same resource concurrently.
  • Assert: final state is valid; no lost updates (version/ETag checks).
  • Use threads or async; simulate network delay.
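
The harness above can be sketched in-process with threads; here a lock stands in for the database's atomic compare-and-set, and every client retries until its write lands (names and the counter-style resource are illustrative).

```python
# Concurrency harness sketch: 8 clients race to update one versioned resource.
# No lost updates: each write only applies if the version is unchanged.
import threading

doc = {"value": 0, "version": 0}
lock = threading.Lock()  # stands in for the DB's atomic compare-and-set

def cas_update(expected_version):
    """Apply the write only if the version is still what the client read."""
    with lock:
        if doc["version"] != expected_version:
            return False              # lost the race -> caller must re-read
        doc["value"] += 1
        doc["version"] += 1
        return True

def client():
    while True:
        v = doc["version"]            # read current version
        if cas_update(v):             # retry on conflict until the write lands
            return

threads = [threading.Thread(target=client) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

assert doc["value"] == 8    # every client's write landed exactly once
assert doc["version"] == 8  # version advanced once per successful write
```

Swapping `cas_update` for a real `UPDATE ... WHERE version = ?` against a test database turns this into the integration-level harness the bullets describe.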

Chaos testing for data integrity:

  • Kill DB mid-transaction; restart; assert no partial writes.
  • Drop webhook delivery; run reconciliation; assert drift detected.
  • Simulate clock skew; assert audit timestamps still ordered.

Integration test for full flow: End-to-end test: create order with idempotency key, retry 3x, assert single order. Add concurrent updates from 2 clients, assert one gets 409 and final state is valid.

Test data isolation: Use unique IDs per test (UUID, timestamp). Avoid shared state. Clean up after test or use transactional rollback. Flaky tests from shared DB state waste time.

Senior rule:

Correctness tests must simulate failure. Happy path tests miss the bugs that matter.


EXERCISES

  1. Pick one create flow and define idempotency behavior.

  2. Add an optimistic concurrency contract to one update endpoint.

  3. Propose a reconciliation job for an external dependency.

  4. Implement a property-based test for your idempotency key handling.

  5. Design conflict resolution UX for a shared document edit flow.


APPENDIX — QUICK REFERENCE

Problem                 Solution
Double submit           Idempotency keys
Stale read              Revalidate; SWR
Lost update             Optimistic concurrency (version/ETag)
Phantom success         Disable button; show explicit success/failure
Webhook duplicate       Idempotent handler; dedupe by delivery ID
External system drift   Reconciliation job
Audit requirement       Append-only events; who/what/when

🏁 END — DATA CORRECTNESS ACROSS UI + DB