
Chapter 28: Property-Based Testing


Learning Objectives

By the end of this chapter, you will be able to:

  • Define property-based testing and contrast it with example-based testing
  • Identify when to use property-based testing in SDD: data model invariants, API idempotency, state machine transitions, mathematical properties
  • Derive properties from specifications: constraints, NFRs, data models
  • Use tools such as fast-check (JS/TS), Hypothesis (Python), and QuickCheck (Haskell)
  • Write property-based tests for a running project through a hands-on tutorial
  • Combine property-based tests with spec-driven tests
  • Apply common properties: roundtrip, idempotency, commutativity, invariant preservation
  • Recognize when property-based testing catches bugs that example tests miss

What Is Property-Based Testing?

Property-based testing verifies invariants—properties that must hold for any valid input—instead of verifying specific examples. Instead of "when input is 5, output is 25," you assert "for any non-negative integer n, output is n²." The testing framework generates many random inputs and checks that the property holds for all of them.

Properties vs. Examples

Example-Based | Property-Based
"user 1 has ID abc" | "user IDs are always unique"
"sort([3,1,2]) = [1,2,3]" | "for any list L, sort(L) is sorted and has same elements as L"
"encode then decode 'hello' = 'hello'" | "for any string s, decode(encode(s)) = s"
"POST then GET returns same data" | "for any valid bookmark B, create(B) then get(id) returns equivalent B"

Example-based tests verify a few hand-picked cases. Property-based tests verify that a property holds for a large number (often hundreds or thousands) of generated inputs. If the property fails, the framework typically shrinks the failing input to a minimal example: instead of "fails for a list of 1000 elements," it reports "fails for list [0, -1]."
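To make the generate-then-shrink loop concrete, here is a minimal dependency-free sketch in Python. It is not how fast-check or Hypothesis are implemented; the buggy `abs_sort` function, the greedy `shrink` helper, and `find_counterexample` are all hypothetical names invented for this illustration.

```python
import random
from collections import Counter

def abs_sort(xs):
    """Hypothetical buggy sort: orders by absolute value instead of by value."""
    return sorted(xs, key=abs)

def sort_property(xs):
    """True when the output is ordered and has the same elements as the input."""
    out = abs_sort(xs)
    ordered = all(a <= b for a, b in zip(out, out[1:]))
    return ordered and Counter(out) == Counter(xs)

def shrink(xs, prop):
    """Greedily simplify a failing list while the property keeps failing."""
    while True:
        # Candidates: drop one element, or halve one element toward zero.
        candidates = [xs[:i] + xs[i + 1:] for i in range(len(xs))]
        candidates += [xs[:i] + [int(x / 2)] + xs[i + 1:]
                       for i, x in enumerate(xs) if x != 0]
        smaller = next((c for c in candidates if not prop(c)), None)
        if smaller is None:
            return xs  # no simpler failing input exists: local minimum
        xs = smaller

def find_counterexample(prop, runs=200, seed=0):
    """Try random integer lists; return a shrunk failing input, or None."""
    rng = random.Random(seed)
    for _ in range(runs):
        xs = [rng.randint(-100, 100) for _ in range(rng.randint(0, 10))]
        if not prop(xs):
            return shrink(xs, prop)
    return None

print(find_counterexample(sort_property))
```

For this particular bug, the greedy shrinker always ends on a two-element list containing 0 and -1, which is far easier to reason about than the random list that first triggered the failure. Real frameworks use more sophisticated shrinking strategies.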

Why Property-Based Testing Matters for SDD

Specifications often state invariants:

  • "Bookmarks are always user-scoped" → property: for any bookmark, bookmark.userId matches the authenticated user
  • "IDs are unique" → property: for any set of created resources, no duplicate IDs
  • "Response time < 200ms for any valid input" → property: for any valid request, latency is under threshold
  • "Email format is always valid" → property: for any stored user, email matches regex

These invariants are natural candidates for property-based tests. The spec says "always" or "for any"—property-based testing is the automated way to verify that.


When to Use Property-Based Testing in SDD

1. Data Model Invariants

Spec: "User IDs are UUID v4. Email is valid format. CreatedAt is never in the future."

Property tests: Generate random user-like data; validate that after "sanitization" or "creation," all invariants hold. Or generate invalid data; assert rejection.

2. API Idempotency

Spec: "DELETE /bookmarks/:id is idempotent. First delete returns 204; second delete returns 404."

Property test: For any bookmark ID, call DELETE twice. First call succeeds (204 or 404 if already gone); second call returns 404. No side effects after first delete.
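As a rough sketch of this property, the following Python uses an in-memory stand-in for the API. `BookmarkStore` and its HTTP-like return codes are assumptions made for illustration, not the project's real client.

```python
import random
import uuid

class BookmarkStore:
    """In-memory stand-in for the bookmarks API (hypothetical)."""

    def __init__(self):
        self._items = {}

    def create(self, url):
        bookmark_id = str(uuid.uuid4())
        self._items[bookmark_id] = url
        return bookmark_id

    def delete(self, bookmark_id):
        """Return an HTTP-like status: 204 if deleted, 404 if absent."""
        if bookmark_id in self._items:
            del self._items[bookmark_id]
            return 204
        return 404

    def snapshot(self):
        return dict(self._items)

def check_delete_idempotent(runs=200, seed=0):
    rng = random.Random(seed)
    for _ in range(runs):
        store = BookmarkStore()
        ids = [store.create(f"https://example.com/{rng.randrange(10**6)}")
               for _ in range(rng.randint(1, 5))]
        target = rng.choice(ids)
        assert store.delete(target) == 204
        after_first = store.snapshot()
        assert store.delete(target) == 404
        # The second delete must not change anything.
        assert store.snapshot() == after_first
        assert target not in after_first
    return True

check_delete_idempotent()
```

Comparing snapshots before and after the second call is what checks "no side effects"; the status codes alone would not.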

3. State Machine Transitions

Spec: "Order can transition: draft → submitted → paid → shipped. Cannot skip states."

Property test: For any valid sequence of transitions, the final state is reachable and consistent. For any invalid sequence, the system rejects it.
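One way to sketch this is to drive a small order object with random transition attempts and compare each outcome against an independent model of the rule. The `Order` class below is a hypothetical implementation written only for this example.

```python
import random

STATES = ["draft", "submitted", "paid", "shipped"]

class Order:
    """Hypothetical order that only allows the next state in the chain."""
    ALLOWED = {("draft", "submitted"), ("submitted", "paid"), ("paid", "shipped")}

    def __init__(self):
        self.state = "draft"

    def transition(self, target):
        if (self.state, target) not in self.ALLOWED:
            raise ValueError(f"illegal transition {self.state} -> {target}")
        self.state = target

def check_transitions(runs=500, seed=0):
    rng = random.Random(seed)
    for _ in range(runs):
        order = Order()
        for _ in range(rng.randint(0, 8)):
            target = rng.choice(STATES)
            before = order.state
            # Model: a step is legal only if it moves exactly one state forward.
            legal = STATES.index(target) == STATES.index(before) + 1
            try:
                order.transition(target)
                accepted = True
            except ValueError:
                accepted = False
            assert accepted == legal
            # Rejected transitions must leave the state untouched.
            assert order.state == (target if legal else before)
    return True

check_transitions()
```

The model (`legal`) is deliberately computed differently from the implementation's lookup table, so a mistake in one is unlikely to be mirrored in the other.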

4. Mathematical Properties

Spec: "Pagination: offset + limit ≤ total. Page N has at most limit items."

Property test: For any valid (page, limit, total), the returned items count ≤ limit, and offset is correct.
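A minimal sketch of such a pagination property, assuming a hypothetical `paginate` function with 1-based pages. Besides the per-page size limit, it checks that concatenating all pages reproduces the full list, which catches wrong offsets.

```python
import math
import random

def paginate(items, page, limit):
    """Hypothetical implementation: 1-based page, at most `limit` items."""
    offset = (page - 1) * limit
    return items[offset:offset + limit]

def check_pagination(runs=500, seed=0):
    rng = random.Random(seed)
    for _ in range(runs):
        total = rng.choice([0, 1, rng.randint(0, 60)])  # bias toward boundaries
        limit = rng.randint(1, 20)
        items = list(range(total))
        page_count = math.ceil(total / limit)
        pages = [paginate(items, p, limit) for p in range(1, page_count + 1)]
        assert all(len(p) <= limit for p in pages)
        # Every page except possibly the last is full.
        assert all(len(p) == limit for p in pages[:-1])
        # Pages are contiguous and complete: no gaps, no overlaps.
        assert [x for p in pages for x in p] == items
        # The page after the last one is empty rather than an error.
        assert paginate(items, page_count + 1, limit) == []
    return True

check_pagination()
```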

5. Roundtrip / Serialization

Spec: "Bookmark can be serialized to JSON and deserialized back without loss."

Property test: For any valid bookmark B, JSON.parse(JSON.stringify(B)) is equivalent to B (for relevant fields).
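The same idea can be sketched in Python with the standard `json` module. The bookmark shape (a `url` plus an optional `title`) and the character alphabet are assumptions chosen to include characters that commonly break serializers.

```python
import json
import random

# Includes quotes, backslashes, control characters, and non-ASCII text.
ALPHABET = "abcXYZ019 /:?&=\"\\\n\t\u0000éß漢字🙂"

def random_text(rng, max_len):
    return "".join(rng.choice(ALPHABET) for _ in range(rng.randint(0, max_len)))

def random_bookmark(rng):
    """Hypothetical bookmark shape: url plus optional title."""
    return {
        "url": "https://example.com/" + random_text(rng, 20),
        "title": rng.choice([None, random_text(rng, 200)]),
    }

def check_json_roundtrip(runs=500, seed=0):
    rng = random.Random(seed)
    for _ in range(runs):
        bookmark = random_bookmark(rng)
        assert json.loads(json.dumps(bookmark)) == bookmark
    return True

check_json_roundtrip()
```

In JavaScript, `JSON.stringify` omits properties whose value is `undefined`, which is one reason the property above is stated "for relevant fields" rather than as strict equality.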

6. Commutativity / Associativity

Spec: "Merging two bookmark lists is commutative: merge(A,B) = merge(B,A)."

Property test: For any lists A and B, merge(A,B) and merge(B,A) produce equivalent results.
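A small sketch of the commutativity check, assuming a hypothetical `merge` that treats bookmark lists as sets of URLs and returns them in a canonical order.

```python
import random

def merge(a, b):
    """Hypothetical merge: union of bookmark URLs in canonical (sorted) order."""
    return sorted(set(a) | set(b))

def check_merge_commutative(runs=500, seed=0):
    rng = random.Random(seed)
    urls = [f"https://example.com/{i}" for i in range(10)]
    for _ in range(runs):
        a = [rng.choice(urls) for _ in range(rng.randint(0, 6))]
        b = [rng.choice(urls) for _ in range(rng.randint(0, 6))]
        assert merge(a, b) == merge(b, a)
        # Every input URL survives the merge.
        assert all(u in merge(a, b) for u in a + b)
    return True

check_merge_commutative()
```

A merge that resolves conflicts with "first list wins" would fail this property, which is useful feedback: either the implementation or the spec's claim of commutativity needs to change.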


Specification-Derived Properties

In SDD, properties flow from the specification. Extract invariants and turn them into property tests.

From Constraints

Spec: "Bookmarks are always user-scoped. A user cannot access another user's bookmarks."

Property: For any user U1, U2 (U1 ≠ U2) and bookmark B belonging to U1, when U2 requests B, the result is 403 or empty.

Property test: Generate two users, create bookmark for U1, assert U2 cannot access it.
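A sketch of this check against an in-memory store follows. `ScopedStore` and its status codes are hypothetical stand-ins for the real API.

```python
import random

class ScopedStore:
    """Hypothetical store that records the owner of each bookmark."""

    def __init__(self):
        self._owner = {}  # bookmark id -> user id

    def create(self, user, url):
        bookmark_id = len(self._owner) + 1
        self._owner[bookmark_id] = user
        return bookmark_id

    def get_status(self, user, bookmark_id):
        """HTTP-like status: 200 for the owner, 403 for others, 404 if missing."""
        if bookmark_id not in self._owner:
            return 404
        return 200 if self._owner[bookmark_id] == user else 403

    def list_ids(self, user):
        return [i for i, owner in self._owner.items() if owner == user]

def check_user_scoping(runs=300, seed=0):
    rng = random.Random(seed)
    for _ in range(runs):
        store = ScopedStore()
        u1, u2 = rng.sample(range(1000), 2)  # two distinct user ids
        ids = [store.create(u1, f"https://example.com/{n}")
               for n in range(rng.randint(1, 5))]
        for bookmark_id in ids:
            assert store.get_status(u1, bookmark_id) == 200
            assert store.get_status(u2, bookmark_id) == 403
        assert store.list_ids(u2) == []
    return True

check_user_scoping()
```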

From NFRs (Non-Functional Requirements)

Spec: "Response time < 200ms for any valid request under normal load."

Property: For any valid request R, latency(handle(R)) < 200.

Property test: Generate many valid requests; measure latency; assert all under 200ms. In noisy environments (shared CI runners, for example), relax this to a percentile check such as p95 < 200ms.
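A percentile-based version can be sketched as follows. The `handle` function is a trivial hypothetical stand-in for the real request handler, and the nearest-rank percentile calculation is one simple choice among several.

```python
import random
import time

def handle(request):
    """Hypothetical stand-in for the real request handler."""
    return {"status": 200, "url": request["url"]}

def latency_percentile_ms(handler, requests, pct=95):
    """Time each request and return the pct-th percentile in milliseconds."""
    samples = []
    for request in requests:
        start = time.perf_counter()
        handler(request)
        samples.append((time.perf_counter() - start) * 1000.0)
    samples.sort()
    index = min(len(samples) - 1, int(len(samples) * pct / 100))
    return samples[index]

rng = random.Random(0)
requests = [{"url": f"https://example.com/{rng.randrange(10**6)}"}
            for _ in range(200)]
p95 = latency_percentile_ms(handle, requests)
assert p95 < 200, f"p95 latency {p95:.2f}ms exceeds the 200ms budget"
```

Asserting on p95 tolerates a few slow outliers (a garbage-collection pause, a cold cache) while still failing when the typical request is too slow.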

From Data Models

Spec: "Email format: local@domain.tld. No spaces. Max 254 chars."

Property: For any string s that passes validation, s matches the email regex and length ≤ 254.

Property test: Use a generator for valid emails (or invalid ones); assert validation accepts/rejects correctly.
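As an illustration, the sketch below pairs a simplified validator with generators for valid and invalid addresses. The regular expression is an assumption that mirrors the spec line above (local@domain.tld, no spaces, max 254 chars); it is not a full RFC-compliant check.

```python
import random
import re
import string

# Simplified pattern (assumption): something@something.something, no whitespace.
EMAIL_RE = re.compile(r"[^\s@]+@[^\s@]+\.[^\s@]+")

def is_valid_email(s):
    return len(s) <= 254 and EMAIL_RE.fullmatch(s) is not None

def word(rng, lo, hi):
    chars = string.ascii_lowercase + string.digits
    return "".join(rng.choice(chars) for _ in range(rng.randint(lo, hi)))

def valid_email(rng):
    return f"{word(rng, 1, 20)}@{word(rng, 1, 20)}.{word(rng, 2, 6)}"

def invalid_email(rng):
    """Break a valid address in one of three ways."""
    email = valid_email(rng)
    kind = rng.choice(["space", "no_at", "too_long"])
    if kind == "space":
        i = rng.randint(0, len(email))
        return email[:i] + " " + email[i:]
    if kind == "no_at":
        return email.replace("@", "")
    return "a" * 255 + email  # exceeds the 254-character limit

def check_email_validation(runs=500, seed=0):
    rng = random.Random(seed)
    for _ in range(runs):
        assert is_valid_email(valid_email(rng))
        assert not is_valid_email(invalid_email(rng))
    return True

check_email_validation()
```

Deriving invalid inputs by mutating valid ones keeps each rejection tied to a single, known rule violation.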


Tools: fast-check, Hypothesis, QuickCheck

fast-check (JavaScript/TypeScript)

  • API: fc.assert(fc.property(generators..., predicate))
  • Generators: fc.string(), fc.integer(), fc.array(), fc.record(), fc.uuid(), etc.
  • Shrinking: Automatic; finds minimal failing case
  • Use: Node and browser

Example:

import * as fc from 'fast-check';

// expect() is the test runner's global assertion helper (e.g., Jest)
fc.assert(
  fc.property(fc.array(fc.integer()), (arr) => {
    const sorted = [...arr].sort((a, b) => a - b);
    for (let i = 1; i < sorted.length; i++) {
      expect(sorted[i]).toBeGreaterThanOrEqual(sorted[i - 1]);
    }
    expect(sorted.length).toBe(arr.length);
  })
);

Hypothesis (Python)

  • API: the @given(...) decorator, fed with strategies from hypothesis.strategies
  • Generators: st.integers(), st.text(), st.lists(), st.dictionaries(), etc.
  • Shrinking: Automatic
  • Use: With pytest

Example:

from hypothesis import given, strategies as st

@given(st.lists(st.integers()))
def test_sort_produces_sorted_list(arr):
    sorted_arr = sorted(arr)
    assert all(sorted_arr[i] <= sorted_arr[i+1] for i in range(len(sorted_arr)-1))
    assert len(sorted_arr) == len(arr)

QuickCheck (Haskell)

  • Origin: The original property-based testing library
  • API: quickCheck (property)
  • Generators: Arbitrary typeclass
  • Use: Haskell and languages with QuickCheck ports (Erlang, Scala, etc.)

Tutorial: Write Property-Based Tests for the Running Project

This tutorial walks you through writing property-based tests for the Bookmarks feature. You will identify invariants from the spec, write generators, write property assertions, and analyze shrinking on failures.

Step 0: Invariants from the Specification

From the Bookmarks spec:

Invariant | Source | Property
URL is valid (http/https) | FR-004 | For any bookmark B created via API, B.url matches URL regex
Title ≤ 200 chars | FR-005 | For any bookmark B, B.title?.length ≤ 200
IDs are unique | Implicit | For any two created bookmarks, ids differ
Roundtrip | Implicit | For any valid bookmark input, create then get returns equivalent data
User-scoped | AC-002 | For any user U, getBookmarks(U) returns only U's bookmarks

We will implement property tests for: URL validation, title length, and roundtrip.

Step 1: Install fast-check

npm install -D fast-check

Step 2: Write Generators for Test Data

Create tests/property/generators.ts:

import * as fc from 'fast-check';

// Valid URL: http or https, non-empty host
export const validUrlArb = fc.webUrl({ validSchemes: ['http', 'https'] });

// Invalid URL: no protocol, or invalid format
export const invalidUrlArb = fc.oneof(
  fc.string({ minLength: 1, maxLength: 100 }).filter(s => !s.includes('://')),
  fc.constant(''),
  fc.constant('ftp://example.com') // wrong scheme
);

// Title: optional, max 200 chars
export const titleArb = fc.option(
  fc.string({ minLength: 0, maxLength: 200 }),
  { nil: undefined }
);

// Bookmark input for API
export const bookmarkInputArb = fc.record({
  url: validUrlArb,
  title: titleArb,
});

Step 3: Property: URL Validation

Create tests/property/bookmark-validation.property.test.ts:

import * as fc from 'fast-check';
import { validateBookmarkInput } from '../../src/bookmarks/validation';
import { validUrlArb, invalidUrlArb, titleArb } from './generators';

describe('Bookmark validation properties', () => {
  it('accepts any valid URL (http/https)', () => {
    fc.assert(
      fc.property(validUrlArb, titleArb, (url, title) => {
        const result = validateBookmarkInput({ url, title: title ?? undefined });
        expect(result.url).toBe(url);
      }),
      { numRuns: 500 }
    );
  });

  it('rejects any invalid URL', () => {
    fc.assert(
      fc.property(invalidUrlArb, (url) => {
        expect(() => validateBookmarkInput({ url })).toThrow();
      }),
      { numRuns: 500 }
    );
  });

  it('title is always ≤ 200 chars when provided', () => {
    fc.assert(
      fc.property(validUrlArb, fc.string({ minLength: 0, maxLength: 200 }), (url, title) => {
        const result = validateBookmarkInput({ url, title });
        if (result.title) {
          expect(result.title.length).toBeLessThanOrEqual(200);
        }
      }),
      { numRuns: 500 }
    );
  });
});

Step 4: Property: Roundtrip (Create then Get)

Create tests/property/bookmark-roundtrip.property.test.ts:

import * as fc from 'fast-check';
import { createTestUser, createBookmark, getBookmark } from '../helpers';
import { validUrlArb, titleArb } from './generators';

describe('Bookmark roundtrip property', () => {
  it('create then get returns equivalent bookmark for any valid input', async () => {
    const user = await createTestUser();

    await fc.assert(
      fc.asyncProperty(validUrlArb, titleArb, async (url, title) => {
        const input = { url, title: title ?? undefined };
        const created = await createBookmark(user.token, input);
        const fetched = await getBookmark(user.token, created.id);

        expect(fetched.id).toBe(created.id);
        expect(fetched.url).toBe(created.url);
        expect(fetched.title ?? undefined).toBe(input.title ?? undefined);
        expect(fetched.createdAt).toBeDefined();
      }),
      { numRuns: 50 } // Fewer runs for async/API tests
    );
  }, 30000); // Timeout for async property
});

Step 5: Run and Analyze Shrinking

Run the property tests:

npm run test:property

If a property fails, fast-check will shrink the input. For example, if "rejects any invalid URL" fails for some edge case, you might see:

Error: Property failed after 3 runs
{ seed: 12345, path: "2:0:0:0:0:0", end: "..."
Counterexample: [""]
Shrunk 5 times

The counterexample "" is the minimal failing input. You might have assumed empty string was handled, but the validator allows it. Fix the validator (or the generator if empty should be excluded).

Step 6: Combine with Spec-Driven Tests

Property-based tests complement example-based spec-driven tests:

Spec-Driven (Example) | Property-Based
AC-001: POST with valid URL returns 201 | For any valid URL, POST returns 201 and bookmark has that URL
Edge case: empty URL returns 400 | For any invalid URL (including empty), POST returns 400
AC-002: GET returns user's bookmarks | For any user U, GET returns only U's bookmarks (invariant)

Run both. Example tests give you specific, traceable coverage. Property tests give you broad invariant coverage. Together they provide stronger assurance.


Common Properties: Roundtrip, Idempotency, Commutativity, Invariant Preservation

Roundtrip

Property: decode(encode(x)) === x (or equivalent for your domain)

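Use when: Serialization, lossless encoding or compression, API create→get. (Hashing does not qualify: a hash is one-way, so there is no decode step to round-trip through.)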

Example: JSON.parse(JSON.stringify(bookmark)) preserves bookmark fields.

Idempotency

Property: f(f(x)) === f(x) — applying the operation twice is the same as once.

Use when: DELETE, PUT (replace), "mark as read."

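Example: DELETE /bookmarks/:id. First call: 204. Second call: 404. The status codes differ, but the server state after the second call is identical to the state after the first, so there are no duplicate side effects.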
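For a pure function, the f(f(x)) === f(x) form can be checked directly. The sketch below uses a hypothetical `normalize_title` helper that trims and collapses whitespace; it is an invented example, not part of the Bookmarks spec.

```python
import random

def normalize_title(title):
    """Hypothetical normalizer: trim and collapse runs of whitespace."""
    return " ".join(title.split())

def check_idempotent(runs=500, seed=0):
    rng = random.Random(seed)
    alphabet = "ab \t\n\u00a0é"  # letters plus several kinds of whitespace
    for _ in range(runs):
        title = "".join(rng.choice(alphabet) for _ in range(rng.randint(0, 30)))
        once = normalize_title(title)
        # Applying the function again must change nothing.
        assert normalize_title(once) == once
    return True

check_idempotent()
```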

Commutativity

Property: f(a, b) === f(b, a) — order does not matter.

Use when: Merge operations, set unions, combining filters.

Example: merge(bookmarksA, bookmarksB) equivalent to merge(bookmarksB, bookmarksA).

Invariant Preservation

Property: For any valid input, output satisfies invariant I.

Use when: Data model constraints, business rules.

Example: For any created bookmark, bookmark.userId === authenticatedUser.id.

Inverse Operations

Property: g(f(x)) === x — one operation undoes the other.

Use when: Encode/decode, encrypt/decrypt, compress/decompress.

Example: decrypt(encrypt(plaintext)) === plaintext.
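Compression is an easy inverse pair to try with the Python standard library. This sketch checks that `zlib.decompress` undoes `zlib.compress` for random byte strings, including the empty one; the helper name `check_compress_roundtrip` is made up for the example.

```python
import random
import zlib

def check_compress_roundtrip(runs=300, seed=0):
    rng = random.Random(seed)
    for _ in range(runs):
        size = rng.randint(0, 512)
        data = bytes(rng.randrange(256) for _ in range(size))
        assert zlib.decompress(zlib.compress(data)) == data
    return True

check_compress_roundtrip()
```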


When Property-Based Testing Catches Bugs That Example Tests Miss

Example 1: Unicode and Edge Characters

Example test: "hello" → encode → decode → "hello" ✓

Property test: For any string s, decode(encode(s)) = s.

Bug found: Strings with emoji or null bytes break the roundtrip. Example test never tried them. Property test generates them and finds the bug.

Example 2: Numeric Boundaries

Example test: Pagination with page=1, limit=10 ✓

Property test: For any (page, limit, total) with valid ranges, returned count ≤ limit.

Bug found: When total=0 and page=1, the implementation returns 500 or wrong count. Example test never tried total=0.

Example 3: Concurrency and Ordering

Example test: Create 3 bookmarks, list returns 3 ✓

Property test: For any set of created bookmarks, list returns exactly those, no duplicates, no extras.

Bug found: When the property test issues creations concurrently, duplicate IDs or lost writes from race conditions appear. The example test never sees them because it runs sequentially.

Example 4: Shrinking Reveals Minimal Failure

Property fails for some large input. Shrinking reduces it to a minimal case: e.g., empty array, or single element with a specific value. The minimal case is easier to debug and often reveals a simple logic error (e.g., "forgot to handle empty list").


Combining Property-Based Tests with Spec-Driven Tests

In SDD, you have:

  1. Spec-driven example tests — From acceptance criteria, edge cases, user journeys. Traceable to requirements.
  2. Property-based tests — From invariants. Verify "for any" and "always" claims.

Workflow

  1. Parse spec for invariants ("always," "for any," "never," "must").
  2. Write property tests for those invariants.
  3. Write example tests for acceptance criteria and edge cases.
  4. Run both in CI.
  5. Traceability: Property tests can link to spec sections (e.g., "FR-004: URL validation invariant").

Coverage

  • Spec coverage: Example tests cover requirements (AC-001, AC-002, etc.).
  • Invariant coverage: Property tests cover invariants (unique IDs, user-scoped, roundtrip).

Aim for both. Some requirements are naturally invariant (e.g., "IDs are unique") and get property tests. Others are scenario-based (e.g., "Given X, when Y, then Z") and get example tests.


Debugging Property Test Failures

When a property test fails, the framework provides a counterexample and often a seed. Use these to debug.

Step 1: Reproduce with the Counterexample

The shrink process produces a minimal failing input. Copy it and write a unit test:

it('reproduces property failure', () => {
  const input = ''; // Counterexample from property test
  expect(() => validateBookmarkInput({ url: input })).toThrow();
});

Run the unit test. It should fail the same way. Now you can debug with a fixed, minimal input.

Step 2: Use the Seed for Reproducibility

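fast-check reports a seed (and a shrink path) when a property fails. Hypothesis replays earlier failures from its example database and can print a @seed or @reproduce_failure hint. Re-run with the reported seed to get the same generated inputs: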

fc.assert(
  fc.property(validUrlArb, (url) => { ... }),
  { seed: 12345 } // From failure output
);

This ensures the failure is reproducible in CI or on another machine.
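The reason a seed is enough is that every generated value is derived from one seeded pseudo-random generator. A tiny Python sketch of that idea, using the standard `random` module and a hypothetical `generate_inputs` helper:

```python
import random

def generate_inputs(seed, count=5):
    """Derive every generated value from a single seeded generator."""
    rng = random.Random(seed)
    return [rng.randint(-1000, 1000) for _ in range(count)]

first = generate_inputs(12345)
second = generate_inputs(12345)
assert first == second  # same seed, same inputs, on any machine
```

Anything that draws randomness from outside the seeded generator (wall-clock time, unseeded UUIDs, thread scheduling) breaks this guarantee, so keep such sources out of the generators.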

Step 3: Increase Verbosity

Some frameworks support verbose output to see each generated input:

fc.assert(
  fc.property(generator, predicate),
  { verbose: 2 } // Log each run
);

This helps when the counterexample alone is not enough to understand the failure.

Step 4: Check the Property

Sometimes the property is wrong, not the implementation. Ask: "Is this property actually required by the spec?" If the spec says "reject invalid URLs" and your property says "reject any string without ://", maybe the spec allows more (e.g., relative URLs). Align the property with the spec.


Common Pitfalls in Property-Based Testing

Pitfall 1: Testing the Wrong Thing

Wrong: Property says "for any input, function doesn't throw." But the spec says "reject invalid input with error." Throwing is correct for invalid input.

Right: Property says "for any valid input, function returns valid output" and "for any invalid input, function throws or returns error."

Pitfall 2: Generators That Don't Cover the Space

Wrong: fc.integer({ min: 1, max: 10 }) — never generates 0 or negative. You miss boundary bugs.

Right: Include edge cases in the generator: 0, -1, max+1, empty string, etc., when they're in the input domain.
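One simple way to cover boundaries is to mix hand-picked edge values into an otherwise random generator. The sketch below assumes a hypothetical `clamp_limit` function that forces a requested page size into the range 1 to 100; both the function and the 30% boundary bias are illustrative choices.

```python
import random

def clamp_limit(limit, lo=1, hi=100):
    """Hypothetical: force a requested page size into [lo, hi]."""
    return max(lo, min(hi, limit))

def limit_candidate(rng, lo=1, hi=100):
    """Mix hand-picked boundary values with uniformly random ones."""
    boundaries = [lo - 1, lo, lo + 1, hi - 1, hi, hi + 1, 0, -1]
    if rng.random() < 0.3:
        return rng.choice(boundaries)
    return rng.randint(lo - 50, hi + 50)

def check_clamp(runs=500, seed=0):
    rng = random.Random(seed)
    seen = set()
    for _ in range(runs):
        n = limit_candidate(rng)
        seen.add(n)
        result = clamp_limit(n)
        assert 1 <= result <= 100
        if 1 <= n <= 100:
            assert result == n  # in-range values pass through unchanged
    return seen

seen = check_clamp()
```

Returning the set of generated values makes it easy to confirm that out-of-range inputs were actually exercised, rather than assuming the generator reached them.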

Pitfall 3: Async Properties Without Timeout

Wrong: Async property runs forever when the API hangs.

Right: Set a timeout: fc.assert(fc.asyncProperty(...), { timeout: 10000 }) or use test framework timeout.

Pitfall 4: Too Many Runs for Slow Properties

Wrong: 10,000 runs for an async property that hits the API. Takes hours.

Right: Reduce numRuns for slow properties (e.g., 20–50). Use more runs for fast, pure functions.

Pitfall 5: Ignoring Shrink Results

Wrong: "The property failed for some random input" — dismiss and add a filter to exclude it.

Right: The minimal counterexample is valuable. It often reveals a simple bug (empty array, null, boundary value). Fix the root cause.


Hypothesis Deep Dive: Strategies and Composition

Hypothesis (Python) provides powerful strategies. Understanding them helps you write better property tests.

Basic Strategies

st.integers(min_value=0, max_value=100)
st.text(min_size=0, max_size=200, alphabet=st.characters(whitelist_categories=('L', 'N')))
st.lists(st.integers(), min_size=0, max_size=10)
st.dictionaries(st.text(), st.integers())
st.one_of(st.just(1), st.just(2), st.integers())
st.sampled_from(['a', 'b', 'c'])

Composing Strategies

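from dataclasses import dataclass
from typing import Optional

from hypothesis import given, strategies as st
from hypothesis.provisional import urls  # generates http/https URLs

@dataclass
class Bookmark:
    url: str
    title: Optional[str]

bookmark_strategy = st.builds(
    Bookmark,
    url=urls(),
    title=st.one_of(st.none(), st.text(max_size=200)),
)

# serialize/deserialize are the project's own functions under test
@given(bookmark_strategy)
def test_bookmark_roundtrip(bookmark):
    assert serialize(deserialize(serialize(bookmark))) == serialize(bookmark)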

Conditional Filters

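import math
from hypothesis import assume, given, strategies as st

@given(st.integers())
def test_integer_square_root(n):
    assume(n >= 0)  # Discard negative inputs; only test non-negative
    root = math.isqrt(n)  # float sqrt(n) ** 2 == n fails for most n (e.g., 2)
    assert root * root <= n < (root + 1) ** 2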

Stateful Testing

Hypothesis can generate sequences of operations for stateful systems:

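from hypothesis.provisional import urls
from hypothesis.stateful import RuleBasedStateMachine, rule

# create_bookmark and list_bookmarks are the project's functions under test;
# the backing store must start empty for each run.
class BookmarkMachine(RuleBasedStateMachine):
    def __init__(self):
        super().__init__()
        self.bookmarks = []  # model of what the system should contain

    @rule(url=urls())
    def add_bookmark(self, url):
        b = create_bookmark(url)
        self.bookmarks.append(b)
        assert b.url == url

    @rule()
    def check_list(self):
        result = list_bookmarks()
        assert len(result) == len(self.bookmarks)

# Expose the state machine to pytest/unittest
TestBookmarks = BookmarkMachine.TestCase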

When Not to Use Property-Based Testing

Property-based testing is not always the best choice.

When to Prefer Example-Based Tests

  • Traceability: Example tests map 1:1 to acceptance criteria. Property tests are more abstract.
  • Readability: "Given user X, when they do Y, then Z" is clearer for stakeholders than "for any user, P holds."
  • Performance: Property tests can be slow if they hit the network or DB. Example tests are often faster.
  • Determinism: Property tests use randomness. Flaky failures can occur (rare with good frameworks). Example tests are deterministic.
  • Complex setup: If each test needs a unique setup (e.g., specific DB state), property tests can be cumbersome. Example tests are simpler.

Hybrid Approach

Use both:

  • Example tests for: AC-001, AC-002, edge cases, user journeys. Traceable, readable, fast.
  • Property tests for: invariants, roundtrip, idempotency, "for any" claims. Broad coverage, catches edge cases.

Run example tests on every commit. Run property tests on every commit or nightly (if slow). Both contribute to confidence.


Frequently Asked Questions

Q: How many runs should a property test use?
A: fast-check, Hypothesis, and QuickCheck each default to 100 runs. For fast pure functions, 1000–10000 is reasonable. For async/API tests, 20–100. Adjust based on speed and coverage needs.

Q: Property tests are flaky. What do I do?
A: Use a fixed seed for reproducibility. Ensure generators don't produce invalid inputs that your code rejects. Check for timing-dependent behavior (e.g., race conditions). Increase timeout if needed.

Q: Can I use property-based testing in a legacy codebase?
A: Yes. Start with one invariant. Write the property test. It may fail and reveal bugs. Fix the bugs, or mark the test as an expected failure (for example, with pytest's xfail marker) and track the fix. Gradually add more properties.

Q: How do I test properties that involve external resources (DB, API)?
A: Use fewer runs, longer timeout, and possibly a test database. Or use a fake/simplified implementation for the property tests and keep integration tests for the real thing.

Q: What if the minimal counterexample is huge?
A: The shrink process aims for minimal, but sometimes it's still large. Use the seed to reproduce. Add a unit test with the counterexample. Debug from there. Consider whether the property can be split into smaller properties.


Try With AI

Prompt 1: Invariant Extraction

"I have a feature specification at specs/005-bookmarks/spec.md. Extract all invariants—statements that use 'always,' 'never,' 'for any,' 'must,' or similar. For each invariant, suggest a property-based test in natural language. Output as a table: Invariant | Property Test Description."

Prompt 2: Generator Design

"I need to write property-based tests for bookmark validation. The spec says: URL must be http/https, title max 200 chars. Design fast-check generators for (1) valid bookmark inputs, (2) invalid URL inputs. Show the generator code and explain what each covers."

Prompt 3: Shrinking Analysis

"My property test failed with this counterexample after shrinking: { url: '', title: 'x' }. The property is 'rejects invalid URL.' What does this tell me? How should I fix the validator or the generator? Write the fix."

Prompt 4: Property from NFR

"The spec says 'Response time < 200ms for any valid request.' How would I write a property-based test for this? What are the challenges (flakiness, environment)? Suggest a practical approach with fast-check or Hypothesis."


Practice Exercises

Exercise 1: Invariant to Property

Choose a function or API from your project (e.g., sort, validation, pagination). Identify one invariant it should satisfy. Write a property-based test for it using fast-check or Hypothesis. Run it; if it fails, analyze the shrink output and fix the implementation or the property.

Expected outcome: A passing property test and a brief note on what the property guarantees.

Exercise 2: Roundtrip Property

For a data structure (e.g., bookmark, user, config) that can be serialized to JSON and deserialized, write a roundtrip property: for any valid instance, deserialize(serialize(x)) is equivalent to x. Handle edge cases (empty, optional fields, special characters). Run and fix any failures.

Expected outcome: A roundtrip property test that passes for generated data.

Exercise 3: Compare Example vs. Property

Write an example-based test for "sort produces sorted output" with 3 hand-picked inputs. Then write a property-based test for the same. Intentionally introduce a bug in the sort implementation (e.g., off-by-one). Run both tests. Which catches the bug? Reflect on the difference.

Expected outcome: Both tests (or a write-up showing property test's advantage in catching the bug with generated inputs).


Key Takeaways

  1. Property-based testing verifies invariants for any valid input, not just hand-picked examples. It uses random generation and shrinking to find minimal failing cases.

  2. Use property-based testing for data model invariants, API idempotency, state machines, roundtrip/serialization, commutativity, and NFRs like "for any valid input, P holds."

  3. Derive properties from specifications — look for "always," "never," "for any," "must." These map directly to property tests.

  4. Tools: fast-check (JS/TS), Hypothesis (Python), QuickCheck (Haskell). All provide generators and automatic shrinking.

  5. Combine with spec-driven tests: Example tests for acceptance criteria; property tests for invariants. Both run in CI; both contribute to confidence.

  6. Property-based testing catches bugs that example tests miss: edge characters, boundary values, concurrency, and cases you never thought to try. Shrinking makes failures debuggable.


Chapter Quiz

  1. What is the difference between property-based testing and example-based testing? Give one example of each for the same function.

  2. Name four scenarios in SDD where property-based testing is appropriate. For each, state the invariant you would test.

  3. What is "shrinking" in property-based testing? Why is it useful?

  4. How do you derive properties from a specification? What words or phrases in a spec suggest an invariant?

  5. What is a roundtrip property? Give an example from the Bookmarks domain.

  6. What is an idempotency property? Give an example from an API.

  7. Why might a property-based test catch a bug that an example-based test misses? Give two reasons.

  8. How do property-based tests fit with spec-driven example tests? Should you use one or both, and why?

