Chapter 8: Context Engineering Fundamentals
Learning Objectives
By the end of this chapter, you will be able to:
- Define context engineering and distinguish it from prompt engineering
- Apply the 4Ds framework: Delegation, Description, Discernment, Diligence
- Manage AI context windows effectively, including token budgets and priority loading
- Understand and mitigate the "lost in the middle" effect
- Design multi-session workflows with context isolation and checkpoints
- Build a context engineering strategy for specification-driven projects
What Is Context Engineering?
Prompt engineering is about crafting the right question. Context engineering is about building the right environment for AI to answer any question well.
As Shopify CEO Tobi Lütke put it, in a remark Andrej Karpathy endorsed: "Context engineering is the art of providing all the context for the task to be plausibly solvable by the LLM."
Think of it this way: a prompt is a single instruction. Context is everything the AI knows when it processes that instruction — the system prompt, loaded files, previous conversation turns, tool descriptions, memory files, and the accumulated patterns from the session.
Prompt Engineering: "Write a user registration endpoint"
(a single instruction)
Context Engineering: System prompt + loaded project files +
coding standards + API conventions +
database schema + existing patterns +
security requirements + test patterns +
"Write a user registration endpoint"
(a complete working environment)
The same prompt produces radically different code depending on the context. With poor context, AI generates generic code. With engineered context, AI generates code that matches your project's patterns, respects your constraints, and integrates with your existing architecture.
The 4Ds Framework
The 4Ds provide a systematic approach to context engineering:
D1: Delegation — What Should AI Do?
Delegation is the art of knowing which tasks to give to AI and how to structure the handoff.
High-value delegation (AI excels):
- Implementing features from detailed specifications
- Generating tests from acceptance criteria
- Refactoring code while preserving behavior
- Converting between formats (Markdown to code, spec to test)
- Boilerplate generation (routes, models, validation)
Low-value delegation (humans should lead):
- Defining business requirements
- Making architectural trade-off decisions
- Setting project priorities
- Evaluating user experience quality
- Writing specifications (though AI can help draft and review)
The delegation rule: Delegate implementation, retain specification authority.
D2: Description — How Precisely Are You Communicating?
Description quality determines output quality. This is where specification precision directly translates to better AI output.
Vague description (forces AI to guess):
Add validation to the form
Precise description (eliminates guessing):
## Form Validation Requirements
### Email Field
- Required field
- Must match RFC 5322 email format
- Show error: "Please enter a valid email address"
- Validate on blur (not on keystroke)
### Password Field
- Required field
- Minimum 12 characters
- At least 1 uppercase, 1 lowercase, 1 digit, 1 special character
- Show strength indicator (weak/medium/strong)
- Show error only after first submission attempt
### Submit Behavior
- Disable button while submitting
- Show spinner during API call
- On 400 response: show field-level errors from API
- On 500 response: show generic error banner
The precise description produces exactly the validation behavior you need. The vague description produces whatever the AI considers "standard" validation.
D3: Discernment — Can You Evaluate AI Output?
Discernment is the ability to assess whether AI output meets your requirements. This is why specifications with acceptance criteria are essential — they give you a checklist for evaluation.
Without a specification, you evaluate subjectively: "Does this look right?"
With a specification, you evaluate objectively: "Does this satisfy each acceptance criterion?"
## Acceptance Criteria Checklist
- [x] Email validation uses RFC 5322 format
- [x] Password requires 12+ characters
- [ ] Password strength indicator shows weak/medium/strong ← MISSING
- [x] Submit button disables during API call
- [ ] Field-level errors from 400 response ← NOT IMPLEMENTED
- [x] Generic error on 500 response
Discernment transforms code review from opinion-based to criteria-based.
D4: Diligence — When Should You NOT Use AI?
Diligence is knowing when AI is not the right tool:
- Security-critical logic: Encryption, authentication, authorization — always require human review
- Data-sensitive operations: Anything involving PII, financial data, or compliance
- Novel algorithms: Unique business logic that doesn't match common patterns
- Ambiguous requirements: When you're still figuring out what to build
Expert Insight: The 4Ds create a feedback loop. As your Description improves (D2), your AI output improves, which makes Discernment easier (D3), which teaches you what to Delegate more effectively (D1), which helps you know when to exercise Diligence (D4).
Token Management and Context Windows
Every AI model has a context window — the total amount of text it can process at once. Understanding how to manage this window is a core context engineering skill.
The Token Budget
A typical context window (as of 2026) ranges from 100K to 2M tokens. But not all tokens are equal:
Token Budget Allocation:
System prompt & rules: 5-10%
Loaded project files: 30-40%
Conversation history: 20-30%
Current task context: 15-25%
AI response space: 10-15%
Total: 100% of context window
If you load too many files, conversation history shrinks. If conversation history is too long, there's less room for project context. Managing this budget is essential.
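The budget above can be turned into a simple planner. This is a rough sketch assuming a 128K window and midpoint percentages; the window size and category names are illustrative, not tied to any particular tool.

```python
# Rough token budget planner. Window size and share values are
# illustrative midpoints of the ranges above, not tool-specific numbers.
CONTEXT_WINDOW = 128_000  # tokens

BUDGET_SHARES = {
    "system_prompt_and_rules": 0.08,  # 5-10%
    "loaded_project_files":    0.35,  # 30-40%
    "conversation_history":    0.25,  # 20-30%
    "current_task_context":    0.20,  # 15-25%
    "ai_response_space":       0.12,  # 10-15%
}

def plan_budget(window: int, shares: dict) -> dict:
    """Convert fractional shares into absolute token allowances."""
    assert abs(sum(shares.values()) - 1.0) < 1e-9, "shares must total 100%"
    return {name: int(window * share) for name, share in shares.items()}

for name, tokens in plan_budget(CONTEXT_WINDOW, BUDGET_SHARES).items():
    print(f"{name:26s} {tokens:>7,} tokens")
```

Running the planner shows the trade-off directly: loading 10K more tokens of project files must come out of one of the other categories.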
The Lost in the Middle Effect
Research shows that LLMs perform best when important information is at the beginning or end of the context window. Information in the middle receives less attention — a phenomenon called "lost in the middle."
Attention Distribution in Context Window:
Position: [START ■■■■■■ MIDDLE ■■■ END ■■■■■■]
Attention: [HIGH ████ LOW ██ HIGH ████ ]
Practical implications for SDD:
- Put the most important specification sections (acceptance criteria, constraints) at the start of your prompt
- Place supporting context (background, references) in the middle
- End with the specific instruction or question
- For long specifications, summarize key requirements at the top before the full detail
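The placement rules above can be sketched as a small prompt assembler. The section labels and example strings below are hypothetical; the point is the start-middle-end ordering.

```python
def assemble_prompt(critical: str, supporting: str, instruction: str) -> str:
    """Order sections to counter the lost-in-the-middle effect:
    critical requirements first, bulky reference material in the
    middle, and the concrete instruction last."""
    return "\n\n".join([
        "## Key Requirements (summary)\n" + critical,
        "## Supporting Context\n" + supporting,
        "## Task\n" + instruction,
    ])

prompt = assemble_prompt(
    critical="- Password minimum: 12 characters\n- Validate email on blur",
    supporting="(full specification text, background, references)",
    instruction="Implement the registration form validation.",
)
print(prompt)
```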
Context Prioritization Strategy
When your context window fills up, prioritize:
Always include (highest priority):
1. Current specification being implemented
2. Acceptance criteria and constraints
3. API contracts and data models
4. Coding standards and conventions

Include when relevant (medium priority):
5. Related specification files
6. Existing code patterns to follow
7. Test patterns and examples
8. Architecture decision records

Load on demand (lower priority):
9. Full project documentation
10. Historical conversation context
11. General reference material
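One way to apply this prioritization mechanically is a greedy loader that fills the budget tier by tier. The file paths and token counts below are hypothetical; how you estimate counts and assign tiers in a real project is up to you.

```python
PRIORITY = {"always": 0, "relevant": 1, "on_demand": 2}

def select_context(files, budget):
    """Load files in priority order until the token budget is spent.
    Each file is a (path, tier, token_count) tuple."""
    loaded, used = [], 0
    for path, tier, tokens in sorted(files, key=lambda f: PRIORITY[f[1]]):
        if used + tokens <= budget:
            loaded.append(path)
            used += tokens
    return loaded

files = [
    ("docs/full-reference.md", "on_demand", 40_000),
    ("specs/feature/spec.md",  "always",     8_000),
    ("specs/feature/plan.md",  "always",     6_000),
    ("src/similar-feature/",   "relevant",  12_000),
    ("memory/constitution.md", "always",     5_000),
]
print(select_context(files, budget=30_000))
```

With a 30K budget, the three "always" items fit but the pattern reference does not; that is your cue to summarize it rather than load it whole.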
Context Isolation and Multi-Session Workflows
Complex projects require multiple AI sessions. Context isolation prevents cross-contamination between unrelated tasks.
The Task Similarity Framework
Not all tasks benefit from sharing context. Score task similarity to decide:
| Characteristic | Points | Example |
|---|---|---|
| Same domain | +30 | Both authentication tasks |
| Same data models | +20 | Both work with User model |
| Same service/component | +20 | Both modify auth service |
| Same file paths | +15 | Both work in api/ folder |
| Same test patterns | +15 | Both need integration tests |
Total possible: 100 points
- 50+ points: Work together (shared context adds value)
- Under 50 points: Isolate (patterns may interfere)
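The scoring table translates directly into a small helper. The five characteristics and their point values come from the table above; detecting each one for real tasks is left to you.

```python
WEIGHTS = {
    "same_domain":        30,
    "same_data_models":   20,
    "same_service":       20,
    "same_file_paths":    15,
    "same_test_patterns": 15,
}

def similarity_score(shared):
    """Sum the points for each shared characteristic (max 100)."""
    return sum(pts for name, pts in WEIGHTS.items() if name in shared)

def should_share_session(shared):
    """50+ points: shared context adds value; below 50: isolate."""
    return similarity_score(shared) >= 50

# Two authentication tasks touching the same User model and service:
pair = {"same_domain", "same_data_models", "same_service"}
print(similarity_score(pair), should_share_session(pair))  # 70 True
```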
Multi-Session Patterns
Sequential Isolation: Complete one task per session, then start fresh.
Parallel with Checkpoints: Long tasks across multiple sessions with checkpoint documents.
Convergent Isolation: Independent features developed in isolation, then integrated.
Checkpoint Documents
When splitting work across sessions, create a checkpoint document:
# Checkpoint: Auth Feature — Session 1
## Decisions Made
- JWT with RS256 algorithm for token signing
- Access token: 15 min TTL, refresh token: 7 day TTL
- Token storage: httpOnly cookies (not localStorage)
- Rate limiting: 5 failed attempts, then 15-min lockout
## Completed
- User model and migration
- Registration endpoint with email validation
- Login endpoint with JWT generation
- 8 tests passing (unit + integration)
## Next Steps
- Password reset flow
- Token refresh endpoint
- Session management (logout, list active sessions)
## Critical Context
- Using bcrypt cost factor 12 (decided based on benchmarks)
- Email uniqueness enforced at database level (UNIQUE constraint)
- All passwords validated against HaveIBeenPwned API
Load this checkpoint at the start of your next session to restore context efficiently.
Context Engineering for SDD Projects
In specification-driven projects, context engineering takes a specific form:
The SDD Context Stack
The stack layers, in order: constitution → feature specification → implementation plan → related specifications → existing patterns.

Each layer has a role. The constitution ensures every generation follows project principles. The feature specification defines what to build. The implementation plan defines how. Related specifications prevent integration conflicts. Existing patterns ensure consistency.
Practical Setup
For a Cursor-based SDD workflow:
- Project rules (`.cursor/rules/`): Constitution principles, coding standards, error handling patterns — loaded automatically
- Current spec file: Open the specification you're implementing — loaded as editor context
- Related specs: Reference in your prompt or open as additional tabs
- Implementation plan: Open alongside the spec
- Existing patterns: Point AI to a well-implemented similar feature as a template
Prompt structure for SDD implementation:
"Given:
- The specification in spec-user-registration.md (open file)
- The implementation plan in plan-user-registration.md (open file)
- The existing auth module in src/auth/ as a pattern reference
Implement Task 3: Registration endpoint with email validation.
Follow the API contract exactly as specified.
Write tests first, matching the acceptance criteria."
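If you issue this kind of prompt often, a small template helper keeps the structure consistent across features. This is a sketch; the `spec-{feature}.md` / `plan-{feature}.md` naming follows the chapter's example and is an assumption about your project layout.

```python
def sdd_prompt(feature: str, task: str, pattern_dir: str) -> str:
    """Render an SDD implementation prompt from a feature slug,
    a task description, and a pattern-reference directory."""
    return (
        "Given:\n"
        f"- The specification in spec-{feature}.md (open file)\n"
        f"- The implementation plan in plan-{feature}.md (open file)\n"
        f"- The existing code in {pattern_dir} as a pattern reference\n\n"
        f"{task}\n"
        "Follow the API contract exactly as specified.\n"
        "Write tests first, matching the acceptance criteria."
    )

print(sdd_prompt(
    "user-registration",
    "Implement Task 3: Registration endpoint with email validation.",
    "src/auth/",
))
```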
Tutorial: Build a Context Pack for a Single Feature
This tutorial turns the chapter concepts into a repeatable workflow. You will assemble a context pack for one feature, verify that each layer is present, and produce a short checkpoint document that another session could resume from.
Prerequisites
- A project with a `specs/` directory and at least one feature folder such as `specs/004-real-time-chat/`
- A project constitution or equivalent rules document
- One implementation plan and one code area you can use as a pattern reference
- An AI coding assistant that can work with open files or explicit file references
Step 1: Pick the Feature and Gather the Core Files
Choose one feature directory and list the files you expect to load first:
- `specs/[branch-name]/spec.md`
- `specs/[branch-name]/plan.md`
- `memory/constitution.md` or your project rules file
- One similar implementation area in `src/`
For the running project, a minimal pack might be:
specs/004-real-time-chat/spec.md
specs/004-real-time-chat/plan.md
memory/constitution.md
src/chat/
Expected result: You have a concrete feature pack instead of a vague idea of "load some context."
Step 2: Score Each Item by Priority
Before loading everything, classify each file:
| Item | Why it matters | Priority |
|---|---|---|
| `spec.md` | Defines the feature contract | Always load |
| `plan.md` | Defines architecture and sequencing | Always load |
| Constitution / rules | Prevents style and policy drift | Always load |
| Similar code pattern | Improves consistency | Load when relevant |
| Related specs | Prevents integration conflicts | Load when relevant |
| Old conversations | Useful only if still current | Load last |
If your context window is tight, summarize low-priority material before loading it.
Step 3: Create the Implementation Prompt
Use the context pack to build a prompt that names the files explicitly:
Given:
- specs/004-real-time-chat/spec.md
- specs/004-real-time-chat/plan.md
- memory/constitution.md
- src/chat/ as a pattern reference
Implement the next task for message delivery.
Match the API contract and acceptance criteria exactly.
List any assumptions before writing code.
Expected result: The prompt tells the model what to read, what to follow, and what not to invent.
Step 4: Check for Lost-in-the-Middle Risk
Before sending the prompt, make sure the most important constraints are near the top or bottom:
- Put acceptance criteria and non-negotiable constraints near the top.
- Put bulky reference material in the middle.
- End with the specific instruction.
If the spec is long, prepend a 5-10 line summary of the feature contract.
Step 5: Create a Checkpoint Document
After one focused session, capture the outcome in a checkpoint:
# Checkpoint: 004-real-time-chat - Session 1
## Completed
- Reviewed spec and plan
- Implemented message validation
- Added failing tests for unauthorized access
## Decisions Made
- Message length capped at 4000 characters
- Reuse existing auth middleware for room membership checks
## Next Steps
- Implement room participant authorization
- Add integration test for non-participant access
## Critical Context
- Contract source: specs/004-real-time-chat/contracts/
- Follow src/chat/ message persistence pattern
Expected result: Another engineer or a fresh AI session can resume without rereading the entire conversation.
Step 6: Review the Context Pack
Ask four quick questions:
- Did I load the current feature contract?
- Did I load the rules that constrain implementation?
- Did I include one strong pattern reference?
- Did I capture enough state to resume later?
If any answer is "no," fix that before the next session.
Try With AI
Prompt 1: Context Audit
"I'm going to start implementing a feature. Before I write any code, help me audit my context: (1) what files should I have loaded? (2) what project conventions should I remind you of? (3) what patterns from existing code should we follow? (4) what specifications does this feature depend on? Help me build the optimal context before we begin."
Prompt 2: Token Budget Planning
"I'm working on a project with [describe project size]. My AI has a [size] context window. Help me create a token budget: how much space for system rules, how much for the current spec, how much for existing code patterns, and how much for conversation. What's the most efficient way to load context?"
Prompt 3: The Lost-in-the-Middle Test
"I'm going to give you a long specification with a deliberate error buried in the middle. Your task: find the error. After you find it (or don't), we'll discuss how the position of information in context affects your performance, and what I can do to help you catch everything."
Prompt 4: Checkpoint Practice
"We're going to simulate a multi-session workflow. I'll describe the current state of a feature implementation. Help me write a checkpoint document that captures all decisions, progress, and critical context in under 500 tokens. Then start a 'new session' where you implement the next step using only the checkpoint."
Practice Exercises
Exercise 1: Context Quality Comparison
Implement the same specification twice:
- First time: just paste the spec and say "implement this"
- Second time: load the project constitution, coding standards, an existing pattern example, and the spec, then say "implement this following the loaded patterns"
Compare the output quality, consistency with your project style, and adherence to conventions.
Expected outcome: The context-rich version will be significantly more consistent with your project's existing code.
Exercise 2: Token Budget Exercise
Count the tokens in your most important specification files (rough estimate: 1 token per 4 characters). Create a loading priority list for a 128K token context window. What fits? What needs to be summarized? What can be loaded on demand?
Expected outcome: A prioritized context loading plan that maximizes important information within token limits.
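For the rough counts in Exercise 2, a quick estimator using the 1-token-per-4-characters heuristic is enough. Real tokenizers vary by model, so treat the result as a planning figure, not an exact count.

```python
from pathlib import Path

def estimate_tokens(text: str) -> int:
    """Rough heuristic: about one token per four characters."""
    return len(text) // 4

def estimate_file_tokens(path: str) -> int:
    """Estimate tokens for a file on disk."""
    return estimate_tokens(Path(path).read_text(encoding="utf-8"))

# A 128K window minus a 20% response reserve leaves this much for input:
window, response_reserve = 128_000, 0.20
input_budget = int(window * (1 - response_reserve))
print(input_budget)  # 102400
```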
Exercise 3: Multi-Session Planning
Plan a multi-session development workflow for a feature with 5+ components. Score task similarity for all pairs, design the session structure (sequential, parallel, or convergent), and write checkpoint templates for each session boundary.
Expected outcome: A complete session plan with checkpoint documents, demonstrating systematic context management across sessions.
Key Takeaways
- Context engineering is the discipline of managing everything an AI knows when it processes your requests. It's broader and more impactful than prompt engineering alone.
- The 4Ds framework (Delegation, Description, Discernment, Diligence) provides a systematic approach to AI collaboration.
- Token management requires budgeting context window space across system rules, project files, conversation history, and response space.
- The "lost in the middle" effect means important information should be placed at the start or end of context, not buried in the middle.
- Context isolation prevents unrelated tasks from contaminating each other. Use the task similarity framework to decide when to share or isolate context.
- Checkpoint documents preserve decisions, progress, and critical context across sessions.
- In SDD projects, the context stack (constitution → specification → plan → patterns) ensures consistent, spec-compliant code generation.
Chapter Quiz
1. What is the difference between prompt engineering and context engineering?
2. Name the 4Ds framework and explain each D in one sentence.
3. What is the "lost in the middle" effect and how do you mitigate it?
4. A project has 200 specification files totaling 500K tokens. Your context window is 128K tokens. What strategy do you use?
5. Two tasks score 30 on the similarity framework. Should they share a session? Why?
6. What belongs in a checkpoint document?
7. Describe the SDD context stack and explain why each layer matters.
8. When should you exercise "diligence" (D4) and avoid using AI?
9. How does context engineering improve the output of specification-driven code generation?
10. Design a token budget for a session implementing a feature with: project rules (5K tokens), feature spec (8K tokens), implementation plan (6K tokens), 3 related specs (4K each), and 5 existing code files (3K each).