Chapter 29: AI Governance and CI/CD Integration
Learning Objectives
By the end of this chapter, you will be able to:
- Define AI governance and explain why it is essential for AI-generated code
- Implement governance rules: mandatory code review, security scanning, and coverage thresholds
- Add LLM-specific security controls such as prompt-injection and tool-allowlist checks
- Design a spec-validated CI/CD pipeline with explicit gates
- Configure spec validation so incomplete or drifting artifacts fail the pipeline
- Define review policies for low-, medium-, and high-risk AI-generated changes
- Preserve traceability from specification to deployed code
What Is AI Governance?
AI governance is the set of policies, processes, and controls that ensure AI-generated code remains safe, compliant, and high-quality. When AI writes code, you lose the implicit trust that comes from human authorship. Governance restores that trust through verification.
Without governance, AI-generated code can introduce:
- Security vulnerabilities: hardcoded secrets, SQL injection, insecure dependencies
- Compliance violations: missing audit trails, incorrect data handling, weak approvals
- Quality regressions: untested code paths, broken contracts, performance degradation
- Spec drift: code that diverges from specifications without detection
AI governance answers a simple question: How do we ensure that what AI produces is fit for production?
Core Principles
- Never trust, always verify: AI output is treated as untrusted until it passes the required gates.
- Specification as contract: the specification is the source of truth; code must conform.
- Automated gates first: automate what can be verified mechanically; reserve humans for judgment and exceptions.
- Traceability: every deployed artifact should link back to the spec, commit, review, and test results.
- Auditability: approvals, overrides, and failures must be reviewable later.
Governance Rules
Governance rules are the concrete policies that enforce quality. They define what must pass before code reaches production.
Mandatory Code Review
Rule: All medium- and high-risk AI-generated code requires human review before merge.
Implementation:
- Require at least one approval on pull requests
- Require the reviewer to compare the change to the feature spec
- Require two approvals for high-risk paths such as auth, payments, or PII handling
Security Scanning
Rule: All code must pass security scans before deployment.
Common checks:
- SAST on source code
- Dependency audit for known CVEs
- DAST on release candidates or staging where appropriate
LLM-Specific Security Controls
Traditional AppSec scanning is necessary but insufficient for agentic systems. Add explicit controls for:
- prompt injection
- tool misuse / excessive agency
- data exfiltration
- untrusted retrieval poisoning
Example policy:
llm-security:
  prompt-injection-tests: required
  tool-allowlist: required
  network-egress-policy: restricted
  pii-redaction-before-model: required
  retrieval-source-trust-levels: required
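How these controls are enforced depends heavily on your agent framework. As a minimal sketch, assume your agent test harness writes a JSON transcript of tool calls and model output to agent-transcript.json; the file name, transcript shape, canary value, and tool names below are illustrative, not a standard:
// scripts/check-llm-policy.mjs (illustrative)
// Fails CI if the agent called a tool outside the allowlist, or if a canary value
// planted in the prompt-injection test corpus shows up in model output.
// Assumes the harness wrote agent-transcript.json as an array of { tool, output }.
import fs from "node:fs";

const allowedTools = new Set(["search_docs", "read_file", "create_bookmark"]); // from your policy
const canary = "CANARY-9f1e"; // planted in injection test prompts; must never be echoed back

const transcript = JSON.parse(fs.readFileSync("agent-transcript.json", "utf8"));
let failures = 0;

for (const step of transcript) {
  if (step.tool && !allowedTools.has(step.tool)) {
    console.error(`Tool not in allowlist: ${step.tool}`);
    failures += 1;
  }
  if (typeof step.output === "string" && step.output.includes(canary)) {
    console.error("Canary value leaked into model output: possible prompt injection.");
    failures += 1;
  }
}

if (failures > 0) process.exit(1);
console.log("LLM policy checks passed.");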
Coverage Thresholds
Rule: Coverage thresholds should be risk-tiered. Use 90%+ for high-risk paths; lower thresholds may be acceptable for low-risk changes with strong contract and integration coverage.
coverage:
  low-risk-minimum: 75
  medium-risk-minimum: 85
  high-risk-minimum: 90
  scope: changed-files
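The tutorial later in this chapter ships a fixed 90% coverage check. As a sketch of the risk-tiered variant this policy describes, assuming the pipeline exports the tier in a RISK_TIER environment variable (the variable name is an assumption):
// scripts/check-coverage-tiered.mjs (illustrative variant of the tutorial script)
import fs from "node:fs";

const thresholds = { low: 75, medium: 85, high: 90 }; // mirrors the policy above
const tier = process.env.RISK_TIER ?? "high";          // default to the strictest tier
const threshold = thresholds[tier] ?? thresholds.high;

const summary = JSON.parse(fs.readFileSync("coverage/coverage-summary.json", "utf8"));
const lines = summary.total.lines.pct;

if (lines < threshold) {
  console.error(`Coverage ${lines}% is below the ${tier}-risk threshold of ${threshold}%`);
  process.exit(1);
}
console.log(`Coverage ${lines}% meets the ${tier}-risk threshold.`);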
Spec Validation
Rule: Code must conform to the specification. Spec violations block merge or deployment.
Common ways to enforce this:
- contract tests
- integration tests mapped to acceptance criteria
- linting for required spec sections
- drift detection between contracts and implementation
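As a small example of the first item, a contract test written with Vitest (the runner the tutorial's package.json scripts assume) can assert that a running API honors what the contract promises. The endpoint, base URL, and field names below are placeholders for your own contract:
// tests/contract/bookmarks.contract.test.mjs (illustrative)
// Verifies the API's response matches the contract: status, content type, required fields.
import { describe, it, expect } from "vitest";

const BASE_URL = process.env.API_BASE_URL ?? "http://localhost:3000"; // placeholder

describe("GET /bookmarks contract", () => {
  it("returns 200 with the fields the spec requires", async () => {
    const res = await fetch(`${BASE_URL}/bookmarks`);
    expect(res.status).toBe(200);
    expect(res.headers.get("content-type")).toContain("application/json");

    const body = await res.json();
    expect(Array.isArray(body.items)).toBe(true);
    for (const item of body.items) {
      expect(item).toHaveProperty("id");
      expect(item).toHaveProperty("url");
      expect(item).toHaveProperty("createdAt");
    }
  });
});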
The Spec-Validated CI/CD Pipeline
The spec-validated pipeline extends traditional CI/CD with specification-centric gates.
Pipeline Overview
- Specification validation: lint specs and validate contracts
- Build and test: run contract, integration, and coverage checks
- Security scanning: SAST and dependency audit
- Performance validation: budget checks where they are deterministic
- Deployment: deploy only when all required gates pass
Each stage can block the next.
Spec Validation Gates
Spec validation gates are the mechanisms that block merge or deployment when code diverges from the specification.
| Gate | What It Checks | Failure Action |
|---|---|---|
| Spec lint | Required sections and basic structure | Block PR |
| Contract test | API matches OpenAPI or other contract | Block merge |
| Integration test | Acceptance criteria pass | Block merge |
| Coverage | Touched code meets threshold | Block merge |
| LLM security gate | Policy checks for agent behavior | Block merge |
| Drift detection | Code and spec remain aligned | Alert or block |
Drift Detection
Drift occurs when code changes but the spec or contract does not. Detection strategies:
- Contract tests
- Schema comparison
- Requirement-to-test traceability
- Human review against the spec
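The first three strategies can be automated cheaply. Below is a sketch of a changed-files drift check, assuming a directory convention in which src/<feature> pairs with specs/<feature>/spec.md and the base branch is origin/main; both are assumptions to adapt:
// scripts/check-spec-drift.mjs (illustrative)
// Flags PRs that touch a feature's source without touching its spec.
import { execSync } from "node:child_process";

// Assumes the base branch is origin/main and has been fetched in CI.
const changed = execSync("git diff --name-only origin/main...HEAD", { encoding: "utf8" })
  .split("\n")
  .filter(Boolean);

// Convention assumed here: src/<feature>/... pairs with specs/<feature>/spec.md
const touchedFeatures = new Set(
  changed
    .filter((f) => f.startsWith("src/") && f.split("/").length > 2)
    .map((f) => f.split("/")[1])
);

let drift = 0;
for (const feature of touchedFeatures) {
  const specTouched = changed.some((f) => f.startsWith(`specs/${feature}/`));
  if (!specTouched) {
    console.error(`src/${feature} changed but specs/${feature}/ did not. Confirm the spec is still accurate.`);
    drift += 1;
  }
}

if (drift > 0) process.exit(1);
console.log("No spec drift detected for changed features.");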
AI Output Review Policies
When is human review required? When can automated approval suffice?
| Change Type | Risk | Human Review | Automated Approval |
|---|---|---|---|
| Typo fix, docs | Low | Optional | Yes, if checks pass |
| Small bug fix | Medium | Required | Usually no |
| New feature from spec | Medium | Required | No |
| Security-related | High | Required (2 approvals) | Never |
| Dependency update | Medium | Required for majors | Patches may be automated if policy allows |
High-Risk Paths
Always require human review for:
- **/auth/**
- **/payments/**
- **/user-data/**
- **/config/production*
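A sketch of how CI could detect that a pull request touches these paths, so branch protection or a review bot can demand the second approval; the path prefixes and exit-code mechanism below are illustrative choices, not the only way to wire this up:
// scripts/check-high-risk-paths.mjs (illustrative)
// Exits non-zero when a PR touches a high-risk path, so the pipeline can
// require a second approval or a governance label before merge.
import { execSync } from "node:child_process";

// Simple substring matches that mirror the glob list above; adapt to your layout.
const highRiskMarkers = ["/auth/", "/payments/", "/user-data/", "config/production"];

const changed = execSync("git diff --name-only origin/main...HEAD", { encoding: "utf8" })
  .split("\n")
  .filter(Boolean);

const risky = changed.filter((f) => highRiskMarkers.some((marker) => f.includes(marker)));

if (risky.length > 0) {
  console.error("High-risk paths touched; two human approvals required:");
  for (const f of risky) console.error(`  ${f}`);
  process.exit(1);
}
console.log("No high-risk paths touched.");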
Compliance Considerations
Enterprises often need audit trails and traceability. SDD supports this well if you record:
- specification versions and changes
- code generation events
- review decisions
- deployment events
- test and security results
Example Traceability Record
{
"deployment_id": "dep-2026-03-11-001",
"timestamp": "2026-03-11T14:32:00Z",
"spec_version": "005-bookmarks-v1.2",
"spec_path": "specs/005-bookmarks/spec.md",
"commit": "a1b2c3d",
"pr": "123",
"reviewer": "jane@example.com",
"tests_passed": true,
"security_scan": "passed",
"coverage": 94
}
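A sketch of how such a record could be assembled during deployment, using environment variables GitHub Actions already provides (GITHUB_RUN_ID, GITHUB_SHA, GITHUB_REF_NAME); the remaining fields are placeholders your pipeline would populate from its own earlier steps:
// scripts/write-deployment-record.mjs (illustrative)
// Assembles a traceability record from CI environment variables and values
// produced by earlier pipeline steps (passed in here as env vars).
import fs from "node:fs";

const record = {
  deployment_id: `dep-${process.env.GITHUB_RUN_ID ?? "local"}`,
  timestamp: new Date().toISOString(),
  spec_version: process.env.SPEC_VERSION ?? "unknown",  // e.g. from `git describe`
  commit: process.env.GITHUB_SHA ?? "unknown",
  branch: process.env.GITHUB_REF_NAME ?? "unknown",
  pr: process.env.PR_NUMBER ?? null,                    // set by an earlier step, if any
  reviewer: process.env.REVIEWER ?? null,               // set by an earlier step, if any
  tests_passed: process.env.TESTS_PASSED === "true",
  security_scan: process.env.SECURITY_SCAN ?? "unknown",
  coverage: Number(process.env.COVERAGE_PCT ?? 0),
};

fs.writeFileSync("deployment-record.json", JSON.stringify(record, null, 2));
console.log("Wrote deployment-record.json");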
Tutorial: Build a Spec-Validated CI/CD Pipeline with GitHub Actions
This tutorial walks you through building a spec-validated CI/CD pipeline using GitHub Actions. The workflow below is example scaffolding: it is realistic enough to adapt, but you should still tailor the scripts, thresholds, and deployment steps to your stack.
Prerequisites
- A GitHub repository with:
  - specifications in specs/
  - an OpenAPI contract in specs/005-bookmarks/contracts/openapi.yaml
  - source code and tests
- GitHub Actions enabled
- A Node-based project layout for the example scripts below
Step 1: Add Validation Scripts
Create scripts/spec-lint.mjs:
import fs from "node:fs";
import path from "node:path";

const root = path.resolve("specs");
const requiredSections = ["## Requirements", "## Acceptance Criteria"];
let failures = 0;

// Walk the specs tree and check every spec.md for the required sections.
function walk(dir) {
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const fullPath = path.join(dir, entry.name);
    if (entry.isDirectory()) walk(fullPath);
    if (entry.isFile() && entry.name === "spec.md") {
      const content = fs.readFileSync(fullPath, "utf8");
      for (const section of requiredSections) {
        if (!content.includes(section)) {
          console.error(`${fullPath} is missing section: ${section}`);
          failures += 1;
        }
      }
    }
  }
}

if (fs.existsSync(root)) walk(root);
if (failures > 0) process.exit(1);
console.log("Spec lint passed.");
Create scripts/check-coverage.mjs:
import fs from "node:fs";

// Single fixed threshold; see the risk-tiered policy earlier in the chapter.
const threshold = 90;

const summary = JSON.parse(
  fs.readFileSync("coverage/coverage-summary.json", "utf8")
);
const lines = summary.total.lines.pct;

if (lines < threshold) {
  console.error(`Coverage ${lines}% is below ${threshold}%`);
  process.exit(1);
}
console.log(`Coverage ${lines}% meets threshold.`);
These scripts avoid brittle shell globs and extra dependencies such as jq or bc.
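One setup note: check-coverage.mjs reads coverage/coverage-summary.json, which Vitest only writes when the json-summary coverage reporter is enabled. A minimal configuration sketch (the provider and reporter choices here are one reasonable option, not the only one):
// vitest.config.mjs (one reasonable setup; adjust provider and reporters to taste)
import { defineConfig } from "vitest/config";

export default defineConfig({
  test: {
    coverage: {
      provider: "v8",
      // "json-summary" produces coverage/coverage-summary.json,
      // which scripts/check-coverage.mjs reads.
      reporter: ["text", "json-summary"],
    },
  },
});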
Step 2: Create the Workflow File
Create .github/workflows/spec-validated-pipeline.yml:
name: Spec-Validated Pipeline

on:
  push:
    branches: [main, develop]
  pull_request:
    branches: [main, develop]

env:
  NODE_VERSION: '20'

jobs:
  spec-validation:
    name: Spec Validation
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Setup Node
        uses: actions/setup-node@v4
        with:
          node-version: ${{ env.NODE_VERSION }}
          cache: 'npm'
      - name: Install dependencies
        run: npm ci
      - name: Lint specifications
        run: node scripts/spec-lint.mjs
      - name: Validate OpenAPI contracts
        run: |
          find specs -path "*/contracts/*.yaml" -o -path "*/contracts/*.yml" | while read contract; do
            echo "Validating $contract"
            npx @redocly/cli lint "$contract"
          done

  contract-tests:
    name: Contract Tests
    needs: spec-validation
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Setup Node
        uses: actions/setup-node@v4
        with:
          node-version: ${{ env.NODE_VERSION }}
          cache: 'npm'
      - name: Install dependencies
        run: npm ci
      - name: Run contract tests
        run: npm run test:contract
      - name: Run integration tests
        run: npm run test:integration
      - name: Run coverage
        run: npm run test:coverage
      - name: Check coverage threshold
        run: node scripts/check-coverage.mjs

  security-scan:
    name: Security Scan
    needs: spec-validation
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Setup Node
        uses: actions/setup-node@v4
        with:
          node-version: ${{ env.NODE_VERSION }}
          cache: 'npm'
      - name: Install dependencies
        run: npm ci
      - name: Run npm audit
        run: npm audit --audit-level=high
      - name: Run Semgrep
        uses: semgrep/semgrep-action@v1
        with:
          config: p/security-audit

  performance-budget:
    name: Performance Budget
    needs: [contract-tests, security-scan]
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Setup Node
        uses: actions/setup-node@v4
        with:
          node-version: ${{ env.NODE_VERSION }}
          cache: 'npm'
      - name: Install dependencies
        run: npm ci
      - name: Build
        run: npm run build
      - name: Check bundle size
        run: |
          if [ -d "dist" ]; then
            SIZE=$(du -sb dist | cut -f1)
            MAX_SIZE=524288
            if [ "$SIZE" -gt "$MAX_SIZE" ]; then
              echo "Bundle size $SIZE exceeds $MAX_SIZE bytes"
              exit 1
            fi
          fi

  deploy:
    name: Deploy
    needs: [contract-tests, security-scan, performance-budget]
    runs-on: ubuntu-latest
    if: github.ref == 'refs/heads/main' && github.event_name == 'push'
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - name: Get spec version
        id: spec
        run: |
          SPEC_VERSION=$(git describe --tags --always)
          echo "version=$SPEC_VERSION" >> $GITHUB_OUTPUT
      - name: Create deployment record
        run: |
          echo '{"deployment":"'${{ github.run_id }}'","spec":"'${{ steps.spec.outputs.version }}'","commit":"'${{ github.sha }}'"}' > deployment-record.json
      - name: Deploy to staging
        run: echo "Deploying with spec tag ${{ steps.spec.outputs.version }}"
Expected result: Pull requests fail if specs are malformed, contracts are invalid, tests fail, or coverage drops below threshold.
Step 3: Configure package.json Scripts
Ensure package.json contains:
{
"scripts": {
"test:contract": "vitest run tests/contract",
"test:integration": "vitest run tests/integration",
"test:coverage": "vitest run --coverage",
"spec:lint": "node scripts/spec-lint.mjs",
"coverage:check": "node scripts/check-coverage.mjs"
}
}
Step 4: Add Branch Protection
In GitHub:
- Open Settings -> Branches
- Add a rule for main
- Require status checks: spec-validation, contract-tests, security-scan, performance-budget
- Require pull request review before merging
- Require additional approvals for high-risk paths if needed
Step 5: Add the Missing Checks Carefully
Only add the following when they are deterministic in your environment:
- API latency checks
- DAST scans
- Snyk or other secret-dependent tooling
- production deployment approvals
If a check needs seed data, service orchestration, or credentials, document that explicitly instead of pretending it is universally copy-paste runnable.
Governance Frameworks by Team Size
Governance should scale with team size and risk.
Startup (1-10 engineers)
Focus: speed with safety.
- Spec lint + contract tests in CI
- 1 mandatory review
- Basic dependency audit
- Coverage target often lower for low-risk paths
Mid-Size (10-50 engineers)
Focus: consistency and quality.
- Full pipeline
- 1-2 approvals depending on path
- SAST + dependency audit
- Stronger coverage requirements
- Basic traceability in release notes or deployment records
Enterprise (50+ engineers)
Focus: compliance, auditability, and risk management.
- Full pipeline plus DAST and manual production approvals where required
- 2 approvals for high-risk paths
- Full audit trail retention
- Dedicated governance ownership
Try With AI
Prompt 1: Governance Policy Design
"Our team is adopting SDD. We have 15 engineers and handle user PII. Design an AI output review policy: when is human review required vs. automated approval? Include a table of change types and risk levels. Suggest high-risk paths that always need review."
Prompt 2: Pipeline Gate Configuration
"I have a GitHub Actions workflow. Add a spec validation stage that: (1) lints markdown specs in
specs/for required sections, (2) validates OpenAPI files incontracts/. Show the YAML and any helper scripts needed."
Prompt 3: Traceability Implementation
"We need an audit trail for compliance. Design a JSON schema for deployment records that links deployment ID, timestamp, spec version, spec path, commit SHA, PR number, reviewer, test results, and security scan result."
Prompt 4: Drift Detection Strategy
"How can we detect when code diverges from the specification without the spec being updated? List 3-4 strategies and explain what each one catches and what it might miss."
Practice Exercises
Exercise 1: Add a Spec Validation Gate
Take an existing project with specs. Add a CI job that:
- Lints specs for required sections
- Validates OpenAPI contracts
- Fails the pipeline if either fails
Expected outcome: A working spec validation gate in CI.
Exercise 2: Define Your Review Policy
Create an AI output review policy for your team or a hypothetical team. Include:
- change types and risk levels
- when human review is required
- high-risk paths
- automated approval criteria
Expected outcome: A 1-2 page policy document.
Exercise 3: Implement a Coverage Gate
Add a coverage threshold check to your CI pipeline. If coverage drops below your target for the risk tier you are enforcing, fail the build.
Expected outcome: CI fails when coverage drops below threshold.
Key Takeaways
- AI governance ensures AI-generated code is safe, compliant, and high-quality through policies, gates, and verification.
- Governance rules should cover review, security, coverage, and spec validation.
- The spec-validated CI/CD pipeline extends traditional CI/CD with specification-aware checks.
- Spec validation gates help catch drift before deployment.
- Compliance depends on audit trails and traceability, not just test results.
- Governance should scale with context: startups and enterprises should not use the exact same process.
Chapter Quiz
- What is AI governance, and why is it essential when using AI-generated code?
- What are the main stages in a spec-validated CI/CD pipeline?
- When should human review be required for AI-generated code?
- What is drift detection? Name two ways to implement it.
- What should an audit trail include?
- Why is it dangerous to present a CI example as universally runnable when it depends on hidden credentials or setup?
Back to: Part X Overview | Next: Chapter 30 - Metrics and Engineering Roles