BUILDING TOOLS FOR ENGINEERS

The best engineers don't just write features.

They build platforms that make entire teams faster.

Platform Engineering is the art of:

  • Creating internal tools

  • Reducing cognitive load

  • Eliminating repetitive work

  • Building "golden paths"

  • Scaling engineering effectiveness

This is how you become a force multiplier and reach Staff+ levels.


SECTION 1 — WHAT IS DEVELOPER EXPERIENCE?

Developer Experience (DX) Definition

DX = How easy and pleasant it is for developers to:
- Onboard to the codebase
- Build features
- Deploy code
- Debug issues
- Collaborate with the team

Great DX = 10x productivity


The DX Stack

┌──────────────────────────────────┐
│ Layer 5: Documentation           │
├──────────────────────────────────┤
│ Layer 4: Tools & CLI             │
├──────────────────────────────────┤
│ Layer 3: CI/CD Pipeline          │
├──────────────────────────────────┤
│ Layer 2: Local Dev Environment   │
├──────────────────────────────────┤
│ Layer 1: Code Architecture       │
└──────────────────────────────────┘

Each layer impacts developer velocity.


SECTION 2 — THE GOLDEN PATH PRINCIPLE

What is a Golden Path?

A Golden Path is the easiest, most supported way to accomplish common tasks.

Example: Deploying a New Service

Without Golden Path:

# Engineer must figure out:
1. How to create service repo?
2. What dependencies to use?
3. How to set up CI/CD?
4. How to configure monitoring?
5. How to deploy to production?
6. How to handle secrets?
7. How to set up alerting?

Time to first deploy: 2-4 weeks

With Golden Path:

# One command:
$ platform create-service --name=my-api --type=nodejs

✓ Created repo from template
✓ Set up CI/CD pipeline
✓ Configured monitoring
✓ Added to service mesh
✓ Generated API documentation
✓ Ready to deploy

Time to first deploy: 1 hour

Result: time to first deploy drops from weeks to about an hour (roughly 80-160x, assuming 40-hour work weeks)


Golden Paths for Common Tasks

1. Creating a New Service

# CLI tool
$ platform new service

? Service name: user-api
? Language: TypeScript
? Database: PostgreSQL
? Queue: Yes (RabbitMQ)

Creating service...
✓ Generated from template
✓ Dependencies installed
✓ Database migrations set up
✓ Tests scaffolded
✓ CI/CD configured
✓ Monitoring added
✓ Documentation created

Next steps:
cd user-api
npm run dev
npm test

2. Deploying to Production

# Single command deployment
$ platform deploy

? Environment: production
? Confirm deployment to prod? Yes

Running pre-deployment checks...
✓ Tests passing
✓ Linting passed
✓ Security scan clean
✓ Database migrations ready

Deploying...
✓ Building Docker image
✓ Running canary deployment (5%)
✓ Health checks passed
✓ Scaling to 50%
✓ Scaling to 100%
✓ Deployment complete

Deployed to: https://user-api.prod.company.com
Monitoring: https://grafana.company.com/d/user-api
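
The staged rollout in the output above (5%, then 50%, then 100%, each gated on health checks) can be sketched as a loop over traffic stages. This is a minimal sketch, not how any particular `platform` CLI works: `runCanary`, `HealthCheck`, and `RolloutResult` are hypothetical names, and the health check is a stub.

```typescript
// Hypothetical sketch: walk through traffic stages, stopping at the first failed health check.
type HealthCheck = (trafficPercent: number) => boolean;

interface RolloutResult {
  completed: boolean;      // true only if every stage passed
  reachedPercent: number;  // last traffic percentage that passed its health check
}

function runCanary(stages: number[], isHealthy: HealthCheck): RolloutResult {
  let reachedPercent = 0;
  for (const percent of stages) {
    if (!isHealthy(percent)) {
      // A real tool would shift traffic back to the previous version here.
      return { completed: false, reachedPercent };
    }
    reachedPercent = percent;
  }
  return { completed: true, reachedPercent };
}

// Stages taken from the example output: 5% canary, then 50%, then 100%.
const healthyRollout = runCanary([5, 50, 100], () => true);
// Stub health check that starts failing once traffic reaches 50%.
const failedRollout = runCanary([5, 50, 100], (percent) => percent < 50);
console.log(healthyRollout, failedRollout);
```

Stopping at the first unhealthy stage keeps the blast radius at whatever percentage last passed, which is the point of ramping traffic gradually.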

3. Creating a New Database

$ platform db create

? Database name: analytics_db
? Type: PostgreSQL
? Size: Small (10GB)
? Backups: Daily

Creating database...
✓ Provisioned in us-east-1
✓ Configured replication
✓ Set up automated backups
✓ Added monitoring
✓ Generated connection strings

Connection string (saved to secrets):
postgresql://analytics_db:***@db.internal:5432/analytics

Add to your .env:
DATABASE_URL=secrets://analytics_db
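
One way a service could turn a `secrets://` reference into a real value at startup is sketched below. The `secrets://` scheme comes from the example above; `resolveSecrets`, the `lookup` callback, and the in-memory store are assumptions made for illustration, not a real platform API.

```typescript
// Hypothetical sketch: replace `secrets://<name>` references with values from a secret store.
const SECRET_PREFIX = "secrets://";

function resolveSecrets(
  env: Record<string, string>,
  lookup: (secretName: string) => string | undefined
): Record<string, string> {
  const resolved: Record<string, string> = {};
  for (const key of Object.keys(env)) {
    const value = env[key];
    if (value.indexOf(SECRET_PREFIX) !== 0) {
      resolved[key] = value; // plain values pass through unchanged
      continue;
    }
    const secretName = value.slice(SECRET_PREFIX.length);
    const secret = lookup(secretName);
    if (secret === undefined) {
      throw new Error(`Secret "${secretName}" (referenced by ${key}) was not found`);
    }
    resolved[key] = secret;
  }
  return resolved;
}

// In-memory stand-in for a real secret store; the value is a placeholder, not a real credential.
const fakeStore: Record<string, string | undefined> = {
  analytics_db: "postgresql://analytics_db:example-password@db.internal:5432/analytics",
};

const resolvedEnv = resolveSecrets(
  { DATABASE_URL: "secrets://analytics_db", PORT: "3000" },
  (secretName) => fakeStore[secretName]
);
```

Failing loudly on a missing secret is deliberate: a service that starts with an unresolved reference tends to fail later in a more confusing way.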

SECTION 3 — BUILDING INTERNAL TOOLS

The Internal Tool Hierarchy

┌────────────────────────────────┐
│ Level 4: Platform CLI          │ → Orchestrates everything
├────────────────────────────────┤
│ Level 3: Developer Portal      │ → Self-service UI
├────────────────────────────────┤
│ Level 2: Automation Scripts    │ → Reduce repetition
├────────────────────────────────┤
│ Level 1: Documentation         │ → Prevent questions
└────────────────────────────────┘

Level 1: Living Documentation

Problem: Docs are always outdated

Solution: Documentation as Code

# service-template/README.md

# {{SERVICE_NAME}}

## Quick Start
```bash
npm install
npm run dev
```

## Environment Variables

{{#each env_vars}}
- {{name}}: {{description}} {{#if required}}(required){{/if}}
{{/each}}

## API Endpoints

{{#each endpoints}}
### {{method}} {{path}}

{{description}}

Request:

{{request_example}}

Response:

{{response_example}}

{{/each}}

## Deployment

platform deploy --env={{environment}}

Auto-generated from code. Last updated: {{timestamp}}


Generated automatically from:
- Code annotations
- API schemas
- Config files

Never out of date.
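
To make the idea concrete, here is a minimal sketch of filling simple `{{KEY}}` placeholders from service metadata. It deliberately ignores block helpers such as `{{#each}}` and `{{#if}}`, which would need a full template engine; `renderTemplate` and the sample values are made up for illustration.

```typescript
// Hypothetical sketch: fill `{{KEY}}` placeholders in a README template from service metadata.
function renderTemplate(template: string, values: Record<string, string>): string {
  return template.replace(/\{\{(\w+)\}\}/g, (placeholder: string, key: string) => {
    // Leave unknown placeholders untouched so missing metadata is easy to spot.
    return Object.prototype.hasOwnProperty.call(values, key) ? values[key] : placeholder;
  });
}

const readme = renderTemplate(
  "# {{SERVICE_NAME}}\n\nDeploy with: platform deploy --env={{environment}}\n\nLast updated: {{timestamp}}",
  { SERVICE_NAME: "user-api", environment: "staging" }
);
console.log(readme);
```

Running a step like this in CI, with values pulled from annotations and schemas, is one way to keep the generated README tied to the code.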

Documentation Portal

Internal docs site:

├── Getting Started
│   ├── New Engineer Onboarding
│   ├── Setting Up Local Environment
│   └── Deploying Your First Service
│
├── Architecture
│   ├── System Overview (auto-generated diagram)
│   ├── Service Catalog (auto-discovered)
│   └── Data Flow (live)
│
├── How-To Guides
│   ├── Add a New Microservice
│   ├── Set Up Database
│   ├── Configure Monitoring
│   └── Debug Production Issues
│
└── Reference
    ├── API Documentation (auto-generated)
    ├── CLI Commands
    └── Best Practices


Level 2: Automation Scripts

Common Engineering Tasks → Scripts

Script 1: Environment Setup

#!/bin/bash
# setup-dev.sh: bootstrap a local development environment
set -euo pipefail

echo "🚀 Setting up development environment..."

# Install Docker if it is missing (the get.docker.com script targets Linux)
if ! command -v docker &> /dev/null; then
  echo "Installing Docker..."
  curl -fsSL https://get.docker.com | sh
fi

# Clone all services, skipping any that are already checked out
echo "Cloning repositories..."
for service in api frontend worker; do
  if [ ! -d "$service" ]; then
    git clone "git@github.com:company/$service.git"
  fi
done

# Start databases (expects a docker-compose.yml in the current directory)
echo "Starting databases..."
docker-compose up -d postgres redis

# Install Node.js dependencies for each service
echo "Installing dependencies..."
for service in */; do
  if [ -f "${service}package.json" ]; then
    (cd "$service" && npm install)
  fi
done

# Create .env files from templates without overwriting existing ones
echo "Creating .env files from templates..."
for service in */; do
  if [ -f "${service}.env.example" ] && [ ! -f "${service}.env" ]; then
    cp "${service}.env.example" "${service}.env"
  fi
done

echo "✅ Development environment ready!"
echo "Run 'npm run dev' in any service to start."

Time saved: 2 hours → 5 minutes


Script 2: Database Migration Helper

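#!/bin/bash
# db-migrate.sh: back up the database, then run migrations for one service
set -euo pipefail

SERVICE=${1:-}
DIRECTION=${2:-up}

if [ -z "$SERVICE" ]; then
  echo "Usage: db-migrate.sh <service> [up|down]" >&2
  exit 1
fi

if [ "$DIRECTION" != "up" ] && [ "$DIRECTION" != "down" ]; then
  echo "Direction must be 'up' or 'down'" >&2
  exit 1
fi

echo "Running migrations for $SERVICE ($DIRECTION)..."

# Back up the database first; set -e aborts the script if the backup fails
echo "Creating backup..."
pg_dump "$DATABASE_URL" > "backup-$(date +%Y%m%d-%H%M%S).sql"

# Run migrations
cd "services/$SERVICE"
npm run "db:migrate:$DIRECTION"

# Verify
echo "Verifying schema..."
npm run db:validate

echo "✅ Migration complete"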

Script 3: Test Data Generator

#!/bin/bash
# seed-test-data.sh

echo "Generating test data..."

# Create test users
curl -X POST localhost:3000/api/users \
  -H "Content-Type: application/json" \
  -d '{
    "email": "test@example.com",
    "name": "Test User"
  }'

# Create sample products
for i in {1..100}; do
  curl -X POST localhost:3000/api/products \
    -H "Content-Type: application/json" \
    -d "{
      \"name\": \"Product $i\",
      \"price\": $(( RANDOM % 100 + 1 ))
    }"
done

echo "✅ Test data created"

Level 3: Developer Portal (Self-Service)

What is a Developer Portal?

A web UI where engineers can:

  • Provision resources

  • Deploy services

  • View system status

  • Access documentation

  • Manage secrets

  • View logs & metrics

Example: Backstage (Spotify)

Home Dashboard:
├── My Services
│   ├── user-api (healthy)
│   ├── payment-service (deploying...)
│   └── worker-queue (degraded)
│
├── Quick Actions
│   ├── [Create New Service]
│   ├── [Deploy to Production]
│   └── [View Recent Incidents]
│
├── System Health
│   └── 99.8% uptime (last 7 days)
│
└── Recent Deployments
    ├── user-api v1.2.3 (2h ago) ✓
    └── frontend v2.0.1 (5h ago) ✓

Service Creation Flow (Self-Service)

Developer Portal UI:

┌────────────────────────────────────┐
│ Create New Service                 │
├────────────────────────────────────┤
│ Service Name: [payment-api]        │
│ Team: [Platform]                   │
│ Language: ○ Node.js ● Python       │
│ Database: ☑ PostgreSQL ☐ MongoDB   │
│ Cache: ☑ Redis ☐ Memcached         │
│ Queue: ☑ RabbitMQ ☐ None           │
│                                    │
│ [Cancel] [Create Service]          │
└────────────────────────────────────┘

(Clicks Create)

Progress:
✓ Creating repository
✓ Generating code from template
✓ Setting up CI/CD
✓ Provisioning database
✓ Configuring monitoring
✓ Adding to service catalog
✓ Notifying team

Done! View your service: [payment-api dashboard]

Level 4: Platform CLI

The Ultimate Developer Tool

# platform CLI - unified interface for everything

$ platform help

Platform CLI v2.0

Usage: platform <command> [options]

Commands:
  service    Manage services
  deploy     Deploy applications
  db         Database operations
  logs       View logs
  secrets    Manage secrets
  config     Configuration management
  status     Check system status

Run 'platform <command> --help' for more info

CLI Command Examples

Service Management

# Create service
$ platform service create --name=api --template=nodejs-api

# List services
$ platform service list

# View service details
$ platform service info api

# Delete service
$ platform service delete api

Deployment

# Deploy to staging
$ platform deploy --env=staging

# Deploy with canary
$ platform deploy --env=prod --canary=10

# Rollback
$ platform deploy rollback --env=prod

# View deployment history
$ platform deploy history

Database

# Create database
$ platform db create --name=analytics --type=postgres

# Run migration
$ platform db migrate --service=api

# Create backup
$ platform db backup --name=analytics

# Restore backup
$ platform db restore --backup=analytics-20240115

Logs

# Tail logs
$ platform logs --service=api --follow

# Search logs
$ platform logs --service=api --search="error" --since=1h

# Download logs
$ platform logs --service=api --since=1d > logs.txt

Secrets

# Set secret
$ platform secrets set API_KEY=abc123 --service=api

# List secrets
$ platform secrets list --service=api

# Rotate secret
$ platform secrets rotate DATABASE_PASSWORD --service=api

SECTION 4 — REDUCING COGNITIVE LOAD

The Cognitive Load Formula

Cognitive Load = (Decisions × Complexity) ÷ Automation

Goal: Minimize decisions, simplify complexity, maximize automation
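
The formula is a heuristic rather than a measured quantity, but plugging in rough scores shows how the terms trade off. In this sketch, `cognitiveLoad` and the unit-less scores are made up for illustration.

```typescript
// Hypothetical sketch of the formula above; the scores are illustrative, not measurements.
function cognitiveLoad(decisions: number, complexity: number, automation: number): number {
  if (automation <= 0) {
    throw new RangeError("automation must be greater than zero");
  }
  return (decisions * complexity) / automation;
}

// Ten manual config decisions of moderate complexity, with little automation.
const loadBefore = cognitiveLoad(10, 3, 1);
// One decision left after defaults are introduced, with more of the work automated.
const loadAfter = cognitiveLoad(1, 3, 5);
console.log(loadBefore, loadAfter);
```

Cutting decisions and raising automation compound: here the score falls from 30 to well under 1 even though the complexity term never changed.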


Strategy 1: Sensible Defaults

Bad: Too many decisions

# service-config.yml
database:
  host: ?
  port: ?
  pool_size: ?
  timeout: ?
  retry_attempts: ?
  ssl: ?
  logging: ?
cache:
  host: ?
  port: ?
  ttl: ?

Engineer must figure out 10+ config values.


Good: Smart defaults

# service-config.yml (minimal)
database:
  name: user_db # That's it!

# Platform provides defaults:
# - host: Discovered from service mesh
# - port: 5432 (standard)
# - pool_size: 10 (sensible default)
# - timeout: 30s (battle-tested)
# - retry_attempts: 3
# - ssl: true (always)
# - logging: error (in prod)

# Override only if needed:
database:
  name: user_db
  pool_size: 20 # Only override if you need to

Decisions reduced from 10 to 1.
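
On the platform side, this usually comes down to merging a defaults object with whatever the service supplies. A minimal sketch, assuming the default values listed in the comments above; the type and function names are hypothetical, and `host` and `logging` are left out for brevity.

```typescript
// Hypothetical sketch: platform-supplied defaults merged with per-service overrides.
interface DatabaseConfig {
  name: string;
  port: number;
  poolSize: number;
  timeoutSeconds: number;
  retryAttempts: number;
  ssl: boolean;
}

// Values mirror the defaults listed in the config comments above.
const DATABASE_DEFAULTS: Omit<DatabaseConfig, "name"> = {
  port: 5432,
  poolSize: 10,
  timeoutSeconds: 30,
  retryAttempts: 3,
  ssl: true,
};

// `name` is the only required field; anything else falls back to the defaults.
type DatabaseOverrides = Pick<DatabaseConfig, "name"> & Partial<DatabaseConfig>;

function resolveDatabaseConfig(overrides: DatabaseOverrides): DatabaseConfig {
  // Later spreads win, so explicit overrides replace the defaults.
  return { ...DATABASE_DEFAULTS, ...overrides };
}

const minimalConfig = resolveDatabaseConfig({ name: "user_db" });
const tunedConfig = resolveDatabaseConfig({ name: "user_db", poolSize: 20 });
console.log(minimalConfig, tunedConfig);
```

Encoding the required field in the type means the compiler, not a reviewer, catches a config that forgets `name`.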


Strategy 2: Convention Over Configuration

Example: Project Structure

# Enforced project structure
service/
├── src/
│   ├── api/          # API endpoints (auto-discovered)
│   ├── models/       # Data models (auto-migrated)
│   ├── services/     # Business logic
│   └── utils/        # Utilities
├── tests/            # Tests (auto-run in CI)
├── migrations/       # DB migrations (auto-applied)
└── config/
    ├── dev.yml       # Dev config
    └── prod.yml      # Prod config

# Engineers don't decide structure
# They follow convention
# Tools work automatically
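
"Auto-discovered" endpoints fall out of the convention: if every API file lives under `src/api/`, a tool can derive routes from file paths alone. A rough sketch of that mapping; `routeFromFile` and the `index.ts` rule are assumptions for illustration, not a description of any specific framework.

```typescript
// Hypothetical sketch: derive API routes from file paths under src/api/, by convention.
function routeFromFile(filePath: string): string | null {
  const match = /^src\/api\/(.+)\.ts$/.exec(filePath);
  if (match === null) {
    return null; // not an API file, so by convention it exposes no route
  }
  // "users/index" maps to "/users"; "users/profile" maps to "/users/profile".
  const withoutIndex = match[1].replace(/(^|\/)index$/, "");
  return "/" + withoutIndex;
}

const sourceFiles = ["src/api/users/index.ts", "src/api/orders.ts", "src/utils/format.ts"];
const discoveredRoutes = sourceFiles
  .map(routeFromFile)
  .filter((route): route is string => route !== null);
console.log(discoveredRoutes);
```

No registration step is needed: adding a file in the right place is the only decision the engineer makes.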

Strategy 3: Hide Complexity

Example: Deployment Complexity

Behind the scenes:

1. Build Docker image
2. Push to registry
3. Update Kubernetes manifests
4. Apply rolling update
5. Run health checks
6. Monitor rollout
7. Alert on errors
8. Rollback if needed

Engineer sees:

$ platform deploy

✓ Deployed successfully

All complexity hidden.
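
Behind a command like this, the steps listed above typically run as an ordered pipeline that undoes completed work when a later step fails. A minimal sketch under stated assumptions: `DeployStep` and `runPipeline` are hypothetical names, and the steps are synchronous stubs for brevity where real ones would be asynchronous.

```typescript
// Hypothetical sketch: run deployment steps in order and roll back if any step fails.
interface DeployStep {
  name: string;
  run: () => void;   // throws on failure
  undo?: () => void; // optional compensation for a completed step
}

function runPipeline(steps: DeployStep[], log: (line: string) => void): boolean {
  const completed: DeployStep[] = [];
  for (const step of steps) {
    try {
      step.run();
      completed.push(step);
    } catch {
      log(`✗ ${step.name} failed, rolling back`);
      // Undo completed steps in reverse order.
      for (const done of completed.reverse()) {
        if (done.undo) done.undo();
      }
      return false;
    }
  }
  log("✓ Deployed successfully");
  return true;
}

const pipelineOutput: string[] = [];
const undoneSteps: string[] = [];
const pipelineSucceeded = runPipeline(
  [
    { name: "Build Docker image", run: () => {}, undo: () => undoneSteps.push("build") },
    { name: "Apply rolling update", run: () => {}, undo: () => undoneSteps.push("rollout") },
    { name: "Run health checks", run: () => { throw new Error("unhealthy"); } },
  ],
  (line) => pipelineOutput.push(line)
);
console.log(pipelineSucceeded, undoneSteps, pipelineOutput);
```

The engineer only ever sees the final log line; the ordering and rollback rules live in one place that the platform team owns.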


SECTION 5 — BUILDING A PLATFORM CLI

Architecture

CLI (Commander.js)
    ↓
API Client (axios)
    ↓
Platform API (internal)
    ↓
Orchestration Layer

├── Kubernetes API
├── Database Provisioning
├── Secret Management
├── Monitoring Setup
└── CI/CD Integration

Implementation Example

1. CLI Entry Point

#!/usr/bin/env node
// src/cli.ts (the shebang must be the first line of the file)
import { Command } from 'commander';
import { serviceCommands } from './commands/service';
import { deployCommands } from './commands/deploy';
import { dbCommands } from './commands/db';

const program = new Command();

program
  .name('platform')
  .description('Platform Engineering CLI')
  .version('2.0.0');

// Register command groups
serviceCommands(program);
deployCommands(program);
dbCommands(program);

// parseAsync waits for async action handlers to finish
program.parseAsync().catch((error) => {
  console.error(error);
  process.exit(1);
});

2. Service Commands

// src/commands/service.ts
import { Command } from 'commander';
import { PlatformClient } from '../client';

export function serviceCommands(program: Command) {
  const service = program.command('service');

  service
    .command('create')
    .description('Create a new service')
    .requiredOption('--name <name>', 'Service name')
    .option('--template <template>', 'Template to use', 'nodejs-api')
    .action(async (options) => {
      console.log('🚀 Creating service...');

      const client = new PlatformClient();
      const result = await client.createService({
        name: options.name,
        template: options.template
      });

      console.log('✓ Service created');
      console.log(`Repository: ${result.repoUrl}`);
      console.log(`Dashboard: ${result.dashboardUrl}`);
    });

  service
    .command('list')
    .description('List all services')
    .action(async () => {
      const client = new PlatformClient();
      const services = await client.listServices();

      console.table(services.map(s => ({
        Name: s.name,
        Status: s.status,
        Team: s.team,
        'Last Deploy': s.lastDeploy
      })));
    });
}

3. Platform Client

// src/client.ts
import axios from 'axios';

export class PlatformClient {
  private api = axios.create({
    baseURL: process.env.PLATFORM_API || 'https://platform.internal',
    headers: {
      Authorization: `Bearer ${this.getToken()}`
    }
  });

  private getToken(): string {
    // Reads the token from the environment; a fuller version could
    // fall back to a credentials file such as ~/.platform/credentials
    return process.env.PLATFORM_TOKEN || '';
  }

  async createService(options: {
    name: string;
    template: string;
  }) {
    const { data } = await this.api.post('/services', options);
    return data;
  }

  async listServices() {
    const { data } = await this.api.get('/services');
    return data;
  }

  async deployService(name: string, env: string) {
    // Encode the name so unusual characters cannot alter the URL path
    const path = `/services/${encodeURIComponent(name)}/deploy`;
    const { data } = await this.api.post(path, { environment: env });
    return data;
  }
}

4. Interactive Prompts

// src/commands/deploy.ts
import { Command } from 'commander';
import { select, confirm } from '@inquirer/prompts';
import { PlatformClient } from '../client';

export function deployCommands(program: Command) {
  program
    .command('deploy')
    .description('Deploy service')
    .action(async () => {
      // Interactive environment selection
      const env = await select({
        message: 'Select environment:',
        choices: [
          { name: 'Development', value: 'dev' },
          { name: 'Staging', value: 'staging' },
          { name: 'Production', value: 'prod' }
        ]
      });

      // Confirmation for production
      if (env === 'prod') {
        const confirmed = await confirm({
          message: 'Deploy to PRODUCTION?',
          default: false
        });

        if (!confirmed) {
          console.log('Deployment cancelled');
          return;
        }
      }

      // Deploy ('current-service' is a placeholder; a real CLI would
      // read the service name from the project's config)
      console.log('🚀 Deploying...');
      const client = new PlatformClient();
      await client.deployService('current-service', env);
      console.log('✓ Deployed successfully');
    });
}

Advanced Features

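1. Progress Spinners

import ora from 'ora';

// buildImage, pushImage, and deployToK8s stand in for your own async build steps
async function deployService() {
  const spinner = ora('Building Docker image...').start();
  try {
    await buildImage();
    spinner.succeed('Docker image built');

    spinner.start('Pushing to registry...');
    await pushImage();
    spinner.succeed('Pushed to registry');

    spinner.start('Deploying to Kubernetes...');
    await deployToK8s();
    spinner.succeed('Deployed successfully');
  } catch (error) {
    // Mark the step in progress as failed, then let the caller handle the error
    spinner.fail();
    throw error;
  }
}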

2. Rich Output

import chalk from 'chalk';
import Table from 'cli-table3';

// Assumed shape of a service record returned by the Platform API
interface Service {
  name: string;
  status: string;
  team: string;
  lastDeploy: string;
}

function displayServices(services: Service[]) {
  const table = new Table({
    head: ['Name', 'Status', 'Team', 'Last Deploy'],
    style: { head: ['cyan'] }
  });

  services.forEach(s => {
    const status = s.status === 'healthy'
      ? chalk.green('● Healthy')
      : chalk.red('● Degraded');

    table.push([
      chalk.bold(s.name),
      status,
      s.team,
      s.lastDeploy
    ]);
  });

  console.log(table.toString());
}

3. Error Handling

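import axios from 'axios';
import chalk from 'chalk';

async function handleCommand<T>(fn: () => Promise<T>): Promise<T> {
  try {
    return await fn();
  } catch (error) {
    // Caught errors are `unknown` in strict TypeScript, so narrow before reading fields
    const status = axios.isAxiosError(error) ? error.response?.status : undefined;
    if (status === 401) {
      console.error(chalk.red('Authentication failed'));
      console.log('Run: platform login');
    } else if (status === 403) {
      console.error(chalk.red('Permission denied'));
    } else {
      const message = error instanceof Error ? error.message : String(error);
      console.error(chalk.red('Error:'), message);
      console.log('Run with --debug for more details');
    }
    process.exit(1);
  }
}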

SECTION 6 — METRICS & MEASURING DX

DX Metrics That Matter

1. Time to First Deploy

Measure: Time from "git clone" to "deployed to prod"

Bad: 2-4 weeks
Good: 1-2 days
Great: < 4 hours

Track: For every new engineer

2. Build Time

Measure: Time from "git push" to "deployed"

Bad: 30-60 minutes
Good: 10-15 minutes
Great: < 5 minutes

Optimize:
- Caching
- Parallel builds
- Incremental builds

3. Deploy Frequency

Measure: Deployments per day

Bad: Weekly
Good: Daily
Great: 10+ per day

Enable:
- Automated testing
- Fast CI/CD
- Confidence in rollbacks
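
Deploy frequency can be derived from deployment timestamps that the CI/CD system already records. A small sketch, assuming a list of timestamps and a fixed observation window; `deploysPerDay` and the sample dates are illustrative.

```typescript
// Hypothetical sketch: average deployments per day over an observation window.
function deploysPerDay(deployTimes: Date[], windowStart: Date, windowEnd: Date): number {
  const msPerDay = 24 * 60 * 60 * 1000;
  const days = (windowEnd.getTime() - windowStart.getTime()) / msPerDay;
  if (days <= 0) throw new RangeError("windowEnd must be after windowStart");
  const inWindow = deployTimes.filter(
    (t) => t.getTime() >= windowStart.getTime() && t.getTime() < windowEnd.getTime()
  );
  return inWindow.length / days;
}

const observedFrom = new Date("2024-01-15T00:00:00Z");
const observedTo = new Date("2024-01-17T00:00:00Z");
const deployLog = [
  new Date("2024-01-15T09:00:00Z"),
  new Date("2024-01-15T15:30:00Z"),
  new Date("2024-01-16T11:00:00Z"),
  new Date("2024-01-18T08:00:00Z"), // outside the window, ignored
];
const deployFrequency = deploysPerDay(deployLog, observedFrom, observedTo);
console.log(deployFrequency);
```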

4. Mean Time to Recovery (MTTR)

Measure: Time from "incident detected" to "resolved"

Bad: Hours
Good: < 1 hour
Great: < 15 minutes

Improve:
- Fast rollbacks
- Good monitoring
- Clear runbooks
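
MTTR is the average of (resolved time minus detected time) across incidents. A minimal sketch, assuming each incident record carries both timestamps; the `Incident` shape and sample data are hypothetical.

```typescript
// Hypothetical sketch: mean time to recovery, in minutes, across a set of incidents.
interface Incident {
  detectedAt: Date;
  resolvedAt: Date;
}

function mttrMinutes(incidents: Incident[]): number {
  if (incidents.length === 0) throw new RangeError("no incidents to average");
  const totalMs = incidents.reduce(
    (sum, incident) => sum + (incident.resolvedAt.getTime() - incident.detectedAt.getTime()),
    0
  );
  return totalMs / incidents.length / 60000;
}

// Two made-up incidents: one resolved in 10 minutes, one in 20.
const sampleIncidents: Incident[] = [
  { detectedAt: new Date("2024-01-15T10:00:00Z"), resolvedAt: new Date("2024-01-15T10:10:00Z") },
  { detectedAt: new Date("2024-01-16T14:00:00Z"), resolvedAt: new Date("2024-01-16T14:20:00Z") },
];
const mttr = mttrMinutes(sampleIncidents);
console.log(mttr);
```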

5. Developer Satisfaction

Measure: Quarterly survey

Questions:
1. How easy is it to build features? (1-10)
2. How confident are you in deploys? (1-10)
3. How easy is debugging? (1-10)
4. How good are internal tools? (1-10)

Target: 8+ average
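
Scoring the survey is a plain average over every answer, compared against the target. A small sketch, assuming each response is an array of the four 1-10 answers in question order; the function name and sample data are made up.

```typescript
// Hypothetical sketch: average all survey answers and compare against the 8+ target.
function averageScore(responses: number[][]): number {
  let total = 0;
  let count = 0;
  for (const answers of responses) {
    for (const answer of answers) {
      if (answer < 1 || answer > 10) {
        throw new RangeError(`answer ${answer} is outside the 1-10 scale`);
      }
      total += answer;
      count += 1;
    }
  }
  if (count === 0) throw new RangeError("no answers to average");
  return total / count;
}

// Two made-up responses, four answers each.
const surveyResponses = [
  [8, 9, 7, 8],
  [9, 9, 8, 10],
];
const surveyAverage = averageScore(surveyResponses);
const meetsTarget = surveyAverage >= 8;
console.log(surveyAverage, meetsTarget);
```

Averaging per question as well as overall would show which of the four areas is dragging the score down.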

Tracking Dashboard

Platform Health Dashboard:

┌─────────────────────────────────────┐
│ Developer Experience Metrics        │
├─────────────────────────────────────┤
│ Time to First Deploy: 3.2 hours     │
│ Build Time (p50): 8 minutes         │
│ Deploy Frequency: 15/day            │
│ MTTR: 12 minutes                    │
│ Developer Satisfaction: 8.4/10      │
├─────────────────────────────────────┤
│ Recent Improvements:                │
│ ✓ Reduced build time by 40%         │
│ ✓ Increased deploy frequency 2x     │
│ ✓ New CLI tool adoption: 85%        │
└─────────────────────────────────────┘

SECTION 7 — PLATFORM ENGINEERING CAREER PATH

The Platform Engineer Role

Platform Engineers build tools for other engineers.

Responsibilities:
- Build internal platforms
- Improve developer experience
- Create self-service tools
- Reduce cognitive load
- Scale the engineering org

Impact:
- 100 engineers × 10% faster = 10 engineers' worth of value
- 1000 engineers × 5% faster = 50 engineers' worth of value

This is why Platform Engineers are highly paid.
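
The leverage arithmetic above can be written as a one-line function. It treats the speedup as linear and additive across the team, which is a simplification; `engineerEquivalents` is a made-up name.

```typescript
// Hypothetical sketch: team size times percentage speedup, expressed in "engineers' worth".
function engineerEquivalents(teamSize: number, speedupPercent: number): number {
  return (teamSize * speedupPercent) / 100;
}

const smallOrgGain = engineerEquivalents(100, 10);
const largeOrgGain = engineerEquivalents(1000, 5);
console.log(smallOrgGain, largeOrgGain);
```

The same tool is worth more as the organization grows: here a smaller speedup in the larger org yields five times the gain.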


Career Progression

Junior → Mid → Senior → Staff Platform Engineer

Junior:
- Fix bugs in internal tools
- Write scripts
- Improve documentation

Mid:
- Build new internal tools
- Own small platforms
- Gather requirements from engineers

Senior:
- Design platform architecture
- Lead platform initiatives
- Set platform strategy
- Influence engineering culture

Staff:
- Company-wide platform vision
- Multi-year roadmap
- Cross-org impact
- Engineering effectiveness strategy

Skills to Develop

Technical:
- Backend engineering (APIs)
- Infrastructure (Docker, K8s)
- CI/CD systems
- Databases
- Monitoring & observability
- CLI development
- Web development (portals)

Soft Skills:
- Empathy for developers
- Product thinking
- Communication
- Stakeholder management
- Teaching & documentation

Conclusion

Platform Engineering is about leverage.

One great internal tool can:

  • Save thousands of engineering hours

  • Accelerate an entire organization

  • Reduce cognitive load

  • Improve developer happiness

Top 1% engineers understand:

  • Build tools, not just features

  • Automate repetition

  • Create golden paths

  • Reduce decisions

  • Measure developer experience

This is how you become indispensable.


This completes PART XI (b) — Developer Experience & Platform Engineering.

Build the tools that make engineers 10x faster.