Chaos Engineering
Chaos engineering lets you simulate real-world failure conditions — slow APIs, intermittent errors, service outages — so you can verify your application handles them gracefully.
Quick Start

```shell
# Start mockd with a mock endpoint
mockd serve &
mockd http add --path /api/users --body '[{"id":1,"name":"Alice"}]'

# Enable chaos: 200ms latency + 10% error rate
mockd chaos enable --latency 200ms --error-rate 0.1 --error-code 503

# Test it — some requests will be slow, some will fail
curl http://localhost:4280/api/users
curl http://localhost:4280/api/users
curl http://localhost:4280/api/users

# Check current chaos settings
mockd chaos status

# Disable when done
mockd chaos disable
```

CLI Commands
Enable Chaos

```shell
mockd chaos enable [flags]
```

| Flag | Type | Default | Description |
|---|---|---|---|
| --latency | string | — | Random latency range (e.g., 200ms, 100ms-500ms) |
| --error-rate | float | 0 | Fraction of requests that return errors (0.0–1.0) |
| --error-code | int | 500 | HTTP status code for error responses |
| --path | string | — | Regex pattern to scope chaos to specific paths |
| --probability | float | 1.0 | Probability of applying chaos at all (0.0–1.0) |
Check Status

```shell
mockd chaos status
```

Returns the current chaos configuration (latency, error rate, affected paths).
Disable Chaos

```shell
mockd chaos disable
```

Immediately removes all chaos injection. Requests return to normal behavior.
Chaos Profiles

Instead of manually configuring latency and error rates, use one of 10 built-in chaos profiles that simulate common failure scenarios:

```shell
# Apply a profile at startup
mockd serve --chaos-profile flaky

# Or apply at runtime
mockd chaos apply flaky

# List available profiles
mockd chaos profiles

# Disable when done
mockd chaos disable
```

Available Profiles
| Profile | Description | Latency | Error Rate | Bandwidth |
|---|---|---|---|---|
| slow-api | Slow upstream API | 500ms-2s | — | — |
| degraded | Partially degraded service | 200ms-800ms | 5% (503) | — |
| flaky | Unreliable with random errors | 0-100ms | 20% (500/502/503) | — |
| offline | Service completely down | — | 100% (503) | — |
| timeout | Connection timeout simulation | 30s fixed | — | — |
| rate-limited | Rate-limited API | 50ms-200ms | 30% (429) | — |
| mobile-3g | Mobile 3G network conditions | 300ms-800ms | 2% (503) | 50 KB/s |
| satellite | Satellite internet simulation | 600ms-2s | 5% (503) | 20 KB/s |
| dns-flaky | Intermittent DNS resolution failures | — | 10% (503) | — |
| overloaded | Overloaded server under heavy load | 1s-5s | 15% (500/502/503/504) | 100 KB/s |
Admin API for Profiles

```shell
# List all profiles
curl http://localhost:4290/chaos/profiles

# Get a specific profile
curl http://localhost:4290/chaos/profiles/flaky

# Apply a profile
curl -X POST http://localhost:4290/chaos/profiles/flaky/apply
```

Examples
Fixed Latency

Add a flat 200ms delay to every response:

```shell
mockd chaos enable --latency 200ms
```

Random Latency Range
Responses take between 100ms and 500ms (uniformly random):

```shell
mockd chaos enable --latency 100ms-500ms
```

Error Injection
10% of requests return HTTP 503:

```shell
mockd chaos enable --error-rate 0.1 --error-code 503
```

Combined Latency + Errors
Simulate a degraded upstream service — slow responses with occasional failures:

```shell
mockd chaos enable --latency 200ms-800ms --error-rate 0.05 --error-code 502
```

Path-Scoped Chaos
Only affect specific endpoints:

```shell
# Chaos only on /api/payments/* routes
mockd chaos enable --latency 500ms --error-rate 0.2 --error-code 500 --path "/api/payments/.*"
```

Other endpoints continue responding normally.
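The `--path` value is a regular expression, so it pays to sanity-check which routes a pattern actually covers before enabling chaos. A quick illustrative check (whether mockd anchors the pattern or searches within the path is an implementation detail; this sketch assumes substring search):

```python
import re

# The same pattern passed to --path above
pattern = re.compile(r"/api/payments/.*")

routes = ["/api/payments/charge", "/api/payments/refund/42", "/api/users"]
affected = [r for r in routes if pattern.search(r)]
print(affected)  # → ['/api/payments/charge', '/api/payments/refund/42']
```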
Partial Application

Apply chaos to only 50% of matching requests:

```shell
mockd chaos enable --latency 1s --probability 0.5
```

Admin API
You can also manage chaos via the Admin API (port 4290):

Get Current Settings

```shell
curl http://localhost:4290/chaos
```

Enable Chaos

```shell
curl -X PUT http://localhost:4290/chaos \
  -H 'Content-Type: application/json' \
  -d '{
    "enabled": true,
    "latency": {"min": "100ms", "max": "500ms", "probability": 1.0},
    "errorRate": {"probability": 0.1, "defaultCode": 503}
  }'
```

Disable Chaos

```shell
curl -X PUT http://localhost:4290/chaos \
  -H 'Content-Type: application/json' \
  -d '{"enabled": false}'
```

Use Cases
Timeout Testing

Verify your HTTP client’s timeout handling:

```shell
# Set latency higher than your client's timeout
mockd chaos enable --latency 10s
```
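On the application side, the handler under test might look like this minimal stdlib sketch (the function name and the fallback behavior are illustrative, not part of mockd):

```python
import socket
import urllib.request
import urllib.error

def fetch_users(url: str, timeout: float = 3.0):
    """Fetch the body, returning None instead of hanging when the mock is slow."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read().decode()
    except (TimeoutError, socket.timeout, urllib.error.URLError):
        return None  # timed out or unreachable: degrade gracefully

# With --latency 10s enabled, this should give up after ~3 seconds:
# fetch_users("http://localhost:4280/api/users")
```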
```shell
# Your app should time out and handle it gracefully
curl --max-time 3 http://localhost:4280/api/users
# curl: (28) Operation timed out after 3000 milliseconds
```

Circuit Breaker Testing
Verify your circuit breaker trips after enough failures:

```shell
# High error rate to trigger circuit breaker
mockd chaos enable --error-rate 0.8 --error-code 503

# Run your app and verify the circuit opens
# Then disable chaos and verify it closes
mockd chaos disable
```

Retry Logic Testing
Verify your retry logic with intermittent failures:

```shell
# Low error rate — retries should succeed
mockd chaos enable --error-rate 0.3 --error-code 500
```

CI/CD Resilience Tests
Run chaos in your test pipeline to catch resilience regressions:

```shell
#!/bin/bash
# Start mockd with your API mocks
mockd serve --config mocks.yaml &
sleep 2

# Run happy-path tests first
pytest tests/integration/ || exit 1

# Enable chaos and run resilience tests
mockd chaos enable --latency 500ms --error-rate 0.1 --error-code 503
pytest tests/resilience/ || exit 1

# Clean up
mockd chaos disable
mockd stop
```

Gradual Degradation
Simulate a service getting progressively worse:

```shell
# Start mild
mockd chaos enable --latency 50ms --error-rate 0.01

# Get worse
mockd chaos enable --latency 200ms --error-rate 0.05

# Service is struggling
mockd chaos enable --latency 1s --error-rate 0.2 --error-code 503

# Full outage
mockd chaos enable --error-rate 1.0 --error-code 503

# Recovery
mockd chaos disable
```

Using --json
All chaos commands support --json for scripting:

```shell
mockd chaos status --json
```

```json
{
  "enabled": true,
  "latency": "200ms",
  "errorRate": 0.1,
  "errorCode": 503
}
```

Stateful Fault Types
In addition to the 8 basic fault types (latency, error, timeout, corrupt body, empty response, slow body, connection reset, partial response), mockd supports 4 stateful fault types that maintain state across requests — simulating real-world failure patterns that evolve over time.
Circuit Breaker
Simulates a circuit breaker pattern with three states: closed (normal), open (failing), and half-open (testing recovery).

```shell
# Configure via Admin API with rules
curl -X PUT http://localhost:4290/chaos \
  -H 'Content-Type: application/json' \
  -d '{
    "enabled": true,
    "rules": [{
      "pathPattern": "/api/payments/.*",
      "faults": [{
        "type": "circuit_breaker",
        "probability": 1.0,
        "circuitBreaker": {
          "failureThreshold": 5,
          "recoveryTimeout": "30s",
          "halfOpenRequests": 2,
          "tripStatusCode": 503
        }
      }]
    }]
  }'
```

After failureThreshold failures, the circuit opens and all requests get 503. After recoveryTimeout, it enters the half-open state and allows halfOpenRequests test requests through. If those succeed, the circuit closes; if they fail, it re-opens.
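The lifecycle just described can be sketched as a small state machine. This is an illustrative model of the documented behavior, not mockd's actual implementation:

```python
class CircuitBreakerModel:
    """Toy model of the closed → open → half-open lifecycle described above."""

    def __init__(self, failure_threshold=5, recovery_timeout=30.0, half_open_requests=2):
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self.half_open_requests = half_open_requests
        self.state = "closed"
        self.failures = 0
        self.opened_at = 0.0
        self.half_open_successes = 0

    def on_request(self, upstream_ok: bool, now: float) -> int:
        # Open circuits reject immediately until the recovery timeout elapses.
        if self.state == "open":
            if now - self.opened_at < self.recovery_timeout:
                return 503  # tripStatusCode
            self.state, self.half_open_successes = "half-open", 0

        if upstream_ok:
            if self.state == "half-open":
                self.half_open_successes += 1
                if self.half_open_successes >= self.half_open_requests:
                    self.state, self.failures = "closed", 0
            else:
                self.failures = 0
            return 200

        # A failure in half-open re-opens; enough failures in closed trip it.
        self.failures += 1
        if self.state == "half-open" or self.failures >= self.failure_threshold:
            self.state, self.opened_at = "open", now
        return 500
```

With the defaults, five failures trip the circuit open; thirty simulated seconds later, two successful probe requests close it again.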
```shell
# Monitor circuit breaker state
mockd chaos faults

# Manually trip or reset
mockd chaos circuit-breaker trip 0:0
mockd chaos circuit-breaker reset 0:0
```

Retry-After
Returns 429 Too Many Requests or 503 Service Unavailable with a Retry-After header. After the specified duration, requests pass through normally.
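A well-behaved client sleeps for the server's hint before retrying. A minimal stdlib sketch you could exercise against this fault (the helper name is illustrative):

```python
import time
import urllib.request
import urllib.error

def get_with_retry_after(url: str, max_attempts: int = 3) -> bytes:
    """GET, honoring the Retry-After header on 429/503 responses."""
    for attempt in range(max_attempts):
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                return resp.read()
        except urllib.error.HTTPError as e:
            if e.code not in (429, 503) or attempt == max_attempts - 1:
                raise
            hint = (e.headers or {}).get("Retry-After", "1")
            try:
                delay = float(hint)
            except ValueError:
                delay = 1.0  # HTTP-date form of Retry-After: just wait a bit
            time.sleep(delay)
    raise RuntimeError("unreachable")
```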
```shell
curl -X PUT http://localhost:4290/chaos \
  -H 'Content-Type: application/json' \
  -d '{
    "enabled": true,
    "rules": [{
      "pathPattern": "/api/.*",
      "faults": [{
        "type": "retry_after",
        "probability": 1.0,
        "retryAfter": {
          "statusCode": 429,
          "retryAfterSeconds": 30
        }
      }]
    }]
  }'
```

Progressive Degradation
Latency increases with each request, simulating a service that gets slower under load. Optionally starts returning errors after enough requests.
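With the example configuration below, the added delay grows linearly until it hits maxDelay. A sketch of that curve, assuming the increment applies once per request (which the parameter names suggest, but is not confirmed by the docs):

```python
def degraded_delay_ms(request_number: int,
                      initial_delay_ms: int = 10,
                      delay_increment_ms: int = 50,
                      max_delay_ms: int = 5000) -> int:
    """Added latency for the Nth request (1-based), capped at max_delay_ms."""
    delay = initial_delay_ms + (request_number - 1) * delay_increment_ms
    return min(delay, max_delay_ms)

print([degraded_delay_ms(n) for n in (1, 2, 3, 100, 200)])
# → [10, 60, 110, 4960, 5000] — the 5s cap dominates from request 101 on
```

Under the same configuration, errorAfterRequests would additionally start returning 503s once 100 requests have been served.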
```shell
curl -X PUT http://localhost:4290/chaos \
  -H 'Content-Type: application/json' \
  -d '{
    "enabled": true,
    "rules": [{
      "pathPattern": "/api/.*",
      "faults": [{
        "type": "progressive_degradation",
        "probability": 1.0,
        "progressiveDegradation": {
          "initialDelay": "10ms",
          "delayIncrement": "50ms",
          "maxDelay": "5s",
          "errorAfterRequests": 100,
          "errorStatusCode": 503
        }
      }]
    }]
  }'
```

Chunked Dribble
Delivers the response body in timed chunks instead of all at once, simulating slow or unstable network transfers.
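With chunkCount 5 and totalDuration 2s (the configuration shown below), a body is split into five pieces written roughly 400ms apart. A toy model of the splitting (how mockd rounds uneven chunks is an assumption):

```python
def dribble_chunks(body: bytes, chunk_count: int = 5) -> list[bytes]:
    """Split a body into chunk_count nearly equal pieces."""
    size, rem = divmod(len(body), chunk_count)
    chunks, start = [], 0
    for i in range(chunk_count):
        end = start + size + (1 if i < rem else 0)  # spread the remainder
        chunks.append(body[start:end])
        start = end
    return chunks

body = b'[{"id":1,"name":"Alice"}]'  # 25 bytes
print([len(c) for c in dribble_chunks(body)])  # → [5, 5, 5, 5, 5]
# each chunk would be written ~400ms apart (2s / 5 chunks)
```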
```shell
curl -X PUT http://localhost:4290/chaos \
  -H 'Content-Type: application/json' \
  -d '{
    "enabled": true,
    "rules": [{
      "pathPattern": "/api/.*",
      "faults": [{
        "type": "chunked_dribble",
        "probability": 1.0,
        "chunkedDribble": {
          "chunkCount": 5,
          "totalDuration": "2s"
        }
      }]
    }]
  }'
```

Monitoring Stateful Faults
Use the CLI or MCP tools to inspect stateful fault state:

```shell
# View all stateful fault instances
mockd chaos faults

# Via MCP tools:
# get_stateful_faults — returns circuit breaker states, retry-after counters, degradation progress
# manage_circuit_breaker — trip or reset circuit breakers by key
```

Fault Type Reference
| Fault Type | Category | Description |
|---|---|---|
| latency | Basic | Adds random latency to responses |
| error | Basic | Returns error status codes |
| timeout | Basic | Simulates connection timeout |
| corrupt_body | Basic | Corrupts response body data |
| empty_response | Basic | Returns empty body |
| slow_body | Basic | Drip-feeds response data slowly |
| connection_reset | Basic | Simulates TCP connection reset |
| partial_response | Basic | Truncates response at random point |
| circuit_breaker | Stateful | Closed → open → half-open state machine |
| retry_after | Stateful | 429/503 with Retry-After header, auto-recovers |
| progressive_degradation | Stateful | Latency increases over time, optional errors |
| chunked_dribble | Stateful | Delivers body in timed chunks |
- Chaos applies to all protocols that run over HTTP (HTTP mocks, GraphQL, SOAP, SSE). gRPC and MQTT have their own transports and are not affected by HTTP chaos.
- Latency is added on top of any delayMs configured on individual mocks.
- When both latency and error rate are enabled, the error check happens first — if a request is selected for an error, it returns immediately with the error code (no latency added).
- Chaos settings are runtime-only — they reset when mockd restarts. They are not persisted in config files.
- Stateful faults use a rules-based configuration with pathPattern matching, allowing different fault types on different routes.
- Use get_stateful_faults (MCP) or mockd chaos faults (CLI) to monitor stateful fault state machines.