
Chaos Engineering

Chaos engineering lets you simulate real-world failure conditions — slow APIs, intermittent errors, service outages — so you can verify your application handles them gracefully.

```sh
# Start mockd with a mock endpoint
mockd serve &
mockd http add --path /api/users --body '[{"id":1,"name":"Alice"}]'

# Enable chaos: 200ms latency + 10% error rate
mockd chaos enable --latency 200ms --error-rate 0.1 --error-code 503

# Test it — some requests will be slow, some will fail
curl http://localhost:4280/api/users
curl http://localhost:4280/api/users
curl http://localhost:4280/api/users

# Check current chaos settings
mockd chaos status

# Disable when done
mockd chaos disable
```
```sh
mockd chaos enable [flags]
```

| Flag | Type | Default | Description |
| --- | --- | --- | --- |
| `--latency` | string | — | Random latency range (e.g., `200ms`, `100ms-500ms`) |
| `--error-rate` | float | `0` | Fraction of requests that return errors (0.0–1.0) |
| `--error-code` | int | `500` | HTTP status code for error responses |
| `--path` | string | — | Regex pattern to scope chaos to specific paths |
| `--probability` | float | `1.0` | Probability of applying chaos at all (0.0–1.0) |
```sh
mockd chaos status
```

Returns the current chaos configuration (latency, error rate, affected paths).

```sh
mockd chaos disable
```

Immediately removes all chaos injection. Requests return to normal behavior.

Instead of manually configuring latency and error rates, you can apply one of the ten built-in chaos profiles that simulate common failure scenarios:

```sh
# Apply a profile at startup
mockd serve --chaos-profile flaky

# Or apply at runtime
mockd chaos apply flaky

# List available profiles
mockd chaos profiles

# Disable when done
mockd chaos disable
```
| Profile | Description | Latency | Error Rate | Bandwidth |
| --- | --- | --- | --- | --- |
| `slow-api` | Slow upstream API | 500ms-2s | — | — |
| `degraded` | Partially degraded service | 200ms-800ms | 5% (503) | — |
| `flaky` | Unreliable with random errors | 0-100ms | 20% (500/502/503) | — |
| `offline` | Service completely down | — | 100% (503) | — |
| `timeout` | Connection timeout simulation | 30s fixed | — | — |
| `rate-limited` | Rate-limited API | 50ms-200ms | 30% (429) | — |
| `mobile-3g` | Mobile 3G network conditions | 300ms-800ms | 2% (503) | 50 KB/s |
| `satellite` | Satellite internet simulation | 600ms-2s | 5% (503) | 20 KB/s |
| `dns-flaky` | Intermittent DNS resolution failures | — | 10% (503) | — |
| `overloaded` | Overloaded server under heavy load | 1s-5s | 15% (500/502/503/504) | 100 KB/s |
```sh
# List all profiles
curl http://localhost:4290/chaos/profiles

# Get a specific profile
curl http://localhost:4290/chaos/profiles/flaky

# Apply a profile
curl -X POST http://localhost:4290/chaos/profiles/flaky/apply
```

Add a flat 200ms delay to every response:

```sh
mockd chaos enable --latency 200ms
```

Responses take between 100ms and 500ms (uniformly random):

```sh
mockd chaos enable --latency 100ms-500ms
```

10% of requests return HTTP 503:

```sh
mockd chaos enable --error-rate 0.1 --error-code 503
```

Simulate a degraded upstream service — slow responses with occasional failures:

```sh
mockd chaos enable --latency 200ms-800ms --error-rate 0.05 --error-code 502
```

Only affect specific endpoints:

```sh
# Chaos only on /api/payments/* routes
mockd chaos enable --latency 500ms --error-rate 0.2 --error-code 500 --path "/api/payments/.*"
```

Other endpoints continue responding normally.

Apply chaos to only 50% of matching requests:

```sh
mockd chaos enable --latency 1s --probability 0.5
```

You can also manage chaos via the Admin API (port 4290):

```sh
curl http://localhost:4290/chaos
```
```sh
curl -X PUT http://localhost:4290/chaos -H 'Content-Type: application/json' -d '{
  "enabled": true,
  "latency": {"min": "100ms", "max": "500ms", "probability": 1.0},
  "errorRate": {"probability": 0.1, "defaultCode": 503}
}'
```
```sh
curl -X PUT http://localhost:4290/chaos -H 'Content-Type: application/json' -d '{
  "enabled": false
}'
```

Verify your HTTP client’s timeout handling:

```sh
# Set latency higher than your client's timeout
mockd chaos enable --latency 10s

# Your app should timeout and handle it gracefully
curl --max-time 3 http://localhost:4280/api/users
# curl: (28) Operation timed out after 3000 milliseconds
```

Verify your circuit breaker trips after enough failures:

```sh
# High error rate to trigger circuit breaker
mockd chaos enable --error-rate 0.8 --error-code 503

# Run your app and verify the circuit opens
# Then disable chaos and verify it closes
mockd chaos disable
```

Verify your retry logic with intermittent failures:

```sh
# Low error rate — retries should succeed
mockd chaos enable --error-rate 0.3 --error-code 500
```
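The client-side loop being exercised here can be sketched as follows (a hedged illustration, not mockd code; `flaky_call` is a hypothetical stand-in for your HTTP request, simulating the 30% error rate above):

```python
import random

def call_with_retries(call, max_attempts=3):
    """Return the first successful result, retrying up to max_attempts times."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return call()
        except RuntimeError as err:  # stand-in for an HTTP 5xx or transport error
            last_error = err
            # real clients would sleep with exponential backoff here
    raise last_error

# Simulates `--error-rate 0.3`: roughly 30% of calls raise.
_rng = random.Random(7)

def flaky_call():
    if _rng.random() < 0.3:
        raise RuntimeError("HTTP 500")
    return "ok"
```

With a 30% error rate and three attempts, a request only fails outright about 2.7% of the time (0.3³), so resilience tests against this configuration should pass almost every run.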

Run chaos in your test pipeline to catch resilience regressions:

```sh
#!/bin/bash
# Start mockd with your API mocks
mockd serve --config mocks.yaml &
sleep 2

# Run happy-path tests first
pytest tests/integration/ || exit 1

# Enable chaos and run resilience tests
mockd chaos enable --latency 500ms --error-rate 0.1 --error-code 503
pytest tests/resilience/ || exit 1

# Clean up
mockd chaos disable
mockd stop
```

Simulate a service getting progressively worse:

```sh
# Start mild
mockd chaos enable --latency 50ms --error-rate 0.01

# Get worse
mockd chaos enable --latency 200ms --error-rate 0.05

# Service is struggling
mockd chaos enable --latency 1s --error-rate 0.2 --error-code 503

# Full outage
mockd chaos enable --error-rate 1.0 --error-code 503

# Recovery
mockd chaos disable
```

All chaos commands support `--json` for scripting:

```sh
mockd chaos status --json
```

```json
{
  "enabled": true,
  "latency": "200ms",
  "errorRate": 0.1,
  "errorCode": 503
}
```
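In scripts you can consume this output with any JSON-aware tool (`jq`, Python, etc.); a minimal Python sketch, assuming the field names shown above:

```python
import json

def parse_chaos_status(raw: str):
    """Parse `mockd chaos status --json` output into (enabled, error_rate)."""
    status = json.loads(raw)
    return bool(status.get("enabled", False)), float(status.get("errorRate", 0.0))

# Example payload matching the output shown above.
example = '{"enabled": true, "latency": "200ms", "errorRate": 0.1, "errorCode": 503}'
```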

In addition to the 8 basic fault types (latency, error, timeout, corrupt body, empty response, slow body, connection reset, partial response), mockd supports 4 stateful fault types that maintain state across requests — simulating real-world failure patterns that evolve over time.

Simulates a circuit breaker pattern with three states: closed (normal), open (failing), and half-open (testing recovery).

```sh
# Configure via Admin API with rules
curl -X PUT http://localhost:4290/chaos -H 'Content-Type: application/json' -d '{
  "enabled": true,
  "rules": [{
    "pathPattern": "/api/payments/.*",
    "faults": [{
      "type": "circuit_breaker",
      "probability": 1.0,
      "circuitBreaker": {
        "failureThreshold": 5,
        "recoveryTimeout": "30s",
        "halfOpenRequests": 2,
        "tripStatusCode": 503
      }
    }]
  }]
}'
```

After `failureThreshold` failures, the circuit opens and all requests get 503. After `recoveryTimeout`, it enters the half-open state and allows `halfOpenRequests` test requests through. If those succeed, the circuit closes; if they fail, it re-opens.
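That lifecycle can be modeled with a small state machine (a simplified sketch for intuition, not mockd's implementation):

```python
class CircuitBreaker:
    """Toy model of the closed -> open -> half-open cycle described above."""

    def __init__(self, failure_threshold=5, recovery_timeout=30.0, half_open_requests=2):
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self.half_open_requests = half_open_requests
        self.state = "closed"
        self.failures = 0
        self.opened_at = 0.0
        self.half_open_ok = 0

    def allow(self, now):
        """Would a request at time `now` pass through, or be rejected (e.g. 503)?"""
        if self.state == "open":
            if now - self.opened_at >= self.recovery_timeout:
                self.state = "half-open"   # let a few test requests through
                self.half_open_ok = 0
            else:
                return False               # open: everything gets the trip status code
        return True

    def record(self, success, now):
        """Feed back the outcome of a request that was allowed through."""
        if success:
            if self.state == "half-open":
                self.half_open_ok += 1
                if self.half_open_ok >= self.half_open_requests:
                    self.state = "closed"  # recovery confirmed
                    self.failures = 0
            else:
                self.failures = 0
        else:
            self.failures += 1
            if self.state == "half-open" or self.failures >= self.failure_threshold:
                self.state = "open"        # trip (or re-trip) the breaker
                self.opened_at = now
```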

```sh
# Monitor circuit breaker state
mockd chaos faults

# Manually trip or reset
mockd chaos circuit-breaker trip 0:0
mockd chaos circuit-breaker reset 0:0
```

Returns `429 Too Many Requests` or `503 Service Unavailable` with a `Retry-After` header. After the specified duration, requests pass through normally.

```sh
curl -X PUT http://localhost:4290/chaos -H 'Content-Type: application/json' -d '{
  "enabled": true,
  "rules": [{
    "pathPattern": "/api/.*",
    "faults": [{
      "type": "retry_after",
      "probability": 1.0,
      "retryAfter": {
        "statusCode": 429,
        "retryAfterSeconds": 30
      }
    }]
  }]
}'
```
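A well-behaved client honors the `Retry-After` header instead of retrying immediately; a minimal sketch of that handling (a hypothetical helper, covering the delta-seconds form only):

```python
def retry_delay(status_code, headers, default=1.0):
    """Seconds to wait before retrying, honoring Retry-After on 429/503."""
    if status_code not in (429, 503):
        return 0.0                  # not a rate-limit/unavailable response
    value = headers.get("Retry-After")
    if value is not None:
        try:
            return float(value)     # delta-seconds form, e.g. "30"
        except ValueError:
            pass                    # HTTP-date form is out of scope for this sketch
    return default
```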

Latency increases with each request, simulating a service that gets slower under load. Optionally starts returning errors after enough requests.

```sh
curl -X PUT http://localhost:4290/chaos -H 'Content-Type: application/json' -d '{
  "enabled": true,
  "rules": [{
    "pathPattern": "/api/.*",
    "faults": [{
      "type": "progressive_degradation",
      "probability": 1.0,
      "progressiveDegradation": {
        "initialDelay": "10ms",
        "delayIncrement": "50ms",
        "maxDelay": "5s",
        "errorAfterRequests": 100,
        "errorStatusCode": 503
      }
    }]
  }]
}'
```
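The delay schedule this produces can be sketched as follows (a model of the documented behavior; whether an error replaces or accompanies the delay is an assumption of this sketch):

```python
def degraded_response(request_n, initial_ms=10, increment_ms=50, max_ms=5000,
                      error_after=100, error_code=503):
    """(delay_ms, status) for the nth request (0-based) under the config above."""
    if error_after is not None and request_n >= error_after:
        return 0, error_code  # errors kick in once enough requests have been seen
    # Linear growth, capped at maxDelay.
    delay = min(initial_ms + request_n * increment_ms, max_ms)
    return delay, None
```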

Delivers the response body in timed chunks instead of all at once, simulating slow or unstable network transfers.

```sh
curl -X PUT http://localhost:4290/chaos -H 'Content-Type: application/json' -d '{
  "enabled": true,
  "rules": [{
    "pathPattern": "/api/.*",
    "faults": [{
      "type": "chunked_dribble",
      "probability": 1.0,
      "chunkedDribble": {
        "chunkCount": 5,
        "totalDuration": "2s"
      }
    }]
  }]
}'
```
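The chunking math behind this fault can be sketched as (an illustrative model, not mockd's implementation):

```python
def dribble_plan(body: bytes, chunk_count: int, total_duration_s: float):
    """Split a response body into chunk_count pieces spread over total_duration_s.

    Returns a list of (chunk, delay_before_send_s) pairs.
    """
    size = max(1, -(-len(body) // chunk_count))   # ceil division, at least 1 byte
    interval = total_duration_s / chunk_count      # even spacing across the window
    return [(body[i:i + size], interval) for i in range(0, len(body), size)]
```

With `chunkCount: 5` and `totalDuration: "2s"`, a 10-byte body arrives as five 2-byte chunks roughly 400ms apart.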

Use the CLI or MCP tools to inspect stateful fault state:

```sh
# View all stateful fault instances
mockd chaos faults
```

Via MCP, `get_stateful_faults` returns circuit breaker states, retry-after counters, and degradation progress; `manage_circuit_breaker` trips or resets circuit breakers by key.
| Fault Type | Category | Description |
| --- | --- | --- |
| `latency` | Basic | Adds random latency to responses |
| `error` | Basic | Returns error status codes |
| `timeout` | Basic | Simulates a connection timeout |
| `corrupt_body` | Basic | Corrupts response body data |
| `empty_response` | Basic | Returns an empty body |
| `slow_body` | Basic | Drip-feeds response data slowly |
| `connection_reset` | Basic | Simulates a TCP connection reset |
| `partial_response` | Basic | Truncates the response at a random point |
| `circuit_breaker` | Stateful | Closed → open → half-open state machine |
| `retry_after` | Stateful | 429/503 with `Retry-After` header, auto-recovers |
| `progressive_degradation` | Stateful | Latency increases over time, optional errors |
| `chunked_dribble` | Stateful | Delivers the body in timed chunks |
- Chaos applies to all protocols that run over HTTP (HTTP mocks, GraphQL, SOAP, SSE). gRPC and MQTT have their own transports and are not affected by HTTP chaos.
- Latency is added on top of any `delayMs` configured on individual mocks.
- When both latency and error rate are enabled, the error check happens first: if a request is selected for an error, it returns immediately with the error code (no latency added).
- Chaos settings are runtime-only; they reset when mockd restarts and are not persisted in config files.
- Stateful faults use a rules-based configuration with `pathPattern` matching, allowing different fault types on different routes.
- Use `get_stateful_faults` (MCP) or `mockd chaos faults` (CLI) to monitor stateful fault state machines.
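The error-before-latency ordering implies a per-request decision like this sketch (a model of the documented behavior, not mockd's source):

```python
import random
import time

def chaos_middleware(handler, error_rate=0.1, error_code=503, latency_s=0.2, rng=None):
    """Model of the documented ordering: error check first, then latency."""
    rng = rng or random.Random()
    if rng.random() < error_rate:
        return error_code      # selected for an error: return immediately, no latency
    time.sleep(latency_s)      # otherwise inject latency before the real handler runs
    return handler()
```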