fault at a glance

Fault Injection

fault run
   --with-latency
   --latency-mean 300
   --latency-stddev 80

Change Latency
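The latency fault adds a delay drawn from a normal distribution with the given mean and standard deviation. The sampling itself happens inside fault; the sketch below only illustrates what `--latency-mean 300 --latency-stddev 80` implies about the injected delays (the zero-clamp is an assumption, since a negative delay is meaningless):

```python
import random

def sample_latency_ms(mean=300.0, stddev=80.0, rng=None):
    """Sample one injected delay in milliseconds, clamped at zero."""
    rng = rng or random.Random()
    return max(0.0, rng.gauss(mean, stddev))

rng = random.Random(42)
samples = [sample_latency_ms(rng=rng) for _ in range(10_000)]
print(sum(samples) / len(samples))  # close to 300
```

Most delays land within roughly one stddev of the mean, so downstream timeouts tighter than ~380 ms will start to trip.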

fault run
   --with-bandwidth
   --bandwidth-rate 512
   --bandwidth-unit Kbps

Change Bandwidth
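A bandwidth cap puts a floor under transfer time: a payload of N bits needs at least N / rate seconds. A back-of-the-envelope helper, assuming `Kbps` means kilobits per second and `Bps` bytes per second (the unit names mirror the `--bandwidth-unit` flag; this is an illustration, not fault's API):

```python
# Bits per second represented by one unit of each rate.
UNIT_BITS_PER_SEC = {"Bps": 8, "KBps": 8_000, "Kbps": 1_000, "Mbps": 1_000_000}

def min_transfer_seconds(payload_bytes, rate, unit="Kbps"):
    """Lower bound on transfer time under a bandwidth cap (ignores protocol overhead)."""
    bits = payload_bytes * 8
    return bits / (rate * UNIT_BITS_PER_SEC[unit])

# A 1 MiB response through the 512 Kbps cap above:
print(round(min_transfer_seconds(1_048_576, 512, "Kbps"), 1))  # ≈ 16.4 s
```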

fault run
   --with-blackhole

Send to Blackhole

fault run
   --with-packet-loss

Lose Packets
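Packet loss compounds: with independent loss probability p, a message spread over k packets arrives intact with probability (1 - p)^k, so even a small loss rate degrades larger payloads quickly. A tiny illustration (the 5% figure is an example, not a fault default):

```python
def delivery_probability(loss_rate, packets):
    """Chance that every packet of a message survives, assuming independent drops."""
    return (1 - loss_rate) ** packets

# 5% loss hurts more as payloads span more packets:
for n in (1, 10, 50):
    print(n, round(delivery_probability(0.05, n), 3))  # 0.95, 0.599, 0.077
```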

fault run
   --with-latency
   --latency-mean 300
   --latency-sched "start:5%,duration:25%;start:70%"
   --with-bandwidth
   --bandwidth-rate 512
   --bandwidth-sched "start:55%,duration:30%"
   --duration 10m
   --upstream=*
   
    Welcome to fault. — Your Reliability Toolbox!

    Enabled Proxies:
     - http://127.0.0.1:3180 [HTTP: forward]
       Target Upstreams: 🎯 All Hosts
     - http://127.0.0.1:3180 [HTTP: tunnel]
       Target Upstreams: 🎯 All Hosts

    Plugins:
     No plugins provided.

    Configured Faults:
     - Latency: per read/write: false, side: Server, direction: Ingress,
       distribution: Normal, mean: 300ms
     - Bandwidth: side: Server, direction: Ingress, rate: 512 Bps

    Faults Schedule:
        Latency: ...██████████████......................████████████████
      Bandwidth: ...............................████████████████........

    Status:
     ⠹ Progress: --🐢------------------------------------------------ 🏁
     Elapsed 00:00:12  | Remaining 98.00% | Total 10 minutes
     ⠏ Waiting for incoming traffic...             

Create Complex Scenarios
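The `--latency-sched` and `--bandwidth-sched` strings express fault windows as percentages of the total run, which is what the schedule bars above visualize. A sketch of how such a spec maps to absolute windows — the grammar here is inferred from the example, not taken from an official specification:

```python
def parse_schedule(spec, total_seconds):
    """Turn 'start:5%,duration:25%;start:70%' into (start, end) windows in seconds.

    A window with no duration runs to the end of the experiment.
    """
    total = float(total_seconds)

    def pct(value):
        return float(value.rstrip("%")) / 100.0

    windows = []
    for item in spec.split(";"):
        fields = dict(kv.split(":", 1) for kv in item.split(","))
        start = pct(fields["start"]) * total
        end = start + pct(fields["duration"]) * total if "duration" in fields else total
        windows.append((start, end))
    return windows

# Against the 10-minute run above:
print(parse_schedule("start:5%,duration:25%;start:70%", 600))
# → [(30.0, 180.0), (420.0, 600.0)]
```

So the latency fault is active from 0:30 to 3:00 and again from 7:00 to the end, matching the first schedule bar.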

Platform Injection

fault inject aws
   --with-latency
   --latency-mean 300
   --latency-stddev 80
   --duration 30s
   
 Deploying fault...
  Injected into service my-service-45tr for 30s
  

AWS Fault Injection

fault inject gcp
   --with-latency
   --latency-mean 300
   --latency-stddev 80
   --duration 30s
   
 Deploying fault...
  Injected into service my-service-45tr for 30s
  

GCP Fault Injection

fault inject kubernetes
   --with-latency
   --latency-mean 300
   --latency-stddev 80
   --duration 30s
   
 Deploying fault...
  Injected into service my-service-45tr for 30s
  

Kubernetes Fault Injection

Scenario Automation

fault scenario generate
   --spec-url http://example.com/openapi.json
   --scenario-file scenario.yaml

Generated 8 reliability scenarios across 1 endpoint!

Generate Scenario

fault scenario run
   --scenario-file scenario.yaml
   
================ Running Scenarios ================

⠏  1/1  [00:00:00] Single high-latency spike (client ingress) ▮ [GET http://localhost:9090/]
⠏  6/6  [00:00:00] Stair-step latency growth (5 x 100 ms) ▮▮▮▮▮▮ [GET http://localhost:9090/]
⠏  1/1  [00:00:10] Periodic 150-250 ms latency pulses during load ▮ [GET http://localhost:9090/]
⠏  1/1  [00:00:00] 5% packet loss for 4s ▮ [GET http://localhost:9090/]
⠏  1/1  [00:00:00] High jitter (±80ms @ 8Hz) ▮ [GET http://localhost:9090/]
⠏  1/1  [00:00:15] 512 KBps bandwidth cap ▮ [GET http://localhost:9090/]
⠏  1/1  [00:00:08] Random 500 errors (5% of calls) ▮ [GET http://localhost:9090/]
⠏  1/1  [00:00:10] Full black-hole for 1s ▮ [GET http://localhost:9090/]                                                    

===================== Summary =====================

Tests run: 13, Tests failed: 0
Total time: 45.1s

Run Scenario

AI Reviews

fault agent code-review --scenario-dir app/

AI Code Reviews

fault agent scenario-review

AI Scenario Reviews

fault agent platform-review kubernetes

AI Platform Reviews

AI Agent

fault agent tool

AI MCP Server

  • Integrate with any MCP client (Kilo Code, Cursor...)
  • Expose fault capabilities over MCP tools
  • Generate unified-diff changesets against your code
 @app.get("/")
-def index():
-    return httpx.get(f"{UPSTREAM_URL}/todos/1", headers={
-        "Host": "jsonplaceholder.typicode.com"
-    }).json()
+async def index():
+    '''
+    Retrieves a todo item from the upstream service.
+    Implements retry logic, timeouts, and error handling for resilience.
+    '''
+    try:
+        response = await app.state.client.get("/todos/1")
+        response.raise_for_status()
+    except httpx.RequestError as exc:
+        logger.error(f"Request error: {exc}")
+        raise HTTPException(status_code=503, detail="Upstream service unavailable")
+    except httpx.HTTPStatusError as exc:
+        logger.error(f"Invalid response {exc.response.status_code}: {exc}")
+        raise HTTPException(status_code=502, detail="Upstream returned an error")
+
+    return response.json()