
Explore fault MCP Agent Tools

This guide will take you through the MCP tools supported by the fault agent.

Prerequisites

  • Install fault

    If you haven’t installed fault yet, follow the installation instructions.

  • Get an OpenAI Key

    For the purpose of the guide, we will be using OpenAI models. You need to create an API key. Then make sure the key is available for fault:

    export OPENAI_API_KEY=sk-...
    

    The agent also works with Gemini, Ollama, and OpenRouter, so you may switch to any of them. The end-to-end example at the end of this guide shows how the provider is selected; a short sketch follows below.
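
    For illustration, here is a minimal sketch of an MCP server configuration asking fault to use a local Ollama instead of OpenAI (assuming you have pulled the gemma3:4b and mxbai-embed-large models). The flags mirror the ones used by the end-to-end example at the end of this guide:

    # Hypothetical MCP server entry switching fault's agent to Ollama.
    # The flags come from the end-to-end example later in this guide.
    config = {
        "mcpServers": {
            "local": {
                "command": "fault",
                "args": [
                    "agent",
                    "--llm-client", "ollama",
                    "--llm-prompt-reasoning-model", "gemma3:4b",
                    "--llm-embed-model", "mxbai-embed-large",
                    "--llm-embed-model-dim", "1024",
                    "tool",
                ],
            },
        }
    }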

  • Install a local qdrant database

    fault uses qdrant for its vector database. You can run a free, local qdrant instance using Docker:

    docker run -p 6333:6333 -p 6334:6334 -v "$(pwd)/qdrant_storage:/qdrant/storage:z" qdrant/qdrant
    

    While not every tool uses it, we suggest starting one so you can explore all of them.
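
    To verify the database is reachable before moving on, you can, for example, probe its HTTP port with httpx (this quick check is our own, not part of fault):

    import httpx

    # Qdrant's REST API listens on port 6333; the root endpoint returns
    # basic version information when the service is up.
    resp = httpx.get("http://localhost:6333/")
    resp.raise_for_status()
    print(resp.json())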

  • Install the Python FastMCP library

    fault itself does not need this library, but to demonstrate the tools it supports we will be using FastMCP. You can install it with pip install fastmcp.

fault with Cursor

fault's MCP tools can also be used from Cursor to help it make the generated code more production ready.

Overview

The fault AI agent is called as follows:

fault agent tool

It supports the MCP specification.
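
As a quick sanity check, you can connect to the agent over stdio with FastMCP and list the tools it exposes. This short sketch assumes fault is on your PATH:

import asyncio

from fastmcp import Client


async def list_fault_tools() -> None:
    # Spawn `fault agent tool` as a local MCP server over stdio
    config = {"mcpServers": {"local": {"command": "fault", "args": ["agent", "tool"]}}}
    async with Client(config) as client:
        for tool in await client.list_tools():
            print(tool.name)


if __name__ == "__main__":
    asyncio.run(list_fault_tools())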

Get a demo application

For the purpose of demonstration, we suggest you run the following server with a single endpoint:

app/app.py
import os

import httpx
from fastapi import FastAPI

UPSTREAM_URL = os.getenv("UPSTREAM_URL", "https://jsonplaceholder.typicode.com")

app = FastAPI()


@app.get("/")
def index():
    return httpx.get(f"{UPSTREAM_URL}/todos/1", headers={
        "Host": "jsonplaceholder.typicode.com"
    }).json()

Install the dependencies using pip or uv:

pip install fastapi[standard] httpx
uv tool install fastapi[standard] httpx

Then run it as follows:

export UPSTREAM_URL=http://localhost:34000  # (1)!
fastapi dev app/app.py --port 9090
  1. This will ensure the remote call made from the endpoint goes through fault.

    Example

    For instance, try it without the agent first:

    fault run --with-latency --latency-mean 300 --proxy "34000=https://jsonplaceholder.typicode.com"
    

    If you now connect to the endpoint, it will go through fault and apply the latency on the remote call.
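
    For example, a quick timing check with httpx (assuming the FastAPI app from above is running on port 9090) should show roughly 300 ms of extra latency on the proxied upstream call:

    import time

    import httpx

    # Call the demo endpoint; its upstream request is routed through the fault proxy
    start = time.perf_counter()
    response = httpx.get("http://localhost:9090/")
    elapsed_ms = (time.perf_counter() - start) * 1000

    print(response.json())
    print(f"round trip took ~{elapsed_ms:.0f} ms")  # should include the injected latency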

Tool: Source Code Indexing

To get the best feedback from the fault AI agent, it is a good idea to index your source code locally. When performing certain operations, fault searches this index for the right documents to contextualize the query it performs.

  • Index your source code

    Tool "fault_index_source_code"

    Payload

    {
        "source_dir": "",
        "lang": "python"
    }
    

    The source_dir argument is an absolute path to the top-level directory containing code files. The lang argument hints which files to process.

    Returns

    The string "done" when complete. Any MCP error with a hint of what went wrong otherwise.

    Requirements

    • A qdrant URL
    • The LLM of your choice; in this example we use OpenAI, so you need to set the OPENAI_API_KEY environment variable

    Here is a full working example of calling this tool:

    import asyncio
    import os
    import shutil
    from pathlib import Path
    from tempfile import gettempdir
    
    from fastmcp import Client
    from fastmcp.utilities.logging import configure_logging
    
    async def main(source_dir: str, verbose: bool) -> None:
        fault_path = shutil.which("fault") # (1)!
        if not fault_path:
            print("fault: command not found")
            return
    
        args = []
        if verbose:
            configure_logging("DEBUG")
            args = ["--log-file", str(Path(gettempdir()) / "fault.log"), "--log-level", "debug"]
    
        args.append("agent")
        args.append("tool")
    
        config = {  # (2)!
            "mcpServers": {
                "local": {
                    "command": fault_path,
                    "args": args,
                    "env": {
                        "OPENAI_API_KEY": os.getenv("OPENAI_API_KEY")
                    }
                },
            }
        }
    
        async with Client(config) as client:   # (3)!
            p = await client.call_tool(   # (4)!
                "fault_index_source_code", {
                    "source_dir": source_dir,   # (5)!
                    "lang": "python"   # (6)!
                })
    
            print(p)
    
    if __name__ == "__main__":
        import argparse
        parser = argparse.ArgumentParser()
        parser.add_argument("--verbose",action='store_true')
        parser.add_argument("source_dir")
        args = parser.parse_args()
    
        asyncio.run(main(args.source_dir, args.verbose))
    
    1. Locate the fault binary
    2. Prepare a stdio configuration to call the tool
    3. Setup a client that handles agent initialization
    4. Call the fault_index_source_code tool from fault
    5. Pass the absolute path to the source code directory
    6. Hint of the language to index: python, go, rust, javascript...

Tip

You may see the logs from the fault call by setting --verbose:

tail -f /tmp/fault.log

Tool: List function names from a file

Context is everything! Being able to focus on a given function helps you make the right decisions.

  • List function names

    Tool "fault_list_function_names"

    Payload

    {
        "file": "",
    }
    

    The file argument is an absolute path to a source code file.

    Returns

    An array with any found function names.

    Requirements

    none

    Here is a full working example of calling this tool:

    import asyncio
    import os
    import shutil
    from pathlib import Path
    from tempfile import gettempdir
    
    from fastmcp import Client
    from fastmcp.utilities.logging import configure_logging
    
    async def main(code_file: str, verbose: bool) -> None:
        fault_path = shutil.which("fault") # (1)!
        if not fault_path:
            print("fault: command not found")
            return
    
        args = []
        if verbose:
            configure_logging("DEBUG")
            args = ["--log-file", str(Path(gettempdir()) / "fault.log"), "--log-level", "debug"]
    
        args.append("agent")
        args.append("tool")
    
        config = {  # (2)!
            "mcpServers": {
                "local": {
                    "command": fault_path,
                    "args": args,
                },
            }
        }
    
        async with Client(config) as client:   # (3)!
            p = await client.call_tool(   # (4)!
                "fault_list_function_names", {
                    "file": code_file,   # (5)!
                })
    
            print(p)
    
    if __name__ == "__main__":
        import argparse
        parser = argparse.ArgumentParser()
        parser.add_argument("--verbose",action='store_true')
        parser.add_argument("source_dir")
        args = parser.parse_args()
    
        asyncio.run(main(args.source_dir, args.verbose))
    
    1. Locate the fault binary
    2. Prepare a stdio configuration to call the tool
    3. Setup a client that handles agent initialization
    4. Call the fault_list_function_names tool from fault
    5. Pass the absolute path to the source code file

Tip

You may see the logs from the fault call by setting --verbose:

tail -f /tmp/fault.log

Tool: Extract function block

Extract the whole function block, including its signature and, for some languages, any decorators around the function.

  • Extract function block

    Tool "fault_extract_code_block"

    Payload

    {
        "file": "",
        "func": ""
    }
    

    The file argument is an absolute path to a source code file. The func argument is the name of the function within that module.

    Returns

    A JSON payload with two keys:

    • full: the complete function block including its signature
    • body: the function body without its signature

    Requirements

    none

    Here is a full working example of calling this tool:

    import asyncio
    import os
    import shutil
    from pathlib import Path
    from tempfile import gettempdir
    
    from fastmcp import Client
    from fastmcp.utilities.logging import configure_logging
    
    async def main(code_file: str, func_name: str, verbose: bool) -> None:
        fault_path = shutil.which("fault") # (1)!
        if not fault_path:
            print("fault: command not found")
            return
    
        args = []
        if verbose:
            configure_logging("DEBUG")
            args = ["--log-file", str(Path(gettempdir()) / "fault.log"), "--log-level", "debug"]
    
        args.append("agent")
        args.append("tool")
    
        config = {  # (2)!
            "mcpServers": {
                "local": {
                    "command": fault_path,
                    "args": args,
                },
            }
        }
    
        async with Client(config) as client:   # (3)!
            p = await client.call_tool(   # (4)!
                "fault_extract_code_block", {
                    "file": code_file,   # (5)!
                    "func": func_name   # (6)!
                })
    
            print(p)
    
    if __name__ == "__main__":
        import argparse
        parser = argparse.ArgumentParser()
        parser.add_argument("--verbose",action='store_true')
        parser.add_argument("source_dir")
        parser.add_argument("func_name")
        args = parser.parse_args()
    
        asyncio.run(main(args.source_dir, args.verbose))
    
    1. Locate the fault binary
    2. Prepare a stdio configuration to call the tool
    3. Setup a client that handles agent initialization
    4. Call the fault_extract_code_block tool from fault
    5. Pass the absolute path to the source code file
    6. The name of the function

Tip

You may see the logs from the fault call by setting --verbose:

tail -f /tmp/fault.log

Tool: Function Performance Scoring

Scoring the performance of a function will help you understand how much attention this function requires.

  • Evaluate function performance

    Tool "fault_score_performance"

    Payload

    {
        "snippet": "",
        "lang": ""
    }
    

    The snippet argument is the full code of a function or any code snippet. The lang argument hints at the language of the snippet.

    Returns

    A JSON payload with two keys:

    • explanation: a short description of the reasoning behind the score
    • score: the score as a number between 0.0 and 1.0

    Requirements

    • A qdrant URL
    • The LLM of your choice; in this example we use OpenAI, so you need to set the OPENAI_API_KEY environment variable

    Here is a full working example of calling this tool:

    import asyncio
    import json
    import os
    import shutil
    from pathlib import Path
    from tempfile import gettempdir
    
    from fastmcp import Client
    from fastmcp.utilities.logging import configure_logging
    
    
    async def main(code_file: str, func_name: str, lang: str, verbose: bool) -> None:
        fault_path = shutil.which("fault")   # (1)!
        if not fault_path:
            print("fault: command not found")
            return
    
        args = []
        if verbose:
            configure_logging("DEBUG")
            args = ["--log-file", str(Path(gettempdir()) / "fault.log"), "--log-level", "debug"]
    
        args.append("agent")
        args.append("tool")
    
        config = {
            "mcpServers": {
                "local": {
                    "command": fault_path,
                    "args": args,
                    "env": {
                        "OPENAI_API_KEY": os.getenv("OPENAI_API_KEY")
                    }
                },
            }
        }
    
        async with Client(config) as client:
            p = await client.call_tool(   # (2)!
                "fault_extract_code_block", {
                    "file": code_file,
                    "func": func_name
                })
    
            snippet = json.loads(p[0].text)["full"]   # (3)!
    
            p = await client.call_tool(
                "fault_score_performance", {
                    "snippet": snippet,
                    "lang": lang
                })
    
            print(p)
    
    
    if __name__ == "__main__":
        import argparse
        parser = argparse.ArgumentParser()
        parser.add_argument("--verbose",action='store_true')
        parser.add_argument("code_file")
        parser.add_argument("func_name")
        parser.add_argument("lang")
        args = parser.parse_args()
    
        asyncio.run(main(args.code_file, args.func_name, args.lang, args.verbose))
    
    1. Locate the fault binary
    2. Extract the function block
    3. Take the snippet from the tool's response

Tip

You may see the logs from the fault call by setting --verbose:

tail -f /tmp/fault.log

Tool: Function Reliability Scoring

Scoring the reliability of a function will help you understand how much attention this function requires.

  • Evaluate function reliability

    Tool "fault_score_reliability"

    Payload

    {
        "snippet": "",
        "lang": ""
    }
    

    The snippet argument is the full code of a function or any code snippet. The lang argument hints at the language of the snippet.

    Returns

    A JSON payload with two keys:

    • explanation: a short description of the reasoning behind the score
    • score: the score as a number between 0.0 and 1.0

    Requirements

    • A qdrant URL
    • The LLM of your choice; in this example we use OpenAI, so you need to set the OPENAI_API_KEY environment variable

    Here is a full working example of calling this tool:

    import asyncio
    import json
    import os
    import shutil
    from pathlib import Path
    from tempfile import gettempdir
    
    from fastmcp import Client
    from fastmcp.utilities.logging import configure_logging
    
    
    async def main(code_file: str, func_name: str, lang: str, verbose: bool) -> None:
        fault_path = shutil.which("fault")   # (1)!
        if not fault_path:
            print("fault: command not found")
            return
    
        args = []
        if verbose:
            configure_logging("DEBUG")
            args = ["--log-file", str(Path(gettempdir()) / "fault.log"), "--log-level", "debug"]
    
        args.append("agent")
        args.append("tool")
    
        config = {
            "mcpServers": {
                "local": {
                    "command": fault_path,
                    "args": args,
                    "env": {
                        "OPENAI_API_KEY": os.getenv("OPENAI_API_KEY")
                    }
                },
            }
        }
    
        async with Client(config) as client:
            p = await client.call_tool(   # (2)!
                "fault_extract_code_block", {
                    "file": code_file,
                    "func": func_name
                })
    
            snippet = json.loads(p[0].text)["full"]   # (3)!
    
            p = await client.call_tool(
                "fault_score_reliability", {
                    "snippet": snippet,
                    "lang": lang
                })
    
            print(p)
    
    
    if __name__ == "__main__":
        import argparse
        parser = argparse.ArgumentParser()
        parser.add_argument("--verbose",action='store_true')
        parser.add_argument("code_file")
        parser.add_argument("func_name")
        parser.add_argument("lang")
        args = parser.parse_args()
    
        asyncio.run(main(args.code_file, args.func_name, args.lang, args.verbose))
    
    1. Locate the fault binary
    2. Extract the function block
    3. Take the snippet from the tool's response

Tip

You may see the logs from the fault call by setting --verbose:

tail -f /tmp/fault.log

Tool: Function performance improvement suggestions

Once you have scored the performance of a function, fault can suggest concrete changes to move it toward a target score.

  • Suggest better function performance

    Tool "fault_suggest_better_function_performance"

    Payload

    {
        "snippet": "",
        "lang": "",
        "score": 0.3,
        "target_score": 0.8
    }
    

    The snippet argument is the full code of a function or any code snippet. The lang argument hints at the language of the snippet. The score is the current score of the snippet; it can be obtained with the performance scoring tool above or set arbitrarily. The target_score is where you want the code to be.

    Returns

    A unified diff markdown code-block containing potential changes.

    Requirements

    • A qdrant URL
    • The LLM of your choice; in this example we use OpenAI, so you need to set the OPENAI_API_KEY environment variable

    Here is a full working example of calling this tool:

    import asyncio
    import json
    import os
    import shutil
    from pathlib import Path
    from tempfile import gettempdir
    
    from fastmcp import Client
    from fastmcp.utilities.logging import configure_logging
    
    
    async def main(code_file: str, func_name: str, lang: str, target_score: float, verbose: bool) -> None:
        fault_path = shutil.which("fault")
        if not fault_path:
            print("fault: command not found")
            return
    
        args = []
        if verbose:
            configure_logging("DEBUG")
            args = ["--log-file", str(Path(gettempdir()) / "fault.log"), "--log-level", "debug"]
    
        args.append("agent")
        args.append("tool")
    
        config = {
            "mcpServers": {
                "local": {
                    "command": fault_path,
                    "args": args,
                    "env": {
                        "OPENAI_API_KEY": os.getenv("OPENAI_API_KEY")
                    }
                },
            }
        }
    
        async with Client(config) as client:
            p = await client.call_tool(
                "fault_extract_code_block", {
                    "file": code_file,
                    "func": func_name
                })
    
            snippet = json.loads(p[0].text)["full"]  # (1)!
    
            p = await client.call_tool(
                "fault_score_performance", {
                    "snippet": snippet,
                    "lang": lang
                })
    
            score = json.loads(p[0].text)["score"]  # (2)!
    
            p = await client.call_tool(
                "fault_suggest_better_function_performance", {
                    "snippet": snippet,
                    "lang": lang,
                    "score": score,
                    "target_score": target_score
                })
    
            print(p)
    
    
    if __name__ == "__main__":
        import argparse
        parser = argparse.ArgumentParser()
        parser.add_argument("--verbose",action='store_true')
        parser.add_argument("code_file")
        parser.add_argument("func_name")
        parser.add_argument("lang")
        parser.add_argument("target_score", type=float)
        args = parser.parse_args()
    
        asyncio.run(main(args.code_file, args.func_name, args.lang, args.target_score, args.verbose))
    
    1. Retrieve the snippet from the agent's response
    2. Retrieve the score from the agent's response

Tip

You may see the logs from the fault call by setting --verbose:

tail -f /tmp/fault.log

Tool: Function reliability improvement suggestions

Once you have scored the reliability of a function, fault can suggest concrete changes to move it toward a target score.

  • Suggest better function reliability

    Tool "fault_suggest_better_function_reliability"

    Payload

    {
        "snippet": "",
        "lang": "",
        "score": 0.3,
        "target_score": 0.8
    }
    

    The snippet argument is the full code of a function or any code snippet. The lang argument hints at the language of the snippet. The score is the current score of the snippet; it can be obtained with the reliability scoring tool above or set arbitrarily. The target_score is where you want the code to be.

    Returns

    A unified diff markdown code-block containing potential changes.

    Requirements

    • A qdrant URL
    • The LLM of your choice; in this example we use OpenAI, so you need to set the OPENAI_API_KEY environment variable

    Here is a full working example of calling this tool:

    import asyncio
    import json
    import os
    import shutil
    from pathlib import Path
    from tempfile import gettempdir
    
    from fastmcp import Client
    from fastmcp.utilities.logging import configure_logging
    
    
    async def main(code_file: str, func_name: str, lang: str, target_score: float, verbose: bool) -> None:
        fault_path = shutil.which("fault")
        if not fault_path:
            print("fault: command not found")
            return
    
        args = []
        if verbose:
            configure_logging("DEBUG")
            args = ["--log-file", str(Path(gettempdir()) / "fault.log"), "--log-level", "debug"]
    
        args.append("agent")
        args.append("tool")
    
        config = {
            "mcpServers": {
                "local": {
                    "command": fault_path,
                    "args": args,
                    "env": {
                        "OPENAI_API_KEY": os.getenv("OPENAI_API_KEY")
                    }
                },
            }
        }
    
        async with Client(config) as client:
            p = await client.call_tool(
                "fault_extract_code_block", {
                    "file": code_file,
                    "func": func_name
                })
    
            snippet = json.loads(p[0].text)["full"]
    
            p = await client.call_tool(
                "fault_score_performance", {
                    "snippet": snippet,
                    "lang": lang
                })
    
            score = json.loads(p[0].text)["score"]
    
            p = await client.call_tool(
                "fault_suggest_better_function_reliability", {
                    "snippet": snippet,
                    "lang": lang,
                    "score": score,
                    "target_score": target_score
                })
    
            print(p)
    
    
    if __name__ == "__main__":
        import argparse
        parser = argparse.ArgumentParser()
        parser.add_argument("--verbose",action='store_true')
        parser.add_argument("code_file")
        parser.add_argument("func_name")
        parser.add_argument("lang")
        parser.add_argument("target_score", type=float)
        args = parser.parse_args()
    
        asyncio.run(main(args.code_file, args.func_name, args.lang, args.target_score, args.verbose))
    
    1. Retrieve the snippet from the agent's response
    2. Retrieve the score from the agent's response

Tip

You may see the logs from the fault call by setting --verbose:

tail -f /tmp/fault.log

Tool: Suggest SLOs

Service Level Objectives (SLOs) are powerful tools for managing user satisfaction. fault is able to suggest SLO definitions for a function.

  • Suggest service level objectives

    Tool "fault_suggest_service_level_objectives_slo"

    Payload

    {
        "snippet": "",
        "lang": "",
    }
    

    The snippet argument is the full code of a function or any code snippet. The lang argument hints at the language of the snippet.

    Returns

    A JSON array of SLO objects. Each object is made of the following properties:

    • type: the kind of SLO
    • title: a human readable title for the SLO
    • objective: the value in the [0, 100[ range
    • explanation: a short description of the reasoning behind the objective
    • threshold: a value appropriate for the type of objective
    • unit: the unit for the threshold
    • window: the window for the objective
    • sli: an object made of two keys you can use to configure your platform

    Requirements

    • A qdrant URL
    • The LLM of your choice; in this example we use OpenAI, so you need to set the OPENAI_API_KEY environment variable

    Output Example

    Here is a sample output, followed by a full working example of calling this tool:

    [
        {
            "type": "latency",
            "title": "95th percentile latency",
            "objective": 95.0,
            "explanation": "95th percentile request latency under threshold ensures responsive service",
            "threshold": 0.3,
            "unit": "s",
            "window": "300s",
            "sli": {
                "prometheus": "histogram_quantile(0.95, sum(rate(http_request_duration_seconds_bucket{handler=\"index\"}[5m])) by (le))",
                "gcp/cloudrun": {
                    "displayName": "95th percentile latency - 5min",
                    "goal": 0.95,
                    "calendarPeriod": "NONE",
                    "serviceLevelIndicator": {
                    "windowsBased": {
                        "windowPeriod": "300s",
                        "goodTotalRatioThreshold": {
                        "basicSliPerformance": {
                            "latency": {
                            "threshold": "0.3s"
                            }
                        },
                        "threshold": 0.95
                        }
                    }
                    }
                }
            }
        },
        {
            "type": "availability",
            "title": "Successful request ratio",
            "objective": 99.9,
            "explanation": "Percentage of successful (2xx) responses to ensure uptime",
            "threshold": 99.9,
            "unit": "%",
            "window": "300s",
            "sli": {
                "prometheus": "sum(rate(http_requests_total{handler=\"index\",status=~\"2..\"}[5m]))/sum(rate(http_requests_total{handler=\"index\"}[5m]))*100",
                "gcp/cloudrun": {
                    "displayName": "99.9% availability - 5min",
                    "goal": 0.999,
                    "calendarPeriod": "NONE",
                    "serviceLevelIndicator": {
                    "windowsBased": {
                        "windowPeriod": "300s",
                        "goodTotalRatioThreshold": {
                        "threshold": 0.999
                        }
                    }
                    }
                }
            }
        },
        {
            "type": "error",
            "title": "Error rate",
            "objective": 99.0,
            "explanation": "Ensure error responses remain below 1% to detect upstream issues",
            "threshold": 1.0,
            "unit": "%",
            "window": "300s",
            "sli": {
                "prometheus": "sum(rate(http_requests_total{handler=\"index\",status!~\"2..\"}[5m]))/sum(rate(http_requests_total{handler=\"index\"}[5m]))*100",
                "gcp/cloudrun": {
                    "displayName": "99% error rate - 5min",
                    "goal": 0.99,
                    "calendarPeriod": "NONE",
                    "serviceLevelIndicator": {
                    "windowsBased": {
                        "windowPeriod": "300s",
                        "goodTotalRatioThreshold": {
                        "threshold": 0.99
                        }
                    }
                    }
                }
            }
        }
    ]
    
    import asyncio
    import json
    import os
    import shutil
    from pathlib import Path
    from tempfile import gettempdir
    
    from fastmcp import Client
    from fastmcp.utilities.logging import configure_logging
    
    
    async def main(code_file: str, func_name: str, lang: str, verbose: bool) -> None:
        fault_path = shutil.which("fault")
        if not fault_path:
            print("fault: command not found")
            return
    
        args = []
        if verbose:
            configure_logging("DEBUG")
            args = ["--log-file", str(Path(gettempdir()) / "fault.log"), "--log-level", "debug"]
    
        args.append("agent")
        args.append("tool")
    
        config = {
            "mcpServers": {
                "local": {
                    "command": fault_path,
                    "args": args,
                    "env": {
                        "OPENAI_API_KEY": os.getenv("OPENAI_API_KEY")
                    }
                },
            }
        }
    
        async with Client(config) as client:
            p = await client.call_tool(
                "fault_extract_code_block", {
                    "file": code_file,
                    "func": func_name
                })
    
            snippet = json.loads(p[0].text)["full"]  # (1)!
    
            p = await client.call_tool(
                "fault_suggest_service_level_objectives_slo", {
                    "snippet": snippet,
                    "lang": lang
                })
    
            print(p)
    
    
    if __name__ == "__main__":
        import argparse
        parser = argparse.ArgumentParser()
        parser.add_argument("--verbose",action='store_true')
        parser.add_argument("code_file")
        parser.add_argument("func_name")
        parser.add_argument("lang")
        args = parser.parse_args()
    
        asyncio.run(main(args.code_file, args.func_name, args.lang, args.verbose))
    
    1. Retrieve the snippet from the agent's response
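
    As a follow-up, you could parse the returned array and pull out, for instance, the Prometheus expressions to seed your monitoring configuration. A minimal sketch, assuming p is the result returned by the last call above:

    import json

    # `p` is the result of calling fault_suggest_service_level_objectives_slo above
    slos = json.loads(p[0].text)
    for slo in slos:
        print(f"{slo['title']}: {slo['objective']}% over {slo['window']}")
        print("  prometheus:", slo["sli"]["prometheus"])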

Tip

You may see the logs from the fault call by setting --verbose:

tail -f /tmp/fault.log

Tool: Full file code changes recommendations

fault can generate a unified diff of changes to apply to a full file in order to improve its performance and reliability.

  • Generate reliability and performance changes for a file

    Tool "fault_make_reliability_and_perf_changes"

    Payload

    {
        "file": "",
    }
    

    The file argument is the absolute path to a file to seek changes for.

    Returns

    A JSON object with the following properties:

    • score: a number between 0.0 (very unreliable) and 1.0 (very reliable) for the file.
    • explanation: a short summary of the main threats found and the suggested changes.
    • old: always the full content of the original file as-is.
    • new: the new file content.
    • dependencies: an array of dependencies that may be needed.
    • diff: the unified diff between the two file versions.

    Requirements

    • A qdrant URL
    • The LLM of your choice; in this example we use OpenAI, so you need to set the OPENAI_API_KEY environment variable

    Output Example

    The output returns a score of 0.2 for the existing code and the following explanation for the changes:

    The original code used a blocking HTTP call without timeouts, retries, error handling, or client reuse. It could hang indefinitely, overwhelm resources, and surface unhandled exceptions to clients. The new version uses an async shared HTTPX client with connection limits and timeouts, adds retry logic with exponential backoff (via tenacity), maps errors to proper HTTP responses, and ensures the client is closed on shutdown.

    Next is a snippet of the generated diff:

    --- app.py
    +++ app.py
    @@ -1,15 +1,67 @@
    import os
    +import logging
    +from functools import lru_cache
    +import httpx
    +from fastapi import FastAPI, HTTPException
    +from fastapi.responses import JSONResponse
    +from tenacity import retry, stop_after_attempt, wait_exponential, retry_if_exception_type
    
    -import httpx
    -from fastapi import FastAPI
    +# Initialize logger
    +logger = logging.getLogger(__name__)
    +logging.basicConfig(level=logging.INFO)
    
    UPSTREAM_URL = os.getenv("UPSTREAM_URL", "https://jsonplaceholder.typicode.com")
    
    app = FastAPI()
    
    +@lru_cache()
    +def get_http_client() -> httpx.AsyncClient:
    +    """
    +    Create a shared Async HTTP client with connection limits and timeouts.
    +    """
    +    limits = httpx.Limits(max_connections=100, max_keepalive_connections=20)
    +    timeout = httpx.Timeout(5.0, connect=2.0)
    +    return httpx.AsyncClient(limits=limits, timeout=timeout)
    +
    +@retry(
    +    retry=retry_if_exception_type(httpx.HTTPError),
    +    stop=stop_after_attempt(3),
    +    wait=wait_exponential(multiplier=1, min=1, max=10)
    +)
    +async def fetch_todo() -> dict:
    +    """
    +    Fetch the todo item with retry logic for transient errors.
    +    Raises HTTPStatusError or RequestError on failure.
    +    """
    +    client = get_http_client()
    +    url = f"{UPSTREAM_URL}/todos/1"
    +    headers = {"Host": "jsonplaceholder.typicode.com"}
    +    response = await client.get(url, headers=headers)
    +    response.raise_for_status()
    +    return response.json()
    +
    +@app.on_event("shutdown")
    +async def shutdown_event():
    +    """
    +    Close the HTTP client on application shutdown.
    +    """
    +    client = get_http_client()
    +    await client.aclose()
    
    @app.get("/")
    -def index():
    -    return httpx.get(f"{UPSTREAM_URL}/todos/1", headers={
    -        "Host": "jsonplaceholder.typicode.com"
    -    }).json()
    +async def index():
    +    """
    +    Endpoint to retrieve a todo item. Implements retries, timeouts, and error handling.
    +    """
    +    try:
    +        data = await fetch_todo()
    +        return JSONResponse(content=data)
    +    except httpx.HTTPStatusError as exc:
    +        logger.error("Upstream returned error %s: %s", exc.response.status_code, exc)
    +        raise HTTPException(status_code=exc.response.status_code, detail=str(exc))
    +    except httpx.RequestError as exc:
    +        logger.error("Network error while fetching todo: %s", exc)
    +        raise HTTPException(status_code=502, detail="Bad Gateway")
    +    except Exception as exc:
    +        logger.exception("Unexpected error: %s", exc)
    +        raise HTTPException(status_code=500, detail="Internal Server Error")
    
    e2e.py
    import asyncio
    import json
    import os
    import shutil
    from pathlib import Path
    from tempfile import gettempdir
    
    from fastmcp import Client
    from fastmcp.utilities.logging import configure_logging
    from rich.console import Console
    
    console = Console()
    
    
    async def main(llm: str, verbose: bool) -> None:
        fault_path = shutil.which("fault")
        if not fault_path:
            print("fault: command not found")
            return
    
        env = {}
    
        args = []
        if verbose:
            configure_logging("DEBUG")
            args = ["--log-file", str(Path(gettempdir()) / "fault.log"), "--log-level", "debug"]
    
        args.append("agent")
    
        if llm == "openai":
            env["OPENAI_API_KEY"] = os.getenv("OPENAI_API_KEY")
        elif llm == "openrouter":
            args.append("--llm-client")
            args.append("open-router")
            args.append("--llm-prompt-reasoning-model")
            args.append("google/gemma-3-27b-it")
            args.append("--llm-embed-model-dim")
            args.append("384")
            env["OPENROUTER_API_KEY"] = os.getenv("OPENROUTER_API_KEY")
        elif llm == "ollama":
            args.append("--llm-client")
            args.append("ollama")
            args.append("--llm-prompt-reasoning-model")
            args.append("gemma3:4b")
            args.append("--llm-embed-model")
            args.append("mxbai-embed-large")
            args.append("--llm-embed-model-dim")
            args.append("1024")
    
        args.append("tool")
    
        config = {
            "mcpServers": {
                "local": {
                    "command": fault_path,
                    "args": args,
                    "env": env
                },
            }
        }
    
        async with Client(config) as client:
            await client.call_tool(
                "fault_index_source_code", {
                    "source_dir": str(Path.cwd() / "app"),
                    "lang": "python"
                })
    
            p = await client.call_tool(
                "fault_make_reliability_and_perf_changes", {
                    "file": str(Path.cwd() / "app" / "app.py"),
                })
    
            r = json.loads(p[0].text)
    
            console.print(f"[purple]Score[/] {r['score']}")
            console.print(f"[purple]Explanation[/] {r['explanation']}")
            console.print(f"[purple]Dependencies[/] {r['dependencies']}")
            console.print(f"[purple]Proposed changes[/]\n{r['diff']}")
    
    
    if __name__ == "__main__":
        import argparse
        parser = argparse.ArgumentParser()
        parser.add_argument("--llm", choices=["openai", "openrouter", "ollama"])
        parser.add_argument("--verbose",action='store_true')
        args = parser.parse_args()
    
        asyncio.run(main(args.llm, args.verbose))
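
    Once you have reviewed the proposed diff, you could apply the suggested version yourself. A minimal sketch, assuming r is the parsed response from the script above and that you want to keep a backup of the original file:

    from pathlib import Path

    # `r` is the parsed JSON response from fault_make_reliability_and_perf_changes
    target = Path.cwd() / "app" / "app.py"
    backup = target.with_name(target.name + ".bak")  # app.py.bak
    backup.write_text(r["old"])                      # keep the original content
    target.write_text(r["new"])                      # write the suggested version
    print("Dependencies you may need to install:", r["dependencies"])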
    

Tip

You may see the logs from the fault call by setting --verbose:

tail -f /tmp/fault.log

Tool: Evaluate Blackhole Impact

  • Evaluate endpoint reliability during a blackhole event

    Tool "fault_run_blackhole_impact_scenario"

    Payload

    {
        "url": "",
        "method": "",
        "body": "",
        "direction": "",
        "side": "",
        "duration": "",
        "num_clients": "",
        "rps": "",
        "timeout": 10,
        "proxies": [],
    }
    

    The url argument is the absolute address of the endpoint to send traffic to. The direction and side arguments define where to apply the blackhole; we suggest "egress" and "server". The duration indicates how long to run this scenario for, and num_clients/rps declare traffic parameters such as how many clients and requests per second. The proxies argument is an array of TCP proxies to use if you want to apply the blackhole to a remote call made by your endpoint rather than directly to your endpoint. Finally, the timeout argument, in seconds, is used by the client when communicating with the application.

    Returns

    A markdown report of the scenario.

    Requirements

    • A qdrant URL
    • The LLM of your choice; in this example we use OpenAI, so you need to set the OPENAI_API_KEY environment variable
    inject-blackhole.py
    import asyncio
    import os
    import shutil
    from pathlib import Path
    from tempfile import gettempdir
    
    from fastmcp import Client
    from fastmcp.utilities.logging import configure_logging
    
    
    async def main(url: str, upstream: str, verbose: bool) -> None:
        fault_path = shutil.which("fault")
        if not fault_path:
            print("fault: command not found")
            return
    
        args = []
        if verbose:
            configure_logging("DEBUG")
            args = ["--log-file", str(Path(gettempdir()) / "fault.log"), "--log-level", "debug"]
    
        args.append("agent")
        args.append("tool")
    
        config = {
            "mcpServers": {
                "local": {
                    "command": fault_path,
                    "args": args,
                    "env": {
                        "OPENAI_API_KEY": os.getenv("OPENAI_API_KEY")
                    }
                },
            }
        }
    
        async with Client(config) as client:
            p = await client.call_tool(
                "fault_run_blackhole_impact_scenario", {
                    "url": url,
                    "method": "GET",
                    "body": "",
                    "duration": "10s",
                    "direction": "egress",
                    "side": "server",
                    "num_clients": 1,
                    "rps": 3,
                    "timeout": 5,
                    "proxies": [
                        f"34000={upstream}:443" # (1)!
                    ]
                })
    
            print(p[0].text)
    
    
    if __name__ == "__main__":
        import argparse
        parser = argparse.ArgumentParser()
        parser.add_argument("--verbose",action='store_true')
        parser.add_argument("url")
        parser.add_argument("upstream")
        args = parser.parse_args()
    
        asyncio.run(main(args.url, args.upstream, args.verbose))
    
    1. Mapping the proxy address 0.0.0.0:34000 to forward traffic to the real upstream server on port 443.

In the case of our application above this would be called as follows:

python inject-blackhole.py http://localhost:9090 https://jsonplaceholder.typicode.com

The blackhole event will take place on the response coming back from the upstream server (https://jsonplaceholder.typicode.com).

Tip

You may see the logs from the fault call by setting --verbose:

tail -f /tmp/fault.log

Tool: Evaluate Latency Impact

  • Evaluate endpoint reliability during a latency event

    Tool "fault_run_latency_impact_scenario"

    Payload

    {
        "url": "",
        "method": "",
        "body": "",
        "latency": 0,
        "deviation": 0,
        "per_read_write_op": false,
        "direction": "",
        "side": "",
        "duration": "",
        "num_clients": "",
        "rps": "",
        "timeout": 10,
        "proxies": [],
    }
    

    The url argument is the absolute address of the endpoint to send traffic to. The direction and side arguments define where to apply the latency. The latency argument is the mean delay to inject, in milliseconds, deviation is its standard deviation, and per_read_write_op applies the delay on every read/write operation instead of once per request. The duration indicates how long to run this scenario for, and num_clients/rps declare traffic parameters such as how many clients and requests per second. The proxies argument is an array of TCP proxies to use if you want to apply the latency to a remote call made by your endpoint rather than directly to your endpoint. Finally, the timeout argument, in seconds, is used by the client when communicating with the application.

    Returns

    A markdown report of the scenario.

    Requirements

    • A qdrant URL
    • The LLM of your choice; in this example we use OpenAI, so you need to set the OPENAI_API_KEY environment variable
    inject-latency.py
    import asyncio
    import os
    import shutil
    from pathlib import Path
    from tempfile import gettempdir
    
    from fastmcp import Client
    from fastmcp.utilities.logging import configure_logging
    
    
    async def main(url: str, upstream: str, latency: float, verbose: bool) -> None:
        fault_path = shutil.which("fault")
        if not fault_path:
            print("fault: command not found")
            return
    
        args = []
        if verbose:
            configure_logging("DEBUG")
            args = ["--log-file", str(Path(gettempdir()) / "fault.log"), "--log-level", "debug"]
    
        args.append("agent")
        args.append("tool")
    
        config = {
            "mcpServers": {
                "local": {
                    "command": fault_path,
                    "args": args,
                    "env": {
                        "OPENAI_API_KEY": os.getenv("OPENAI_API_KEY")
                    }
                },
            }
        }
    
        async with Client(config) as client:
            p = await client.call_tool(
                "fault_run_latency_impact_scenario", {
                    "url": url,
                    "method": "GET",
                    "body": "",
                    "duration": "10s",
                    "latency": latency,
                    "per_read_write_op": False,
                    "deviation": 0,
                    "direction": "ingress",
                    "side": "server",
                    "num_clients": 1,
                    "rps": 3,
                    "timeout": 10,
                    "proxies": [
                        f"34000={upstream}:443" # (1)!
                    ]
                })
    
            print(p[0].text)
    
    
    if __name__ == "__main__":
        import argparse
        parser = argparse.ArgumentParser()
        parser.add_argument("--verbose",action='store_true')
        parser.add_argument("url")
        parser.add_argument("upstream")
        parser.add_argument("latency", type=float)
        args = parser.parse_args()
    
        asyncio.run(main(args.url, args.upstream, args.latency, args.verbose))
    
    1. Mapping the proxy address 0.0.0.0:34000 to forward traffic to the real upstream server on port 443.

In the case of our application above this would be called as follows:

python inject-latency.py http://localhost:9090 https://jsonplaceholder.typicode.com 300

Tip

You may see the logs from the fault call by setting --verbose:

tail -f /tmp/fault.log

Tool: Evaluate Packet Loss Impact

  • Evaluate endpoint reliability during a packet loss event

    Tool "fault_run_packet_loss_impact_scenario"

    Payload

    {
        "url": "",
        "method": "",
        "body": "",
        "direction": "",
        "side": "",
        "duration": "",
        "num_clients": "",
        "rps": "",
        "timeout": 10,
        "proxies": [],
    }
    

    The url argument is the absolute address of the endpoint to send traffic to. The direction and side arguments define where to apply the packet loss. The duration indicates how long to run this scenario for, and num_clients/rps declare traffic parameters such as how many clients and requests per second. The proxies argument is an array of TCP proxies to use if you want to apply the packet loss to a remote call made by your endpoint rather than directly to your endpoint. Finally, the timeout argument, in seconds, is used by the client when communicating with the application.

    Returns

    A markdown report of the scenario.

    Requirements

    • A qdrant URL
    • The LLM of your choice; in this example we use OpenAI, so you need to set the OPENAI_API_KEY environment variable
    inject-packet-loss.py
    import asyncio
    import os
    import shutil
    from pathlib import Path
    from tempfile import gettempdir
    
    from fastmcp import Client
    from fastmcp.utilities.logging import configure_logging
    
    
    async def main(url: str, upstream: str, verbose: bool) -> None:
        fault_path = shutil.which("fault")
        if not fault_path:
            print("fault: command not found")
            return
    
        args = []
        if verbose:
            configure_logging("DEBUG")
            args = ["--log-file", str(Path(gettempdir()) / "fault.log"), "--log-level", "debug"]
    
        args.append("agent")
        args.append("tool")
    
        config = {
            "mcpServers": {
                "local": {
                    "command": fault_path,
                    "args": args,
                    "env": {
                        "OPENAI_API_KEY": os.getenv("OPENAI_API_KEY")
                    }
                },
            }
        }
    
        async with Client(config) as client:
            p = await client.call_tool(
                "fault_run_packet_loss_impact_scenario", {
                    "url": url,
                    "method": "GET",
                    "body": "",
                    "duration": "10s",
                    "direction": "egress",
                    "side": "server",
                    "num_clients": 1,
                    "timeout": 10,
                    "rps": 3,
                    "proxies": [
                        f"34000={upstream}:443" # (1)!
                    ]
                })
    
            print(p[0].text)
    
    
    if __name__ == "__main__":
        import argparse
        parser = argparse.ArgumentParser()
        parser.add_argument("--verbose",action='store_true')
        parser.add_argument("url")
        parser.add_argument("upstream")
        args = parser.parse_args()
    
        asyncio.run(main(args.url, args.upstream, args.verbose))
    
    1. Mapping the proxy address 0.0.0.0:34000 to forward traffic to the real upstream server on port 443.

In the case of our application above this would be called as follows:

python inject-packet-loss.py http://localhost:9090 https://jsonplaceholder.typicode.com

Tip

You may see the logs from the fault call by setting --verbose:

tail -f /tmp/fault.log

Tool: Evaluate Bandwidth Impact

  • Evaluate endpoint reliability during a bandwidth event

    Tool "fault_run_bandwidth_impact_scenario"

    Payload

    {
        "url": "",
        "method": "",
        "body": "",
        "direction": "",
        "rate": 0,
        "unit": "bps",
        "side": "",
        "duration": "",
        "num_clients": "",
        "rps": "",
        "timeout": 10,
        "proxies": [],
    }
    

    The url argument is the absolute address of the endpoint to send traffic to. The direction and side arguments define where to apply the bandwidth constraint. The rate and unit arguments define the bandwidth limit to apply (the example below uses bps). The duration indicates how long to run this scenario for, and num_clients/rps declare traffic parameters such as how many clients and requests per second. The proxies argument is an array of TCP proxies to use if you want to apply the bandwidth constraint to a remote call made by your endpoint rather than directly to your endpoint. Finally, the timeout argument, in seconds, is used by the client when communicating with the application.

    Returns

    A markdown report of the scenario.

    Requirements

    • A qdrant URL
    • The LLM of your choice; in this example we use OpenAI, so you need to set the OPENAI_API_KEY environment variable
    inject-bandwidth.py
    import asyncio
    import os
    import shutil
    from pathlib import Path
    from tempfile import gettempdir
    
    from fastmcp import Client
    from fastmcp.utilities.logging import configure_logging
    
    
    async def main(url: str, upstream: str, bandwidth: int, verbose: bool) -> None:
        fault_path = shutil.which("fault")
        if not fault_path:
            print("fault: command not found")
            return
    
        args = []
        if verbose:
            configure_logging("DEBUG")
            args = ["--log-file", str(Path(gettempdir()) / "fault.log"), "--log-level", "debug"]
    
        args.append("agent")
        args.append("tool")
    
        config = {
            "mcpServers": {
                "local": {
                    "command": fault_path,
                    "args": args,
                    "env": {
                        "OPENAI_API_KEY": os.getenv("OPENAI_API_KEY")
                    }
                },
            }
        }
    
        async with Client(config) as client:
            p = await client.call_tool(
                "fault_run_bandwidth_impact_scenario", {
                    "url": url,
                    "method": "GET",
                    "body": "",
                    "duration": "10s",
                    "rate": bandwidth,
                    "unit": "bps",
                    "direction": "egress",
                    "side": "server",
                    "num_clients": 5,
                    "rps": 2,
                    "timeout": 10,
                    "proxies": [
                        f"34000={upstream}:443" # (1)!
                    ]
                })
    
            print(p[0].text)
    
    
    if __name__ == "__main__":
        import argparse
        parser = argparse.ArgumentParser()
        parser.add_argument("--verbose",action='store_true')
        parser.add_argument("url")
        parser.add_argument("upstream")
        parser.add_argument("bandwidth", type=int)
        args = parser.parse_args()
    
        asyncio.run(main(args.url, args.upstream, args.bandwidth, args.verbose))
    
    1. Mapping the proxy address 0.0.0.0:34000 to forward traffic to the real upstream server on port 443.

In the case of our application above this would be called as follows (reduced to 1024 bytes per second):

python inject-bandwidth.py http://localhost:9090 https://jsonplaceholder.typicode.com 1024

Tip

You may see the logs from the fault call by setting --verbose:

tail -f /tmp/fault.log

Tool: Evaluate Jitter Impact

  • Evaluate endpoint reliability during a jitter event

    Tool "fault_run_jitter_impact_scenario"

    Payload

    {
        "url": "",
        "method": "",
        "body": "",
        "direction": "",
        "rate": 0,
        "unit": "bps",
        "side": "",
        "duration": "",
        "num_clients": "",
        "rps": "",
        "timeout": 10,
        "proxies": [],
    }
    

    The url argument is the absolute address of the endpoint to send traffic to. The direction and side arguments define where to apply the jitter. The amplitude argument is the jitter amplitude in milliseconds and frequency its frequency in Hz. The duration indicates how long to run this scenario for, and num_clients/rps declare traffic parameters such as how many clients and requests per second. The proxies argument is an array of TCP proxies to use if you want to apply the jitter to a remote call made by your endpoint rather than directly to your endpoint. Finally, the timeout argument, in seconds, is used by the client when communicating with the application.

    Returns

    A markdown report of the scenario.

    Requirements

    • A qdrant URL
    • The LLM of your choice; in this example we use OpenAI, so you need to set the OPENAI_API_KEY environment variable

    Output Example

    Here is a report sample:

    Scenarios Report

    Start: 2025-06-27 14:14:36.689011165 UTC

    End: 2025-06-27 14:14:47.020905358 UTC

    Scenario: Evaluating runtime performance of http://localhost:9090 (items: 1)

    🎯 GET http://localhost:9090 | Passed

    Call:

    • Method: GET
    • Timeout: 10000ms
    • Headers: -
    • Body?: No

    Strategy: load for 10s with 1 clients @ 3 RPS

    Faults Applied:

    Type   | Timeline           | Description
    jitter | 0% xxxxxxxxxx 100% | Jitter: ➡️🖧 Amplitude: 150.00ms, Frequence 5.00Hz

    Run Overview:

    Num. Requests | Num. Errors | Min. Response Time | Max Response Time | Mean Latency (ms) | Expectation Failures | Total Time
    31            | 0 (0.0%)    | 83.65              | 272.49            | 199.92            | 0                    | 10 seconds and 329 ms

    Latency Percentile | Latency (ms) | Num. Requests (% of total)
    p25                | 133.61       | 8 (25.8%)
    p50                | 199.92       | 16 (51.6%)
    p75                | 235.69       | 24 (77.4%)
    p95                | 269.28       | 31 (100.0%)
    p99                | 272.49       | 31 (100.0%)

    SLO         | Pass? | Objective   | Margin          | Num. Requests Over Threshold (% of total)
    99% @ 350ms |       | 99% < 350ms | Below by 77.5ms | 0 (0.0%)
    95% @ 200ms |       | 95% < 200ms | Above by 69.3ms | 15 (48.4%)

    inject-jitter.py
    import asyncio
    import os
    import shutil
    from pathlib import Path
    from tempfile import gettempdir
    
    from fastmcp import Client
    from fastmcp.utilities.logging import configure_logging
    
    
    async def main(url: str, upstream: str, amplitude: float, frequency: float, verbose: bool) -> None:
        fault_path = shutil.which("fault")
        if not fault_path:
            print("fault: command not found")
            return
    
        args = []
        if verbose:
            configure_logging("DEBUG")
            args = ["--log-file", str(Path(gettempdir()) / "fault.log"), "--log-level", "debug"]
    
        args.append("agent")
        args.append("tool")
    
        config = {
            "mcpServers": {
                "local": {
                    "command": fault_path,
                    "args": args,
                    "env": {
                        "OPENAI_API_KEY": os.getenv("OPENAI_API_KEY")
                    }
                },
            }
        }
    
        async with Client(config) as client:
            p = await client.call_tool(
                "fault_run_jitter_impact_scenario", {
                    "url": url,
                    "method": "GET",
                    "body": "",
                    "duration": "10s",
                    "amplitude": amplitude,
                    "frequency": frequency,
                    "direction": "ingress",
                    "side": "server",
                    "num_clients": 1,
                    "rps": 3,
                    "timeout": 10,
                    "proxies": [
                        f"34000={upstream}:443" # (1)!
                    ]
                })
    
            print(p[0].text)
    
    
    if __name__ == "__main__":
        import argparse
        parser = argparse.ArgumentParser()
        parser.add_argument("--verbose",action='store_true')
        parser.add_argument("url")
        parser.add_argument("upstream")
        parser.add_argument("amplitude", type=float)
        parser.add_argument("frequency", type=float)
        args = parser.parse_args()
    
        asyncio.run(main(args.url, args.upstream, args.amplitude, args.frequency, args.verbose))
    
    1. Mapping the proxy address 0.0.0.0:34000 to forward traffic to the real upstream server on port 443.

In the case of our application above this would be called as follows:

python inject-jitter.py http://localhost:9090 https://jsonplaceholder.typicode.com 50 3

Tip

You may see the logs from the fault call by setting --verbose:

tail -f /tmp/fault.log