# Explore fault MCP Agent Tools

This guide walks you through the MCP tools supported by the fault agent.
## Prerequisites

- **Install fault**

    If you haven't installed fault yet, follow the installation instructions.

- **Get an OpenAI key**

    For the purpose of this guide, we will use OpenAI models. You need to create an API key, then make sure the key is available to fault:
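    For example, assuming a POSIX shell (the key value is a placeholder):

    ```bash
    export OPENAI_API_KEY="sk-..."  # replace with your own key
    ```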
    The agent also works fine with Gemini, Ollama, and OpenRouter, so you may switch to any of them. Have a look at the examples below.
- **Install a local qdrant database**

    fault uses qdrant as its vector database. You can run a local, free qdrant instance using Docker:
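    For example, something like the following (image and port are the qdrant defaults; adjust as needed):

    ```bash
    docker run -p 6333:6333 qdrant/qdrant
    ```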
    While not every tool uses it, we suggest you start one up so you can explore all of them.
- **Install the Python FastMCP library**

    fault does not need this library to work, but we will use FastMCP to demonstrate the tools we support.
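    For instance:

    ```bash
    pip install fastmcp
    ```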
## fault with Cursor

Below is an example of using the fault AI tools in Cursor to help it make the generated code more production-ready.
## Overview

The fault AI agent is called as follows:
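```bash
fault agent tool
```

Global logging flags such as `--log-file` and `--log-level`, used in the example scripts below, can be passed before the `agent` subcommand.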
It supports the MCP specification.
## Get a demo application

For the purpose of demonstration, we suggest you run the following server with a single endpoint:
```python title="app.py"
import os

import httpx
from fastapi import FastAPI

UPSTREAM_URL = os.getenv("UPSTREAM_URL", "https://jsonplaceholder.typicode.com")

app = FastAPI()


@app.get("/")
def index():
    return httpx.get(f"{UPSTREAM_URL}/todos/1", headers={
        "Host": "jsonplaceholder.typicode.com"
    }).json()
```
Install dependencies as follows:
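A plausible set, assuming `uvicorn` as the ASGI server (any ASGI server works):

```bash
pip install fastapi httpx uvicorn
```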
Then run it as follows, making sure the remote call made from the endpoint goes through fault:
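A minimal sketch, assuming `uvicorn`: port `34000` matches the proxy mapping used by the scenario scripts later in this guide, and port `9090` is the address targeted in the report samples:

```bash
UPSTREAM_URL=http://localhost:34000 uvicorn app:app --port 9090
```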
## Tool: Source Code Indexing

To get the best feedback from the fault AI agent, it is a good idea to index your source code locally. When performing certain operations, fault searches the index for the right documents to contextualize the query it performs.
### Index your source code

**Tool**: `"fault_index_source_code"`

**Payload**: The `source_dir` argument is an absolute path to the top-level directory containing code files. The `lang` argument hints which files to process.

**Returns**: The string `"done"` when complete, or an MCP error with a hint of what went wrong otherwise.

**Requirements**:

- A qdrant URL
- The LLM of your choice; in this example we use OpenAI, so you need to set the `OPENAI_API_KEY` environment variable
Here is a full working example of calling this tool:
```python
import asyncio
import os
import shutil
from pathlib import Path
from tempfile import gettempdir

from fastmcp import Client
from fastmcp.utilities.logging import configure_logging


async def main(source_dir: str, verbose: bool) -> None:
    fault_path = shutil.which("fault")  # (1)!
    if not fault_path:
        print("fault: command not found")
        return

    args = []
    if verbose:
        configure_logging("DEBUG")
        args = ["--log-file", str(Path(gettempdir()) / "fault.log"), "--log-level", "debug"]

    args.append("agent")
    args.append("tool")

    config = {  # (2)!
        "mcpServers": {
            "local": {
                "command": fault_path,
                "args": args,
                "env": {
                    "OPENAI_API_KEY": os.getenv("OPENAI_API_KEY")
                }
            },
        }
    }

    async with Client(config) as client:  # (3)!
        p = await client.call_tool(  # (4)!
            "fault_index_source_code", {
                "source_dir": source_dir,  # (5)!
                "lang": "python"  # (6)!
            })
        print(p)


if __name__ == "__main__":
    import argparse

    parser = argparse.ArgumentParser()
    parser.add_argument("--verbose", action="store_true")
    parser.add_argument("source_dir")
    args = parser.parse_args()

    asyncio.run(main(args.source_dir, args.verbose))
```
1. Locate the `fault` binary
2. Prepare a stdio configuration to call the tool
3. Set up a client that handles agent initialization
4. Call the `fault_index_source_code` tool from fault
5. Pass the absolute directory to the source code
6. Hint of the language to index: `python`, `go`, `rust`, `javascript`...
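For instance, assuming the script above is saved as `index.py` (the path below is illustrative):

```bash
python index.py /absolute/path/to/your/project
```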
## Tool: List function names from a file

Context is everything! Being able to focus on a given function helps make the right decisions.
### List function names

**Tool**: `"fault_list_function_names"`

**Payload**: The `file` argument is an absolute path to a source code file.

**Returns**: An array with any found function names.

**Requirements**: none
Here is a full working example of calling this tool:
```python
import asyncio
import shutil
from pathlib import Path
from tempfile import gettempdir

from fastmcp import Client
from fastmcp.utilities.logging import configure_logging


async def main(code_file: str, verbose: bool) -> None:
    fault_path = shutil.which("fault")  # (1)!
    if not fault_path:
        print("fault: command not found")
        return

    args = []
    if verbose:
        configure_logging("DEBUG")
        args = ["--log-file", str(Path(gettempdir()) / "fault.log"), "--log-level", "debug"]

    args.append("agent")
    args.append("tool")

    config = {  # (2)!
        "mcpServers": {
            "local": {
                "command": fault_path,
                "args": args,
            },
        }
    }

    async with Client(config) as client:  # (3)!
        p = await client.call_tool(  # (4)!
            "fault_list_function_names", {
                "file": code_file,  # (5)!
            })
        print(p)


if __name__ == "__main__":
    import argparse

    parser = argparse.ArgumentParser()
    parser.add_argument("--verbose", action="store_true")
    parser.add_argument("file")
    args = parser.parse_args()

    asyncio.run(main(args.file, args.verbose))
```
1. Locate the `fault` binary
2. Prepare a stdio configuration to call the tool
3. Set up a client that handles agent initialization
4. Call the `fault_list_function_names` tool from fault
5. Pass the absolute path to the source code file
## Tool: Extract function block

Extract the whole function block, including its signature and, for some languages, any decorators around the function.
### Extract function block

**Tool**: `"fault_extract_code_block"`

**Payload**: The `file` argument is an absolute path to a source code file. The `func` argument is the name of the function within that module.

**Returns**: A JSON payload with two keys:

- `full`: the complete function block including its signature
- `body`: the function body without its signature

**Requirements**: none
Here is a full working example of calling this tool:
```python
import asyncio
import shutil
from pathlib import Path
from tempfile import gettempdir

from fastmcp import Client
from fastmcp.utilities.logging import configure_logging


async def main(code_file: str, func_name: str, verbose: bool) -> None:
    fault_path = shutil.which("fault")  # (1)!
    if not fault_path:
        print("fault: command not found")
        return

    args = []
    if verbose:
        configure_logging("DEBUG")
        args = ["--log-file", str(Path(gettempdir()) / "fault.log"), "--log-level", "debug"]

    args.append("agent")
    args.append("tool")

    config = {  # (2)!
        "mcpServers": {
            "local": {
                "command": fault_path,
                "args": args,
            },
        }
    }

    async with Client(config) as client:  # (3)!
        p = await client.call_tool(  # (4)!
            "fault_extract_code_block", {
                "file": code_file,  # (5)!
                "func": func_name  # (6)!
            })
        print(p)


if __name__ == "__main__":
    import argparse

    parser = argparse.ArgumentParser()
    parser.add_argument("--verbose", action="store_true")
    parser.add_argument("file")
    parser.add_argument("func_name")
    args = parser.parse_args()

    asyncio.run(main(args.file, args.func_name, args.verbose))
```
1. Locate the `fault` binary
2. Prepare a stdio configuration to call the tool
3. Set up a client that handles agent initialization
4. Call the `fault_extract_code_block` tool from fault
5. Pass the absolute path to the source code file
6. The name of the function
## Tool: Function Performance Scoring

Scoring the performance of a function will help you understand how much attention this function requires.
### Evaluate function performance

**Tool**: `"fault_score_performance"`

**Payload**: The `snippet` argument is the full code of a function, or any code snippet. The `lang` argument hints the language of the snippet.

**Returns**: A JSON payload with two keys:

- `explanation`: a short description of the reasoning for the score
- `score`: the score as a number between `0.0` and `1.0`

**Requirements**:

- A qdrant URL
- The LLM of your choice; in this example we use OpenAI, so you need to set the `OPENAI_API_KEY` environment variable
Here is a full working example of calling this tool:
```python
import asyncio
import json
import os
import shutil
from pathlib import Path
from tempfile import gettempdir

from fastmcp import Client
from fastmcp.utilities.logging import configure_logging


async def main(code_file: str, func_name: str, lang: str, verbose: bool) -> None:
    fault_path = shutil.which("fault")  # (1)!
    if not fault_path:
        print("fault: command not found")
        return

    args = []
    if verbose:
        configure_logging("DEBUG")
        args = ["--log-file", str(Path(gettempdir()) / "fault.log"), "--log-level", "debug"]

    args.append("agent")
    args.append("tool")

    config = {
        "mcpServers": {
            "local": {
                "command": fault_path,
                "args": args,
                "env": {
                    "OPENAI_API_KEY": os.getenv("OPENAI_API_KEY")
                }
            },
        }
    }

    async with Client(config) as client:
        p = await client.call_tool(  # (2)!
            "fault_extract_code_block", {
                "file": code_file,
                "func": func_name
            })

        snippet = json.loads(p[0].text)["full"]  # (3)!

        p = await client.call_tool(
            "fault_score_performance", {
                "snippet": snippet,
                "lang": lang
            })
        print(p)


if __name__ == "__main__":
    import argparse

    parser = argparse.ArgumentParser()
    parser.add_argument("--verbose", action="store_true")
    parser.add_argument("code_file")
    parser.add_argument("func_name")
    parser.add_argument("lang")
    args = parser.parse_args()

    asyncio.run(main(args.code_file, args.func_name, args.lang, args.verbose))
```
1. Locate the `fault` binary
2. Extract the function block
3. Take the snippet from the tool's response
## Tool: Function Reliability Scoring

Scoring the reliability of a function will help you understand how much attention this function requires.
### Evaluate function reliability

**Tool**: `"fault_score_reliability"`

**Payload**: The `snippet` argument is the full code of a function, or any code snippet. The `lang` argument hints the language of the snippet.

**Returns**: A JSON payload with two keys:

- `explanation`: a short description of the reasoning for the score
- `score`: the score as a number between `0.0` and `1.0`

**Requirements**:

- A qdrant URL
- The LLM of your choice; in this example we use OpenAI, so you need to set the `OPENAI_API_KEY` environment variable
Here is a full working example of calling this tool:
```python
import asyncio
import json
import os
import shutil
from pathlib import Path
from tempfile import gettempdir

from fastmcp import Client
from fastmcp.utilities.logging import configure_logging


async def main(code_file: str, func_name: str, lang: str, verbose: bool) -> None:
    fault_path = shutil.which("fault")  # (1)!
    if not fault_path:
        print("fault: command not found")
        return

    args = []
    if verbose:
        configure_logging("DEBUG")
        args = ["--log-file", str(Path(gettempdir()) / "fault.log"), "--log-level", "debug"]

    args.append("agent")
    args.append("tool")

    config = {
        "mcpServers": {
            "local": {
                "command": fault_path,
                "args": args,
                "env": {
                    "OPENAI_API_KEY": os.getenv("OPENAI_API_KEY")
                }
            },
        }
    }

    async with Client(config) as client:
        p = await client.call_tool(  # (2)!
            "fault_extract_code_block", {
                "file": code_file,
                "func": func_name
            })

        snippet = json.loads(p[0].text)["full"]  # (3)!

        p = await client.call_tool(
            "fault_score_reliability", {
                "snippet": snippet,
                "lang": lang
            })
        print(p)


if __name__ == "__main__":
    import argparse

    parser = argparse.ArgumentParser()
    parser.add_argument("--verbose", action="store_true")
    parser.add_argument("code_file")
    parser.add_argument("func_name")
    parser.add_argument("lang")
    args = parser.parse_args()

    asyncio.run(main(args.code_file, args.func_name, args.lang, args.verbose))
```
1. Locate the `fault` binary
2. Extract the function block
3. Take the snippet from the tool's response
## Tool: Function performance improvement suggestions

Once a function has been scored, fault can suggest changes to bring its performance closer to a target score.
### Suggest function performance improvements

**Tool**: `"fault_suggest_better_function_performance"`

**Payload**: The `snippet` argument is the full code of a function, or any code snippet. The `lang` argument hints the language of the snippet. The `score` argument is the current scoring of the snippet; it can be extracted using the `fault_score_performance` tool or set arbitrarily. The `target_score` argument is where you want the code to be.

**Returns**: A unified diff markdown code-block containing potential changes.

**Requirements**:

- A qdrant URL
- The LLM of your choice; in this example we use OpenAI, so you need to set the `OPENAI_API_KEY` environment variable
Here is a full working example of calling this tool:
```python
import asyncio
import json
import os
import shutil
from pathlib import Path
from tempfile import gettempdir

from fastmcp import Client
from fastmcp.utilities.logging import configure_logging


async def main(code_file: str, func_name: str, lang: str, target_score: float, verbose: bool) -> None:
    fault_path = shutil.which("fault")
    if not fault_path:
        print("fault: command not found")
        return

    args = []
    if verbose:
        configure_logging("DEBUG")
        args = ["--log-file", str(Path(gettempdir()) / "fault.log"), "--log-level", "debug"]

    args.append("agent")
    args.append("tool")

    config = {
        "mcpServers": {
            "local": {
                "command": fault_path,
                "args": args,
                "env": {
                    "OPENAI_API_KEY": os.getenv("OPENAI_API_KEY")
                }
            },
        }
    }

    async with Client(config) as client:
        p = await client.call_tool(
            "fault_extract_code_block", {
                "file": code_file,
                "func": func_name
            })

        snippet = json.loads(p[0].text)["full"]  # (1)!

        p = await client.call_tool(
            "fault_score_performance", {
                "snippet": snippet,
                "lang": lang
            })

        score = json.loads(p[0].text)["score"]  # (2)!

        p = await client.call_tool(
            "fault_suggest_better_function_performance", {
                "snippet": snippet,
                "lang": lang,
                "score": score,
                "target_score": target_score
            })
        print(p)


if __name__ == "__main__":
    import argparse

    parser = argparse.ArgumentParser()
    parser.add_argument("--verbose", action="store_true")
    parser.add_argument("code_file")
    parser.add_argument("func_name")
    parser.add_argument("lang")
    parser.add_argument("target_score", type=float)
    args = parser.parse_args()

    asyncio.run(main(args.code_file, args.func_name, args.lang, args.target_score, args.verbose))
```
1. Retrieve the snippet from the agent's response
2. Retrieve the score from the agent's response
## Tool: Function reliability improvement suggestions

Once a function has been scored, fault can suggest changes to bring its reliability closer to a target score.
### Suggest function reliability improvements

**Tool**: `"fault_suggest_better_function_reliability"`

**Payload**: The `snippet` argument is the full code of a function, or any code snippet. The `lang` argument hints the language of the snippet. The `score` argument is the current scoring of the snippet; it can be extracted using the `fault_score_reliability` tool or set arbitrarily. The `target_score` argument is where you want the code to be.

**Returns**: A unified diff markdown code-block containing potential changes.

**Requirements**:

- A qdrant URL
- The LLM of your choice; in this example we use OpenAI, so you need to set the `OPENAI_API_KEY` environment variable
Here is a full working example of calling this tool:
```python
import asyncio
import json
import os
import shutil
from pathlib import Path
from tempfile import gettempdir

from fastmcp import Client
from fastmcp.utilities.logging import configure_logging


async def main(code_file: str, func_name: str, lang: str, target_score: float, verbose: bool) -> None:
    fault_path = shutil.which("fault")
    if not fault_path:
        print("fault: command not found")
        return

    args = []
    if verbose:
        configure_logging("DEBUG")
        args = ["--log-file", str(Path(gettempdir()) / "fault.log"), "--log-level", "debug"]

    args.append("agent")
    args.append("tool")

    config = {
        "mcpServers": {
            "local": {
                "command": fault_path,
                "args": args,
                "env": {
                    "OPENAI_API_KEY": os.getenv("OPENAI_API_KEY")
                }
            },
        }
    }

    async with Client(config) as client:
        p = await client.call_tool(
            "fault_extract_code_block", {
                "file": code_file,
                "func": func_name
            })

        snippet = json.loads(p[0].text)["full"]  # (1)!

        p = await client.call_tool(
            "fault_score_reliability", {
                "snippet": snippet,
                "lang": lang
            })

        score = json.loads(p[0].text)["score"]  # (2)!

        p = await client.call_tool(
            "fault_suggest_better_function_reliability", {
                "snippet": snippet,
                "lang": lang,
                "score": score,
                "target_score": target_score
            })
        print(p)


if __name__ == "__main__":
    import argparse

    parser = argparse.ArgumentParser()
    parser.add_argument("--verbose", action="store_true")
    parser.add_argument("code_file")
    parser.add_argument("func_name")
    parser.add_argument("lang")
    parser.add_argument("target_score", type=float)
    args = parser.parse_args()

    asyncio.run(main(args.code_file, args.func_name, args.lang, args.target_score, args.verbose))
```
1. Retrieve the snippet from the agent's response
2. Retrieve the score from the agent's response
## Tool: Suggest SLOs

Service Level Objectives are powerful tools to steer your users' satisfaction. fault is able to suggest SLO definitions for a function.
### Suggest SLOs for a function

**Tool**: `"fault_suggest_service_level_objectives_slo"`

**Payload**: The `snippet` argument is the full code of a function, or any code snippet. The `lang` argument hints the language of the snippet.

**Returns**: A JSON array of SLO objects. Each object is made of the following properties:

- `type`: the kind of SLO
- `title`: a human readable title for the SLO
- `objective`: the value in the `[0, 100[` range
- `threshold`: a value appropriate for the type of objective
- `unit`: the unit for the threshold
- `window`: the window for the objective
- `sli`: an object made of two keys you can use to configure your platform

**Requirements**:

- A qdrant URL
- The LLM of your choice; in this example we use OpenAI, so you need to set the `OPENAI_API_KEY` environment variable
**Output Example**
[ { "type": "latency", "title": "95th percentile latency", "objective": 95.0, "explanation": "95th percentile request latency under threshold ensures responsive service", "threshold": 0.3, "unit": "s", "window": "300s", "sli": { "prometheus": "histogram_quantile(0.95, sum(rate(http_request_duration_seconds_bucket{handler=\"index\"}[5m])) by (le))", "gcp/cloudrun": { "displayName": "95th percentile latency - 5min", "goal": 0.95, "calendarPeriod": "NONE", "serviceLevelIndicator": { "windowsBased": { "windowPeriod": "300s", "goodTotalRatioThreshold": { "basicSliPerformance": { "latency": { "threshold": "0.3s" } }, "threshold": 0.95 } } } } } }, { "type": "availability", "title": "Successful request ratio", "objective": 99.9, "explanation": "Percentage of successful (2xx) responses to ensure uptime", "threshold": 99.9, "unit": "%", "window": "300s", "sli": { "prometheus": "sum(rate(http_requests_total{handler=\"index\",status=~\"2..\"}[5m]))/sum(rate(http_requests_total{handler=\"index\"}[5m]))*100", "gcp/cloudrun": { "displayName": "99.9% availability - 5min", "goal": 0.999, "calendarPeriod": "NONE", "serviceLevelIndicator": { "windowsBased": { "windowPeriod": "300s", "goodTotalRatioThreshold": { "threshold": 0.999 } } } } } }, { "type": "error", "title": "Error rate", "objective": 99.0, "explanation": "Ensure error responses remain below 1% to detect upstream issues", "threshold": 1.0, "unit": "%", "window": "300s", "sli": { "prometheus": "sum(rate(http_requests_total{handler=\"index\",status!~\"2..\"}[5m]))/sum(rate(http_requests_total{handler=\"index\"}[5m]))*100", "gcp/cloudrun": { "displayName": "99% error rate - 5min", "goal": 0.99, "calendarPeriod": "NONE", "serviceLevelIndicator": { "windowsBased": { "windowPeriod": "300s", "goodTotalRatioThreshold": { "threshold": 0.99 } } } } } } ]
Here is a full working example of calling this tool:

```python
import asyncio
import json
import os
import shutil
from pathlib import Path
from tempfile import gettempdir

from fastmcp import Client
from fastmcp.utilities.logging import configure_logging


async def main(code_file: str, func_name: str, lang: str, verbose: bool) -> None:
    fault_path = shutil.which("fault")
    if not fault_path:
        print("fault: command not found")
        return

    args = []
    if verbose:
        configure_logging("DEBUG")
        args = ["--log-file", str(Path(gettempdir()) / "fault.log"), "--log-level", "debug"]

    args.append("agent")
    args.append("tool")

    config = {
        "mcpServers": {
            "local": {
                "command": fault_path,
                "args": args,
                "env": {
                    "OPENAI_API_KEY": os.getenv("OPENAI_API_KEY")
                }
            },
        }
    }

    async with Client(config) as client:
        p = await client.call_tool(
            "fault_extract_code_block", {
                "file": code_file,
                "func": func_name
            })

        snippet = json.loads(p[0].text)["full"]  # (1)!

        p = await client.call_tool(
            "fault_suggest_service_level_objectives_slo", {
                "snippet": snippet,
                "lang": lang
            })
        print(p)


if __name__ == "__main__":
    import argparse

    parser = argparse.ArgumentParser()
    parser.add_argument("--verbose", action="store_true")
    parser.add_argument("code_file")
    parser.add_argument("func_name")
    parser.add_argument("lang")
    args = parser.parse_args()

    asyncio.run(main(args.code_file, args.func_name, args.lang, args.verbose))
```
1. Retrieve the snippet from the agent's response
## Tool: Full-file code change recommendations

fault may generate a unified diff of changes to apply to a full file in order to improve its performance and reliability.
### Make reliability and performance changes

**Tool**: `"fault_make_reliability_and_perf_changes"`

**Payload**: The `file` argument is the absolute path to a file to seek changes for.

**Returns**: A JSON object with the following properties:

- `score`: a number between 0.0 (very unreliable) and 1.0 (very reliable) for the file
- `explanation`: a short summary of the main threats found and the suggested changes
- `old`: always the full content of the original file, as-is
- `new`: the new file content
- `dependencies`: an array of dependencies that may be needed
- `diff`: the unified diff between the two file versions

**Requirements**:

- A qdrant URL
- The LLM of your choice; in this example we use OpenAI, so you need to set the `OPENAI_API_KEY` environment variable
**Output Example**

The output returns a score of `0.2` for the existing code and the following explanation for the changes:

> The original code used a blocking HTTP call without timeouts, retries, error handling, or client reuse. It could hang indefinitely, overwhelm resources, and surface unhandled exceptions to clients. The new version uses an async shared HTTPX client with connection limits and timeouts, adds retry logic with exponential backoff (via tenacity), maps errors to proper HTTP responses, and ensures the client is closed on shutdown.

Next is a snippet of the generated diff:
```diff
--- app.py
+++ app.py
@@ -1,15 +1,67 @@
 import os
+import logging
+from functools import lru_cache
+import httpx
+from fastapi import FastAPI, HTTPException
+from fastapi.responses import JSONResponse
+from tenacity import retry, stop_after_attempt, wait_exponential, retry_if_exception_type
-import httpx
-from fastapi import FastAPI
+# Initialize logger
+logger = logging.getLogger(__name__)
+logging.basicConfig(level=logging.INFO)
 UPSTREAM_URL = os.getenv("UPSTREAM_URL", "https://jsonplaceholder.typicode.com")
 app = FastAPI()
+@lru_cache()
+def get_http_client() -> httpx.AsyncClient:
+    """
+    Create a shared Async HTTP client with connection limits and timeouts.
+    """
+    limits = httpx.Limits(max_connections=100, max_keepalive_connections=20)
+    timeout = httpx.Timeout(5.0, connect=2.0)
+    return httpx.AsyncClient(limits=limits, timeout=timeout)
+
+@retry(
+    retry=retry_if_exception_type(httpx.HTTPError),
+    stop=stop_after_attempt(3),
+    wait=wait_exponential(multiplier=1, min=1, max=10)
+)
+async def fetch_todo() -> dict:
+    """
+    Fetch the todo item with retry logic for transient errors.
+    Raises HTTPStatusError or RequestError on failure.
+    """
+    client = get_http_client()
+    url = f"{UPSTREAM_URL}/todos/1"
+    headers = {"Host": "jsonplaceholder.typicode.com"}
+    response = await client.get(url, headers=headers)
+    response.raise_for_status()
+    return response.json()
+
+@app.on_event("shutdown")
+async def shutdown_event():
+    """
+    Close the HTTP client on application shutdown.
+    """
+    client = get_http_client()
+    await client.aclose()
 @app.get("/")
-def index():
-    return httpx.get(f"{UPSTREAM_URL}/todos/1", headers={
-        "Host": "jsonplaceholder.typicode.com"
-    }).json()
+async def index():
+    """
+    Endpoint to retrieve a todo item. Implements retries, timeouts, and error handling.
+    """
+    try:
+        data = await fetch_todo()
+        return JSONResponse(content=data)
+    except httpx.HTTPStatusError as exc:
+        logger.error("Upstream returned error %s: %s", exc.response.status_code, exc)
+        raise HTTPException(status_code=exc.response.status_code, detail=str(exc))
+    except httpx.RequestError as exc:
+        logger.error("Network error while fetching todo: %s", exc)
+        raise HTTPException(status_code=502, detail="Bad Gateway")
+    except Exception as exc:
+        logger.exception("Unexpected error: %s", exc)
+        raise HTTPException(status_code=500, detail="Internal Server Error")
```
```python title="e2e.py"
import asyncio
import json
import os
import shutil
from pathlib import Path
from tempfile import gettempdir

from fastmcp import Client
from fastmcp.utilities.logging import configure_logging
from rich.console import Console

console = Console()


async def main(llm: str, verbose: bool) -> None:
    fault_path = shutil.which("fault")
    if not fault_path:
        print("fault: command not found")
        return

    env = {}
    args = []
    if verbose:
        configure_logging("DEBUG")
        args = ["--log-file", str(Path(gettempdir()) / "fault.log"), "--log-level", "debug"]

    args.append("agent")

    if llm == "openai":
        env["OPENAI_API_KEY"] = os.getenv("OPENAI_API_KEY")
    elif llm == "openrouter":
        args.append("--llm-client")
        args.append("open-router")
        args.append("--llm-prompt-reasoning-model")
        args.append("google/gemma-3-27b-it")
        args.append("--llm-embed-model-dim")
        args.append("384")
        env["OPENROUTER_API_KEY"] = os.getenv("OPENROUTER_API_KEY")
    elif llm == "ollama":
        args.append("--llm-client")
        args.append("ollama")
        args.append("--llm-prompt-reasoning-model")
        args.append("gemma3:4b")
        args.append("--llm-embed-model")
        args.append("mxbai-embed-large")
        args.append("--llm-embed-model-dim")
        args.append("1024")

    args.append("tool")

    config = {
        "mcpServers": {
            "local": {
                "command": fault_path,
                "args": args,
                "env": env
            },
        }
    }

    async with Client(config) as client:
        await client.call_tool(
            "fault_index_source_code", {
                "source_dir": str(Path.cwd() / "app"),
                "lang": "python"
            })

        p = await client.call_tool(
            "fault_make_reliability_and_perf_changes", {
                "file": str(Path.cwd() / "app" / "app.py"),
            })

        r = json.loads(p[0].text)

        console.print(f"[purple]Score[/] {r['score']}")
        console.print(f"[purple]Explanation[/] {r['explanation']}")
        console.print(f"[purple]Dependencies[/] {r['dependencies']}")
        console.print(f"[purple]Proposed changes[/]\n{r['diff']}")


if __name__ == "__main__":
    import argparse

    parser = argparse.ArgumentParser()
    parser.add_argument("--llm", choices=["openai", "openrouter", "ollama"])
    parser.add_argument("--verbose", action="store_true")
    args = parser.parse_args()

    asyncio.run(main(args.llm, args.verbose))
```
## Tool: Evaluate Blackhole Impact
### Evaluate function reliability during a blackhole event

**Tool**: `"fault_run_blackhole_impact_scenario"`

**Payload**:
{ "url": "", "method": "", "body": "", "direction": "", "side": "", "duration": "", "num_clients": "", "rps": "", "timeout": 10, "proxies": [], }
The `url` argument is the absolute address of the endpoint to send traffic to. The `direction` and `side` arguments set where to apply the blackhole; we suggest `"egress"` and `"server"`. The `duration` indicates how long to run this scenario for, and `num_clients`/`rps` declare traffic parameters such as how many clients and requests per second. The `proxies` argument is an array of TCP proxies if you want to apply the blackhole to a remote call made by your endpoint rather than directly to your endpoint. Finally, the `timeout` argument, in seconds, is used by the client when communicating with the application.

**Returns**: A markdown report of the scenario.
**Requirements**:

- A qdrant URL
- The LLM of your choice; in this example we use OpenAI, so you need to set the `OPENAI_API_KEY` environment variable
```python title="inject-blackhole.py"
import asyncio
import os
import shutil
from pathlib import Path
from tempfile import gettempdir

from fastmcp import Client
from fastmcp.utilities.logging import configure_logging


async def main(url: str, upstream: str, verbose: bool) -> None:
    fault_path = shutil.which("fault")
    if not fault_path:
        print("fault: command not found")
        return

    args = []
    if verbose:
        configure_logging("DEBUG")
        args = ["--log-file", str(Path(gettempdir()) / "fault.log"), "--log-level", "debug"]

    args.append("agent")
    args.append("tool")

    config = {
        "mcpServers": {
            "local": {
                "command": fault_path,
                "args": args,
                "env": {
                    "OPENAI_API_KEY": os.getenv("OPENAI_API_KEY")
                }
            },
        }
    }

    async with Client(config) as client:
        p = await client.call_tool(
            "fault_run_blackhole_impact_scenario", {
                "url": url,
                "method": "GET",
                "body": "",
                "duration": "10s",
                "direction": "egress",
                "side": "server",
                "num_clients": 1,
                "rps": 3,
                "timeout": 5,
                "proxies": [
                    f"34000={upstream}:443"  # (1)!
                ]
            })
        print(p[0].text)


if __name__ == "__main__":
    import argparse

    parser = argparse.ArgumentParser()
    parser.add_argument("--verbose", action="store_true")
    parser.add_argument("url")
    parser.add_argument("upstream")
    args = parser.parse_args()

    asyncio.run(main(args.url, args.upstream, args.verbose))
```
1. Mapping the proxy address `0.0.0.0:34000` to forward traffic to the real upstream server on port 443
In the case of our application above, this could be called as follows:
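```bash
# Assuming the script above is saved as inject-blackhole.py and the demo
# application from earlier runs on http://localhost:9090
python inject-blackhole.py http://localhost:9090 jsonplaceholder.typicode.com
```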
The blackhole event will take place on the response coming back from the upstream server (`https://jsonplaceholder.typicode.com`).
## Tool: Evaluate Latency Impact
### Evaluate function reliability during a latency event

**Tool**: `"fault_run_latency_impact_scenario"`

**Payload**:
{ "url": "", "method": "", "body": "", "latency": 0, "deviation": 0, "per_read_write_op": false, "direction": "", "side": "", "duration": "", "num_clients": "", "rps": "", "timeout": 10, "proxies": [], }
The `url` argument is the absolute address of the endpoint to send traffic to. The `latency` argument is the mean latency to inject, in milliseconds, with `deviation` as its standard deviation. The `direction` and `side` arguments set where to apply the latency. The `duration` indicates how long to run this scenario for, and `num_clients`/`rps` declare traffic parameters such as how many clients and requests per second. The `proxies` argument is an array of TCP proxies if you want to apply the latency to a remote call made by your endpoint rather than directly to your endpoint. Finally, the `timeout` argument, in seconds, is used by the client when communicating with the application.

**Returns**: A markdown report of the scenario.
**Requirements**:

- A qdrant URL
- The LLM of your choice; in this example we use OpenAI, so you need to set the `OPENAI_API_KEY` environment variable
```python title="inject-latency.py"
import asyncio
import os
import shutil
from pathlib import Path
from tempfile import gettempdir

from fastmcp import Client
from fastmcp.utilities.logging import configure_logging


async def main(url: str, upstream: str, latency: float, verbose: bool) -> None:
    fault_path = shutil.which("fault")
    if not fault_path:
        print("fault: command not found")
        return

    args = []
    if verbose:
        configure_logging("DEBUG")
        args = ["--log-file", str(Path(gettempdir()) / "fault.log"), "--log-level", "debug"]

    args.append("agent")
    args.append("tool")

    config = {
        "mcpServers": {
            "local": {
                "command": fault_path,
                "args": args,
                "env": {
                    "OPENAI_API_KEY": os.getenv("OPENAI_API_KEY")
                }
            },
        }
    }

    async with Client(config) as client:
        p = await client.call_tool(
            "fault_run_latency_impact_scenario", {
                "url": url,
                "method": "GET",
                "body": "",
                "duration": "10s",
                "latency": latency,
                "per_read_write_op": False,
                "deviation": 0,
                "direction": "ingress",
                "side": "server",
                "num_clients": 1,
                "rps": 3,
                "timeout": 10,
                "proxies": [
                    f"34000={upstream}:443"  # (1)!
                ]
            })
        print(p[0].text)


if __name__ == "__main__":
    import argparse

    parser = argparse.ArgumentParser()
    parser.add_argument("--verbose", action="store_true")
    parser.add_argument("url")
    parser.add_argument("upstream")
    parser.add_argument("latency", type=float)
    args = parser.parse_args()

    asyncio.run(main(args.url, args.upstream, args.latency, args.verbose))
```
1. Mapping the proxy address `0.0.0.0:34000` to forward traffic to the real upstream server on port 443
In the case of our application above, this could be called as follows:
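```bash
# Assuming the script above is saved as inject-latency.py; the trailing
# value is an illustrative mean latency of 300ms
python inject-latency.py http://localhost:9090 jsonplaceholder.typicode.com 300
```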
## Tool: Evaluate Packet Loss Impact
### Evaluate function reliability during a packet loss event

**Tool**: `"fault_run_packet_loss_impact_scenario"`

**Payload**:
{ "url": "", "method": "", "body": "", "direction": "", "side": "", "duration": "", "num_clients": "", "rps": "", "timeout": 10, "proxies": [], }
The `url` argument is the absolute address of the endpoint to send traffic to. The `direction` and `side` arguments set where to apply the packet loss. The `duration` indicates how long to run this scenario for, and `num_clients`/`rps` declare traffic parameters such as how many clients and requests per second. The `proxies` argument is an array of TCP proxies if you want to apply the packet loss to a remote call made by your endpoint rather than directly to your endpoint. Finally, the `timeout` argument, in seconds, is used by the client when communicating with the application.

**Returns**: A markdown report of the scenario.
**Requirements**:

- A qdrant URL
- The LLM of your choice; in this example we use OpenAI, so you need to set the `OPENAI_API_KEY` environment variable
```python title="inject-packet-loss.py"
import asyncio
import os
import shutil
from pathlib import Path
from tempfile import gettempdir

from fastmcp import Client
from fastmcp.utilities.logging import configure_logging


async def main(url: str, upstream: str, verbose: bool) -> None:
    fault_path = shutil.which("fault")
    if not fault_path:
        print("fault: command not found")
        return

    args = []
    if verbose:
        configure_logging("DEBUG")
        args = ["--log-file", str(Path(gettempdir()) / "fault.log"), "--log-level", "debug"]

    args.append("agent")
    args.append("tool")

    config = {
        "mcpServers": {
            "local": {
                "command": fault_path,
                "args": args,
                "env": {
                    "OPENAI_API_KEY": os.getenv("OPENAI_API_KEY")
                }
            },
        }
    }

    async with Client(config) as client:
        p = await client.call_tool(
            "fault_run_packet_loss_impact_scenario", {
                "url": url,
                "method": "GET",
                "body": "",
                "duration": "10s",
                "direction": "egress",
                "side": "server",
                "num_clients": 1,
                "timeout": 10,
                "rps": 3,
                "proxies": [
                    f"34000={upstream}:443"  # (1)!
                ]
            })
        print(p[0].text)


if __name__ == "__main__":
    import argparse

    parser = argparse.ArgumentParser()
    parser.add_argument("--verbose", action="store_true")
    parser.add_argument("url")
    parser.add_argument("upstream")
    args = parser.parse_args()

    asyncio.run(main(args.url, args.upstream, args.verbose))
```
1. Mapping the proxy address `0.0.0.0:34000` to forward traffic to the real upstream server on port 443
In the case of our application above, this could be called as follows:
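```bash
# Assuming the script above is saved as inject-packet-loss.py
python inject-packet-loss.py http://localhost:9090 jsonplaceholder.typicode.com
```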
## Tool: Evaluate Bandwidth Impact
### Evaluate function reliability during a bandwidth event

**Tool**: `"fault_run_bandwidth_impact_scenario"`

**Payload**:
{ "url": "", "method": "", "body": "", "direction": "", "rate": 0, "unit": "bps", "side": "", "duration": "", "num_clients": "", "rps": "", "timeout": 10, "proxies": [], }
The `url` argument is the absolute address of the endpoint to send traffic to. The `rate` and `unit` arguments define the bandwidth limit to apply. The `direction` and `side` arguments set where to apply the bandwidth constraint. The `duration` indicates how long to run this scenario for, and `num_clients`/`rps` declare traffic parameters such as how many clients and requests per second. The `proxies` argument is an array of TCP proxies if you want to apply the bandwidth constraint to a remote call made by your endpoint rather than directly to your endpoint. Finally, the `timeout` argument, in seconds, is used by the client when communicating with the application.

**Returns**: A markdown report of the scenario.
**Requirements**:

- A qdrant URL
- The LLM of your choice; in this example we use OpenAI, so you need to set the `OPENAI_API_KEY` environment variable
```python title="inject-bandwidth.py"
import asyncio
import os
import shutil
from pathlib import Path
from tempfile import gettempdir

from fastmcp import Client
from fastmcp.utilities.logging import configure_logging


async def main(url: str, upstream: str, bandwidth: int, verbose: bool) -> None:
    fault_path = shutil.which("fault")
    if not fault_path:
        print("fault: command not found")
        return

    args = []
    if verbose:
        configure_logging("DEBUG")
        args = ["--log-file", str(Path(gettempdir()) / "fault.log"), "--log-level", "debug"]

    args.append("agent")
    args.append("tool")

    config = {
        "mcpServers": {
            "local": {
                "command": fault_path,
                "args": args,
                "env": {
                    "OPENAI_API_KEY": os.getenv("OPENAI_API_KEY")
                }
            },
        }
    }

    async with Client(config) as client:
        p = await client.call_tool(
            "fault_run_bandwidth_impact_scenario", {
                "url": url,
                "method": "GET",
                "body": "",
                "duration": "10s",
                "rate": bandwidth,
                "unit": "bps",
                "direction": "egress",
                "side": "server",
                "num_clients": 5,
                "rps": 2,
                "timeout": 10,
                "proxies": [
                    f"34000={upstream}:443"  # (1)!
                ]
            })
        print(p[0].text)


if __name__ == "__main__":
    import argparse

    parser = argparse.ArgumentParser()
    parser.add_argument("--verbose", action="store_true")
    parser.add_argument("url")
    parser.add_argument("upstream")
    parser.add_argument("bandwidth", type=int)
    args = parser.parse_args()

    asyncio.run(main(args.url, args.upstream, args.bandwidth, args.verbose))
```
1. Mapping the proxy address `0.0.0.0:34000` to forward traffic to the real upstream server on port 443
In the case of our application above, this could be called as follows (reduced to 1024 bytes per second):
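```bash
# Assuming the script above is saved as inject-bandwidth.py; 1024 is the
# rate mentioned above
python inject-bandwidth.py http://localhost:9090 jsonplaceholder.typicode.com 1024
```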
## Tool: Evaluate Jitter Impact
### Evaluate function reliability during a jitter event

**Tool**: `"fault_run_jitter_impact_scenario"`

**Payload**:
{ "url": "", "method": "", "body": "", "direction": "", "rate": 0, "unit": "bps", "side": "", "duration": "", "num_clients": "", "rps": "", "timeout": 10, "proxies": [], }
The `url` argument is the absolute address of the endpoint to send traffic to. The `amplitude` (in milliseconds) and `frequency` (in Hz) arguments define the jitter to inject. The `direction` and `side` arguments set where to apply the jitter. The `duration` indicates how long to run this scenario for, and `num_clients`/`rps` declare traffic parameters such as how many clients and requests per second. The `proxies` argument is an array of TCP proxies if you want to apply the jitter to a remote call made by your endpoint rather than directly to your endpoint. Finally, the `timeout` argument, in seconds, is used by the client when communicating with the application.

**Returns**: A markdown report of the scenario.
**Requirements**:

- A qdrant URL
- The LLM of your choice; in this example we use OpenAI, so you need to set the `OPENAI_API_KEY` environment variable
**Output Example**

Here is a report sample:
```markdown
# Scenarios Report

Start: 2025-06-27 14:14:36.689011165 UTC
End: 2025-06-27 14:14:47.020905358 UTC

## Scenario: Evaluating runtime performance of http://localhost:9090 (items: 1)

🎯 `GET` http://localhost:9090 | Passed

Call:

- Method: `GET`
- Timeout: 10000ms
- Headers: -
- Body?: No

Strategy: load for 10s with 1 clients @ 3 RPS

Faults Applied:

| Type   | Timeline           | Description                                         |
|--------|--------------------|-----------------------------------------------------|
| jitter | 0% xxxxxxxxxx 100% | Jitter: ➡️🖧 Amplitude: 150.00ms, Frequence 5.00Hz |

Run Overview:

| Num. Requests | Num. Errors | Min. Response Time | Max Response Time | Mean Latency (ms) | Expectation Failures | Total Time            |
|---------------|-------------|--------------------|-------------------|-------------------|----------------------|-----------------------|
| 31            | 0 (0.0%)    | 83.65              | 272.49            | 199.92            | 0                    | 10 seconds and 329 ms |

| Latency Percentile | Latency (ms) | Num. Requests (% of total) |
|--------------------|--------------|----------------------------|
| p25                | 133.61       | 8 (25.8%)                  |
| p50                | 199.92       | 16 (51.6%)                 |
| p75                | 235.69       | 24 (77.4%)                 |
| p95                | 269.28       | 31 (100.0%)                |
| p99                | 272.49       | 31 (100.0%)                |

| SLO         | Pass? | Objective   | Margin          | Num. Requests Over Threshold (% of total) |
|-------------|-------|-------------|-----------------|-------------------------------------------|
| 99% @ 350ms | ✅    | 99% < 350ms | Below by 77.5ms | 0 (0.0%)                                  |
| 95% @ 200ms | ❌    | 95% < 200ms | Above by 69.3ms | 15 (48.4%)                                |
```
```python title="inject-jitter.py"
import asyncio
import os
import shutil
from pathlib import Path
from tempfile import gettempdir

from fastmcp import Client
from fastmcp.utilities.logging import configure_logging


async def main(url: str, upstream: str, amplitude: float, frequency: float, verbose: bool) -> None:
    fault_path = shutil.which("fault")
    if not fault_path:
        print("fault: command not found")
        return

    args = []
    if verbose:
        configure_logging("DEBUG")
        args = ["--log-file", str(Path(gettempdir()) / "fault.log"), "--log-level", "debug"]

    args.append("agent")
    args.append("tool")

    config = {
        "mcpServers": {
            "local": {
                "command": fault_path,
                "args": args,
                "env": {
                    "OPENAI_API_KEY": os.getenv("OPENAI_API_KEY")
                }
            },
        }
    }

    async with Client(config) as client:
        p = await client.call_tool(
            "fault_run_jitter_impact_scenario", {
                "url": url,
                "method": "GET",
                "body": "",
                "duration": "10s",
                "amplitude": amplitude,
                "frequency": frequency,
                "direction": "ingress",
                "side": "server",
                "num_clients": 1,
                "rps": 3,
                "timeout": 10,
                "proxies": [
                    f"34000={upstream}:443"  # (1)!
                ]
            })
        print(p[0].text)


if __name__ == "__main__":
    import argparse

    parser = argparse.ArgumentParser()
    parser.add_argument("--verbose", action="store_true")
    parser.add_argument("url")
    parser.add_argument("upstream")
    parser.add_argument("amplitude", type=float)
    parser.add_argument("frequency", type=float)
    args = parser.parse_args()

    asyncio.run(main(args.url, args.upstream, args.amplitude, args.frequency, args.verbose))
```
1. Mapping the proxy address `0.0.0.0:34000` to forward traffic to the real upstream server on port 443
In the case of our application above, this could be called as follows:
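```bash
# Assuming the script above is saved as inject-jitter.py; amplitude 150 (ms)
# and frequency 5 (Hz) match the report sample above
python inject-jitter.py http://localhost:9090 jsonplaceholder.typicode.com 150 5
```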