Custom Faults
Built-in fault labels and the supported SDK path for adding project-local faults.
Adding a custom fault changes the benchmark's fault surface. Most agent work should start with the built-in fault set; add custom faults only when the research question requires a new failure mode.
Built-in fault catalog
Scenario YAML references faults by fault_type.
| Fault type | Category | Requires interface | Notes |
|---|---|---|---|
| link_down | Link | yes | Brings the target interface down. |
| link_flapping | Link | no | Repeatedly toggles a link. Parameters: iterations, down_time, up_time. |
| mtu_mismatch | Impairment | yes | Changes interface MTU to break path MTU. |
| packet_loss | Impairment | no | Adds tc-based loss on the target device. |
| packet_corruption | Impairment | no | Adds tc-based corruption. |
| high_latency | Impairment | no | Adds tc-based delay. Parameter: latency_ms. |
| bgp_neighbor_misconfig | Routing (BGP) | no | Misconfigures a BGP neighbor. |
| static_route_misconfig | Routing (Static) | no | Installs an incorrect static route. Alias: static_route_misconfiguration. |
| blackhole_route | Routing (Static) | no | Installs a route pointing to a null next-hop. Requires target_prefix. |
| route_policy_misconfig | Routing (Policy) | no | Misconfigures a prefix list or route-map. |
| device_down | System | no | Brings the target device offline. |
| acl_misconfig | ACL | no | Adds an ACL that blocks expected traffic. Alias: acl_misconfiguration. |
Common scenario fields:
- `target_device`: the device where the fault is applied.
- `target_interface`: for interface-scoped faults such as `link_down` and `mtu_mismatch`.
- `target_prefix`: for `blackhole_route`.
- `parameters`: fault-specific values.
- `metadata`: descriptive benchmark annotations.
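
The field requirements above can be checked before a scenario is run. The following is a minimal sketch, not part of the SDK: `check_fault_entry` is a hypothetical helper, the requirement sets are transcribed by hand from the table above, and a fault entry is assumed to be a plain dict as it might look after the scenario YAML is parsed. The device and interface names are made up.

```python
# Requirements transcribed by hand from the built-in fault table.
REQUIRES_INTERFACE = {"link_down", "mtu_mismatch"}
REQUIRES_PREFIX = {"blackhole_route"}
ALIASES = {
    "static_route_misconfiguration": "static_route_misconfig",
    "acl_misconfiguration": "acl_misconfig",
}


def check_fault_entry(entry: dict) -> list:
    """Return a list of problems found in one fault entry (empty if none).

    Hypothetical helper; the SDK's own validation may differ.
    """
    problems = []
    raw_type = entry.get("fault_type")
    # Resolve documented aliases to their canonical fault type.
    fault_type = ALIASES.get(raw_type, raw_type)
    if not fault_type:
        problems.append("fault_type is missing")
    if fault_type in REQUIRES_INTERFACE and not entry.get("target_interface"):
        problems.append(f"{fault_type} needs target_interface")
    if fault_type in REQUIRES_PREFIX and not entry.get("target_prefix"):
        problems.append(f"{fault_type} needs target_prefix")
    return problems


# Example entry with made-up device and interface names.
entry = {"fault_type": "link_down", "target_device": "leaf1", "target_interface": "eth1"}
problems = check_fault_entry(entry)
```

A check like this only covers the constraints listed on this page; the built-in specs remain the authoritative source.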
The authoritative built-in specs live under netopsbench/platform/faults/builtin/; injection and recovery handlers live under netopsbench/platform/faults/handlers/.
Simple custom fault
Use simple_fault() for project-local extensions:

    from netopsbench.sdk import NetOpsBench, FaultContext, simple_fault

    def inject(ctx: FaultContext) -> dict:
        # Apply the fault here and report the outcome in the returned dict.
        return {"success": True, "device": ctx.target_device}

    def recover(ctx: FaultContext) -> dict:
        # Undo whatever inject() changed.
        return {"success": True}

    my_pack = simple_fault(
        "my_latency",
        inject=inject,
        recover=recover,
        requires_interface=True,
        required_parameters=("delay_ms",),
    )

    with NetOpsBench(workspace=".") as bench:
        bench.faults.register_pack(my_pack)

Or register one fault directly:

    # Run inside the `with NetOpsBench(...) as bench:` block.
    bench.faults.register_fault(
        "my_latency",
        inject,
        recover,
        requires_interface=True,
        required_parameters=("delay_ms",),
    )

Once registered, reference it from scenario YAML with fault_type: my_latency.
Function contract
| | Type | Notes |
|---|---|---|
| Argument | FaultContext | fault_type, target_device, target_interface, parameters, metadata. |
| Return | dict or FaultExecutionResult | Should include "success": bool. If success is omitted, the result is treated as successful. |
Recovery should be safe to call more than once when possible and should clean up partial state explicitly.
Runnable example
examples/04_custom_faults.py registers examples/faults/custom_fault_pack/ and runs examples/faults/custom_fault_pack/scenario.yaml:

    PYTHONPATH=. python examples/04_custom_faults.py --vendor openai --repo-root .

The example registers a custom latency fault:

    fault_pack = simple_fault(
        "demo_custom_latency",
        inject=inject,
        recover=recover,
        requires_interface=True,
        required_parameters=("delay_ms",),
        aliases=["demo_custom_delay"],
    )

Use this example to verify the extension mechanism. A research-quality fault pack should also validate parameters, define recovery behavior clearly, and test malformed input plus idempotent cleanup.
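
Parameter validation for a latency-style fault might look like the sketch below. It is illustrative only: `delay_error` is a hypothetical helper, the `"error"` key in the returned dict is an assumption (only `"success"` is documented on this page), and `SimpleNamespace` stands in for `FaultContext` so the snippet runs without the SDK.

```python
from types import SimpleNamespace
from typing import Optional


def delay_error(parameters: dict) -> Optional[str]:
    """Return why delay_ms is unusable, or None if it is acceptable."""
    if "delay_ms" not in parameters:
        return "delay_ms is required"
    value = parameters["delay_ms"]
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return "delay_ms must be a number"
    if value <= 0:
        return "delay_ms must be positive"
    return None


def inject(ctx) -> dict:
    error = delay_error(ctx.parameters)
    if error is not None:
        # The "error" key is an assumption; only "success" is documented.
        return {"success": False, "error": error}
    # A real handler would apply the delay here.
    return {"success": True, "device": ctx.target_device}


# Malformed input: delay_ms is a string instead of a number.
bad = SimpleNamespace(target_device="leaf1", parameters={"delay_ms": "fast"})
bad_result = inject(bad)

good = SimpleNamespace(target_device="leaf1", parameters={"delay_ms": 50})
good_result = inject(good)
```

Rejecting bad input before touching the device keeps recovery simple, because a failed injection leaves nothing to clean up.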
Class-based packs
Use the class-based API when multiple related faults share state:

    from dataclasses import dataclass
    from typing import Any

    from netopsbench.sdk import FaultContext, FaultExecutionResult, FaultSpec

    class MyExecutor:
        def inject(self, ctx: FaultContext) -> FaultExecutionResult:
            return FaultExecutionResult(fault_type=ctx.fault_type, success=True)

        def recover(self, ctx: FaultContext) -> FaultExecutionResult:
            return FaultExecutionResult(fault_type=ctx.fault_type, success=True)

    @dataclass(frozen=True)
    class MyFaultPack:
        name: str = "my_fault_pack"
        version: str = "1.0"

        def register(self, registry: Any) -> None:
            registry.register(
                spec=FaultSpec(name="my_latency", requires_interface=True),
                executor=MyExecutor(),
            )

External extensions should use netopsbench.sdk.faults and avoid depending on netopsbench.platform.faults.injector.FaultInjector, which is an internal implementation detail.
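
To make the shared-state motivation for class-based packs concrete, the sketch below registers two faults whose executors share one state object. Everything here is a stand-in so the snippet runs without the SDK: `StubSpec` and `StubRegistry` are hypothetical substitutes for `FaultSpec` and the real registry, assuming only the `register(spec=..., executor=...)` call shape shown above; `my_jitter` is a made-up second fault; and the executors return plain dicts rather than `FaultExecutionResult` to avoid SDK imports.

```python
from dataclasses import dataclass, field
from types import SimpleNamespace


@dataclass
class SharedState:
    """State shared by every executor in the pack (hypothetical)."""
    impaired: set = field(default_factory=set)


@dataclass(frozen=True)
class StubSpec:
    """Stand-in for FaultSpec with only the fields used on this page."""
    name: str
    requires_interface: bool = False


class StubRegistry:
    """Records register() calls so a pack can be inspected without the platform."""

    def __init__(self) -> None:
        self.executors = {}

    def register(self, spec, executor) -> None:
        self.executors[spec.name] = executor


class SharedExecutor:
    def __init__(self, state: SharedState) -> None:
        self.state = state

    def inject(self, ctx) -> dict:
        self.state.impaired.add(ctx.target_device)
        return {"success": True}

    def recover(self, ctx) -> dict:
        # discard() keeps recovery safe to call more than once.
        self.state.impaired.discard(ctx.target_device)
        return {"success": True}


class SharedStatePack:
    name = "shared_state_pack"
    version = "1.0"

    def register(self, registry) -> None:
        state = SharedState()  # one object, handed to every executor
        for fault_name in ("my_latency", "my_jitter"):
            registry.register(
                spec=StubSpec(name=fault_name, requires_interface=True),
                executor=SharedExecutor(state),
            )


registry = StubRegistry()
SharedStatePack().register(registry)
ctx = SimpleNamespace(target_device="leaf1")
registry.executors["my_latency"].inject(ctx)
# The second executor observes the change made through the first.
seen_by_jitter = registry.executors["my_jitter"].state.impaired
```

Creating the state inside `register` ties its lifetime to one registration; whether the real registry calls `register` once per pack is not stated here, so treat that as an assumption to verify.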