NetOpsBench
Fault Extension

Custom Faults

Built-in fault labels and the supported SDK path for adding project-local faults.

Extending the fault set changes the benchmark's fault surface. For most agent work, start with the built-in fault set; add custom faults only when the research question requires a failure mode that the built-in set does not cover.

Built-in fault catalog

Scenario YAML references faults by fault_type.

| Fault type | Category | Requires interface | Notes |
|---|---|---|---|
| link_down | Link | yes | Brings the target interface down. |
| link_flapping | Link | no | Repeatedly toggles a link. Parameters: iterations, down_time, up_time. |
| mtu_mismatch | Impairment | yes | Changes interface MTU to break path MTU. |
| packet_loss | Impairment | no | Adds tc-based loss on the target device. |
| packet_corruption | Impairment | no | Adds tc-based corruption. |
| high_latency | Impairment | no | Adds tc-based delay. Parameter: latency_ms. |
| bgp_neighbor_misconfig | Routing (BGP) | no | Misconfigures a BGP neighbor. |
| static_route_misconfig | Routing (Static) | no | Installs an incorrect static route. Alias: static_route_misconfiguration. |
| blackhole_route | Routing (Static) | no | Installs a route pointing to a null next-hop. Requires target_prefix. |
| route_policy_misconfig | Routing (Policy) | no | Misconfigures a prefix list or route-map. |
| device_down | System | no | Brings the target device offline. |
| acl_misconfig | ACL | no | Adds an ACL that blocks expected traffic. Alias: acl_misconfiguration. |

Common scenario fields:

  • target_device for the device where the fault is applied;
  • target_interface for interface-scoped faults such as link_down and mtu_mismatch;
  • target_prefix for blackhole_route;
  • parameters for fault-specific values;
  • metadata for descriptive benchmark annotations.
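
The field rules above can be checked before a run. The following is a minimal sketch in plain Python, not part of the NetOpsBench SDK: the check_fault_entry helper, the flat dict layout of a fault entry, and the device name leaf1 are assumptions for illustration. The interface, prefix, and alias rules are copied from the catalog table.

```python
# Hypothetical pre-flight check for one fault entry from a scenario file.
# The flat dict layout and this helper are assumptions, not NetOpsBench APIs.

# Built-in faults that the catalog marks as requiring an interface.
REQUIRES_INTERFACE = {"link_down", "mtu_mismatch"}
# Built-in faults that the catalog marks as requiring target_prefix.
REQUIRES_PREFIX = {"blackhole_route"}
# Aliases listed in the catalog, mapped to their canonical fault_type.
ALIASES = {
    "static_route_misconfiguration": "static_route_misconfig",
    "acl_misconfiguration": "acl_misconfig",
}


def check_fault_entry(entry: dict) -> list:
    """Return a list of problems found in a fault entry (empty if none)."""
    problems = []
    raw_type = entry.get("fault_type")
    fault_type = ALIASES.get(raw_type, raw_type)
    if not fault_type:
        problems.append("fault_type is missing")
    # Assumption: every fault needs a target_device.
    if not entry.get("target_device"):
        problems.append("target_device is missing")
    if fault_type in REQUIRES_INTERFACE and not entry.get("target_interface"):
        problems.append(f"{fault_type} requires target_interface")
    if fault_type in REQUIRES_PREFIX and not entry.get("target_prefix"):
        problems.append(f"{fault_type} requires target_prefix")
    if not isinstance(entry.get("parameters", {}), dict):
        problems.append("parameters must be a mapping")
    return problems


# A link_down entry without an interface is reported as incomplete.
print(check_fault_entry({"fault_type": "link_down", "target_device": "leaf1"}))
```

A check like this only mirrors the documented table; the built-in specs under netopsbench/platform/faults/builtin/ remain the authoritative source.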

The authoritative built-in specs live under netopsbench/platform/faults/builtin/; injection and recovery handlers live under netopsbench/platform/faults/handlers/.

Simple custom fault

Use simple_fault() for project-local extensions:

from netopsbench.sdk import NetOpsBench, FaultContext, simple_fault

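def inject(ctx: FaultContext) -> dict:
    # Apply the fault on ctx.target_device here; this placeholder only reports success.
    return {"success": True, "device": ctx.target_device}

def recover(ctx: FaultContext) -> dict:
    # Undo whatever inject() applied; this placeholder only reports success.
    return {"success": True}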

my_pack = simple_fault(
    "my_latency",
    inject=inject,
    recover=recover,
    requires_interface=True,
    required_parameters=("delay_ms",),
)

with NetOpsBench(workspace=".") as bench:
    bench.faults.register_pack(my_pack)

Or register one fault directly:

bench.faults.register_fault(
    "my_latency",
    inject,
    recover,
    requires_interface=True,
    required_parameters=("delay_ms",),
)

Once registered, reference it from scenario YAML with fault_type: my_latency.

Function contract

| | Type | Notes |
|---|---|---|
| Argument | FaultContext | fault_type, target_device, target_interface, parameters, metadata. |
| Return | dict or FaultExecutionResult | Should include "success": bool; if "success" is omitted, the result is treated as successful. |

Recovery should be safe to call more than once when possible and should clean up partial state explicitly.

Runnable example

examples/04_custom_faults.py registers examples/faults/custom_fault_pack/ and runs examples/faults/custom_fault_pack/scenario.yaml:

PYTHONPATH=. python examples/04_custom_faults.py --vendor openai --repo-root .

The example registers a custom latency fault:

fault_pack = simple_fault(
    "demo_custom_latency",
    inject=inject,
    recover=recover,
    requires_interface=True,
    required_parameters=("delay_ms",),
    aliases=["demo_custom_delay"],
)

Use this example to verify the extension mechanism. A research-quality fault pack should also validate parameters, define recovery behavior clearly, and test malformed input plus idempotent cleanup.
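
As an illustration of the parameter validation and malformed-input testing mentioned above, here is a minimal sketch for the delay_ms parameter. The validate_delay_ms helper and the rule that delay_ms must be a positive integer are assumptions; the example pack may accept a different range or type.

```python
def validate_delay_ms(parameters: dict) -> tuple:
    """Return (ok, error_message) for the delay_ms parameter.

    Assumption: delay_ms must be a positive integer.
    """
    if "delay_ms" not in parameters:
        return False, "missing required parameter: delay_ms"
    value = parameters["delay_ms"]
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        return False, f"delay_ms must be an integer, got {type(value).__name__}"
    if value <= 0:
        return False, "delay_ms must be greater than zero"
    return True, ""


# Malformed inputs that a fault pack test could cover.
MALFORMED = [{}, {"delay_ms": "50"}, {"delay_ms": -1}, {"delay_ms": True}]
for params in MALFORMED:
    ok, error = validate_delay_ms(params)
    assert not ok and error, params
```

An inject function could call such a helper first and return a dict with "success": False when validation fails, so that a bad scenario is rejected before any device state is touched.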

Class-based packs

Use the class-based API when multiple related faults share state:

from dataclasses import dataclass
from typing import Any
from netopsbench.sdk import FaultContext, FaultExecutionResult, FaultSpec

class MyExecutor:
    def inject(self, ctx: FaultContext) -> FaultExecutionResult:
        return FaultExecutionResult(fault_type=ctx.fault_type, success=True)

    def recover(self, ctx: FaultContext) -> FaultExecutionResult:
        return FaultExecutionResult(fault_type=ctx.fault_type, success=True)

@dataclass(frozen=True)
class MyFaultPack:
    name: str = "my_fault_pack"
    version: str = "1.0"

    def register(self, registry: Any) -> None:
        registry.register(
            spec=FaultSpec(name="my_latency", requires_interface=True),
            executor=MyExecutor(),
        )
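
To show what shared state between related faults can look like, the sketch below keeps one record of active faults per device inside a single executor. The FaultContext and FaultExecutionResult classes are local stand-ins so the code runs without NetOpsBench installed, and SharedStateExecutor and the fault name my_loss are hypothetical. Whether the registry accepts the same executor instance for several specs is an assumption to verify against the SDK.

```python
from dataclasses import dataclass, field


# Stand-ins for the SDK types so the sketch is self-contained.
@dataclass
class FaultContext:
    fault_type: str
    target_device: str
    parameters: dict = field(default_factory=dict)


@dataclass
class FaultExecutionResult:
    fault_type: str
    success: bool


class SharedStateExecutor:
    """One executor instance serving several related fault types."""

    def __init__(self) -> None:
        # Shared state: the set of active fault types on each device.
        self.active = {}

    def inject(self, ctx: FaultContext) -> FaultExecutionResult:
        self.active.setdefault(ctx.target_device, set()).add(ctx.fault_type)
        return FaultExecutionResult(fault_type=ctx.fault_type, success=True)

    def recover(self, ctx: FaultContext) -> FaultExecutionResult:
        faults = self.active.get(ctx.target_device, set())
        # discard() does not raise if the fault was already recovered.
        faults.discard(ctx.fault_type)
        if not faults:
            self.active.pop(ctx.target_device, None)
        return FaultExecutionResult(fault_type=ctx.fault_type, success=True)


executor = SharedStateExecutor()
executor.inject(FaultContext("my_latency", "leaf1"))
executor.inject(FaultContext("my_loss", "leaf1"))
executor.recover(FaultContext("my_latency", "leaf1"))
print(executor.active)
```

Because both fault types go through the same instance, recovery of one fault can see that another is still active on the device and leave shared resources in place.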

External extensions should use netopsbench.sdk.faults and avoid depending on netopsbench.platform.faults.injector.FaultInjector, which is an internal implementation detail.