NetOpsBench
Architecture

System Overview

Runtime loop, evidence path, worker isolation, and report aggregation.

A benchmark run is a closed loop: resolve a scenario, provision a topology, inject a fault or healthy episode, collect evidence, call the agent, and score the returned DiagnosisResult.

Figure: NetOpsBench pipeline architecture

Runtime loop

| Stage | What happens |
| --- | --- |
| Scenario and topology | A generated scenario selects topology scale, traffic profile, fault type, target device, and target interface when applicable. |
| Runtime provisioning | The SDK provisions a Linux / Containerlab runtime and starts observability services. |
| Fault episode | The scenario executor applies traffic generation, injects the selected fault, waits through the observation window, and records symptoms. |
| Diagnosis context | The platform assembles topology, symptoms, Pingmesh summaries, runtime metadata, and optional tools into DiagnosticContext. |
| Agent call | The agent investigates evidence and returns DiagnosisResult. |
| Scoring and report | The evaluator compares verdict, fault type, device, and interface against ground truth and writes reports plus artifacts. |
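
The scoring stage can be pictured as a field-by-field comparison. The sketch below is illustrative only: the field names on both records, the "fault" / "healthy" verdict values, and the rule that a healthy episode is scored on verdict alone are assumptions, not the SDK's actual schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GroundTruth:
    # Hypothetical ground-truth record; field names are assumptions.
    verdict: str                      # "fault" or "healthy"
    fault_type: Optional[str] = None
    device: Optional[str] = None
    interface: Optional[str] = None   # None when the fault has no target interface


@dataclass(frozen=True)
class DiagnosisResult:
    # Hypothetical stand-in for the result object an agent returns.
    verdict: str
    fault_type: Optional[str] = None
    device: Optional[str] = None
    interface: Optional[str] = None


def score_case(result, truth):
    """Return a per-field match map for one diagnosed case."""
    checks = {"verdict": result.verdict == truth.verdict}
    if truth.verdict == "fault":
        checks["fault_type"] = result.fault_type == truth.fault_type
        checks["device"] = result.device == truth.device
        # Only score the interface when the scenario actually targets one.
        if truth.interface is not None:
            checks["interface"] = result.interface == truth.interface
    return checks


truth = GroundTruth("fault", "link_down", "leaf1", "Ethernet0")
result = DiagnosisResult("fault", "link_down", "leaf1", "Ethernet4")
print(score_case(result, truth))
```

In this example the agent finds the right fault type and device but names the wrong interface, so only the `interface` check is false.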

Pingmesh is the first symptom source for many cases. It exposes path-level loss, latency, and reachability changes before deeper tool use. Agents can correlate that signal with topology, BGP state, routes, interface counters, syslog, and connectivity checks.

Evidence available to agents

| Evidence source | Typical use |
| --- | --- |
| Topology metadata | Identify spines, leaves, clients, links, racks, and candidate paths. |
| Pingmesh summaries | Detect unhealthy source-destination pairs and affected path domains. |
| BGP state and routes | Separate control-plane failures from data-plane impairments. |
| Interface state and counters | Localize down links, drops, errors, congestion, or MTU symptoms. |
| Syslog and telemetry | Correlate operational events with the scenario window. |
| Connectivity checks | Validate current reachability and path selection. |
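
One way an agent might combine the first two evidence sources is to flag unhealthy pairs from Pingmesh and then keep only the links that no healthy pair traverses. The following is a minimal sketch under stated assumptions: the row shape `(source, destination, loss_pct)`, the 1% loss threshold, the device names, and the single fixed path per pair (no ECMP) are all hypothetical and are not the format NetOpsBench exposes.

```python
from collections import Counter

# Hypothetical Pingmesh summary rows: (source, destination, loss_pct).
PINGMESH = [
    ("client1", "client3", 42.0),
    ("client1", "client4", 38.5),
    ("client2", "client3", 0.0),
    ("client2", "client4", 0.1),
]

# Hypothetical shared fabric segment used by every pair in this toy topology.
FABRIC = [("leaf1", "spine1"), ("spine1", "leaf2")]

# Hypothetical candidate paths from topology metadata: pair -> ordered links.
PATHS = {
    ("client1", "client3"): [("client1", "leaf1"), *FABRIC, ("leaf2", "client3")],
    ("client1", "client4"): [("client1", "leaf1"), *FABRIC, ("leaf2", "client4")],
    ("client2", "client3"): [("client2", "leaf1"), *FABRIC, ("leaf2", "client3")],
    ("client2", "client4"): [("client2", "leaf1"), *FABRIC, ("leaf2", "client4")],
}


def suspect_links(summaries, paths, loss_threshold=1.0):
    """Rank links seen on unhealthy paths but on no healthy path."""
    unhealthy = [(s, d) for s, d, loss in summaries if loss > loss_threshold]
    healthy = [(s, d) for s, d, loss in summaries if loss <= loss_threshold]
    # A link that carries a healthy pair is treated as exonerated.
    exonerated = {link for pair in healthy for link in paths[pair]}
    counts = Counter(
        link for pair in unhealthy for link in paths[pair] if link not in exonerated
    )
    return counts.most_common()


print(suspect_links(PINGMESH, PATHS))
```

Here both lossy pairs originate at client1, and every other link on their paths also carries healthy traffic, so the client1 to leaf1 link is the only suspect. Interface counters or BGP state for that link would be the natural next check.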

Worker-pool execution

Suite runs use a worker pool when workers=N. Each worker owns a separate lab and telemetry namespace so concurrent scenarios do not mix symptoms.

Figure: Parallel benchmark architecture for DCN troubleshooting agents

| Step | What happens |
| --- | --- |
| Submit run | The user passes an agent plus one scenario or a suite into NetOpsBench(...).sessions. |
| Start session | The session manager resolves scenarios, infers scale, creates a run id, and prepares artifact paths. |
| Build runtime pool | The runtime manager creates workers with separate lab names, topology directories, management subnets, buckets, logs, and raw result paths. |
| Deploy workers | Each worker provisions an independent Containerlab / SONiC topology and starts worker-local telemetry against shared core observability services. |
| Schedule scenarios | The dispatcher assigns scenarios across workers and runs them in parallel when more than one worker is available. |
| Execute and score | Each worker runs the episode, builds context, calls the agent, records raw output, and scores the case. |
| Aggregate report | The dispatcher merges worker summaries, scenario summaries, raw artifact paths, and aggregate metrics into one BenchmarkReport. |
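
The pool-building and scheduling steps can be sketched with Python's standard `concurrent.futures` module. This is a simplified illustration, not the SDK's dispatcher: the `Worker` fields, the naming scheme, the round-robin assignment policy, and the `run_worker` placeholder are all assumptions.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class Worker:
    # Hypothetical per-worker identity; real workers also own topology
    # directories, management subnets, logs, and raw result paths.
    index: int
    lab_name: str
    bucket: str


def build_pool(run_id, n_workers):
    """Give every worker its own lab name and telemetry bucket."""
    return [
        Worker(i, f"{run_id}-w{i}", f"{run_id}-bucket-w{i}")
        for i in range(n_workers)
    ]


def assign(scenarios, workers):
    """Round-robin assignment (an assumed policy, chosen for simplicity)."""
    plan = {w.index: [] for w in workers}
    for position, scenario in enumerate(scenarios):
        plan[workers[position % len(workers)].index].append(scenario)
    return plan


def run_worker(worker, scenarios):
    # Placeholder for: run the episode, build context, call the agent, score.
    return [{"scenario": s, "lab": worker.lab_name} for s in scenarios]


def dispatch(run_id, scenarios, n_workers):
    """Run each worker's share in parallel and return results by worker index."""
    workers = build_pool(run_id, n_workers)
    plan = assign(scenarios, workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        futures = {
            w.index: pool.submit(run_worker, w, plan[w.index]) for w in workers
        }
    # Leaving the with-block waits for every submitted task to finish.
    return {index: future.result() for index, future in futures.items()}


print(dispatch("run1", ["s1", "s2", "s3"], n_workers=2))
```

Because each result carries the lab name of the worker that produced it, a case can always be traced back to the isolated lab it ran in.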

Shared and worker-local state

| Layer | Components |
| --- | --- |
| Shared | SDK entrypoint, session orchestration, scenario assignment, final report aggregation, shared observability services. |
| Worker-local | Containerlab topology, SONiC nodes, management subnet, fault episode, Pingmesh / Telegraf data, InfluxDB bucket, raw outputs, worker logs. |

Evaluators are worker-local during concurrent execution. Final report generation uses a separate aggregation evaluator after worker results are merged in deterministic worker-index order.
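
Merging in worker-index order means the final report does not depend on which worker happens to finish first. A minimal sketch of that idea follows; the per-case `correct` flag and the single `accuracy` metric are hypothetical simplifications of whatever the aggregation evaluator actually computes.

```python
def merge_worker_results(results_by_worker):
    """Merge per-worker case lists in ascending worker-index order."""
    merged = []
    for index in sorted(results_by_worker):
        merged.extend(results_by_worker[index])
    total = len(merged)
    correct = sum(1 for case in merged if case["correct"])
    return {
        "cases": merged,
        "total": total,
        "accuracy": correct / total if total else 0.0,
    }


# Worker 1 finished before worker 0, so its results arrived first.
arrival_order = {
    1: [{"scenario": "s2", "correct": False}],
    0: [{"scenario": "s1", "correct": True}, {"scenario": "s3", "correct": True}],
}
report = merge_worker_results(arrival_order)
print([case["scenario"] for case in report["cases"]])
```

The merged case list starts with worker 0's scenarios regardless of arrival order, which keeps repeated runs of the same suite comparable.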

SDK mapping

| SDK call | Architecture path |
| --- | --- |
| run_scenario(...) | One worker, one scenario, one report. |
| run_suite(..., workers=N) | Scenarios distributed across isolated workers and merged into one report. |
| run_on_runtime_suite(...) | Same execution path, but the caller owns the already-provisioned runtime pool. |

Use the single-scenario path for agent contract validation. Use suite or scale runs when collecting benchmark-level evidence.