CrossCheck Examples

This directory contains example scripts demonstrating how to use CrossCheck for network counter repair and validation.

Files

  • simple_example.py: Basic end-to-end example with a 3-node synthetic network
  • abilene_example.py: Real-world example using Abilene network data (12 nodes)
  • generate_sample_data.py: Script used to generate the synthetic sample data
  • generate_abilene_subset.py: Script used to extract Abilene data subset
  • sample_data/: Directory containing sample topology and telemetry data

Running the Example

From this directory, run:

uv run python3 simple_example.py

Or if not using uv:

python3 simple_example.py

Expected output (approximate; results may vary slightly between runs):

======================================================================
CrossCheck: Network Counter Repair and Validation Example
======================================================================

✓ Loaded topology: 3 nodes
  Nodes: ['NodeA', 'NodeB', 'NodeC']

✓ Loaded telemetry data: 5 snapshots

✓ CrossCheck initialized with 3 nodes

Using paths from: sample_data/paths/
Running repair and validation pipeline...
----------------------------------------------------------------------
CrossCheck processing 5 rows...
Step 1: Repairing data with Democratic Trust Propagation...
Repairing rows: 100%|██████████| 5/5 [00:00<00:00, 17.26it/s]
✓ Repaired 5 rows
Step 2: Validating repaired data...
Validating rows: 100%|██████████| 5/5 [00:00<00:00, 14.33it/s]
✓ Validated 5 rows
✓ CrossCheck pipeline complete! Processed 5 rows
----------------------------------------------------------------------

======================================================================
Results Summary
======================================================================
Input snapshots:  5
Output snapshots: 5

Average repair confidence: 0.49-0.52
  (Higher is better; >0.5 is good)
Average validation confidence: 0.69-0.80
  (Fraction of counters passing validation)
Average error: 7.5% → 3.8-4.8% (36-49% improvement)

Sample results from first snapshot:
----------------------------------------------------------------------
Timestamp: 2024/01/01 12:00 UTC
Repair type: DTP
Repair confidence: 0.43-0.51
Validation type: w/ paths
Validation confidence: 0.86-1.00

Example repaired counter (NodeA → NodeB):
  Ground truth: 250.00
  Perturbed:    228.49
  Corrected:    230-234
  Confidence:   0.31-0.36

======================================================================
✓ Example complete!
======================================================================
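
The improvement figures in the summary follow directly from the before and after error values. The sketch below reproduces that arithmetic from the numbers printed above; the helper names `relative_error` and `improvement` are illustrative and are not part of the CrossCheck API.

```python
def relative_error(measured: float, truth: float) -> float:
    """Absolute relative error of a counter against its ground truth."""
    return abs(measured - truth) / abs(truth)


def improvement(error_before: float, error_after: float) -> float:
    """Fraction of the original error removed by the repair."""
    return (error_before - error_after) / error_before


# Summary line: 7.5% average error repaired to between 3.8% and 4.8%.
best = improvement(7.5, 3.8)
worst = improvement(7.5, 4.8)
print(f"improvement: {worst:.0%} to {best:.0%}")  # improvement: 36% to 49%

# NodeA -> NodeB counter from the sample output.
truth, perturbed = 250.00, 228.49
print(f"perturbed error: {relative_error(perturbed, truth):.1%}")  # 8.6%
for corrected in (230.0, 234.0):
    # 230 leaves an 8.0% error, 234 leaves a 6.4% error
    print(f"corrected {corrected:.0f}: {relative_error(corrected, truth):.1%}")
```

A single counter can improve less than the average, as this one does; the 36-49% figure is computed over all counters in all snapshots.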

What the Example Does

  1. Loads sample data: 3-node linear network topology (A-B-C) and 5 snapshots of telemetry with intentionally perturbed counter values (~10% error)

  2. Loads paths: Routing paths that show how traffic flows through the network (e.g., A→C traffic goes via B)

  3. Configures CrossCheck: Sets up repair with 20 trials and validation with 7% tolerance

  4. Runs the pipeline: Repairs inconsistent counters using DTP with path-based predictions, then validates against demand invariants

  5. Shows results: Displays repair and validation confidence scores, demonstrating 36-49% error reduction (7.5% → 3.8-4.8%)
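
The idea behind path-based validation can be sketched as follows: once the routing paths are known, the load on each link should equal the sum of the demands routed across it, and a counter passes if it falls within the tolerance of that prediction. This is a minimal illustration of the invariant, not CrossCheck's implementation; the demand values, the observed counters, and the function names are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical demands (traffic units) and routing paths for the A-B-C line.
demands = {
    ("NodeA", "NodeB"): 100.0,
    ("NodeA", "NodeC"): 150.0,
    ("NodeB", "NodeC"): 80.0,
}
paths = {
    ("NodeA", "NodeB"): ["NodeA", "NodeB"],
    ("NodeA", "NodeC"): ["NodeA", "NodeB", "NodeC"],  # A->C traffic goes via B
    ("NodeB", "NodeC"): ["NodeB", "NodeC"],
}


def predicted_link_load(demands, paths):
    """Sum each demand onto every directed link along its path."""
    load = defaultdict(float)
    for pair, volume in demands.items():
        hops = paths[pair]
        for link in zip(hops, hops[1:]):
            load[link] += volume
    return dict(load)


def passing_fraction(observed, predicted, tolerance=0.07):
    """Fraction of counters within the relative tolerance of the prediction."""
    ok = [
        abs(observed[link] - predicted[link]) <= tolerance * predicted[link]
        for link in predicted
    ]
    return sum(ok) / len(ok)


predicted = predicted_link_load(demands, paths)
# Hypothetical observed counters: A->B is off by about 8.6%, B->C by about 1.3%.
observed = {("NodeA", "NodeB"): 228.49, ("NodeB", "NodeC"): 233.0}
print(predicted)  # A->B carries 250.0, B->C carries 230.0
print(passing_fraction(observed, predicted))  # 0.5
```

With a 7% tolerance, only the B→C counter passes here, so the fraction of passing counters is 0.5.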

Customizing the Example

You can modify the configuration in simple_example.py:

config = CrossCheckConfig(
    repair_config=RepairConfig(
        num_trials=20,              # Increase for better accuracy
        similarity_threshold=0.05,  # Adjust fuzzy matching tolerance
        seed=42                     # Change for different randomization
    ),
    validator_config=ValidatorConfig(
        threshold=0.03,             # Adjust validation tolerance
        counter_bias_correction=0.024  # Adjust bias correction
    )
)

Using with Your Own Data

To use CrossCheck with your own network data:

  1. Prepare topology: Create a JSON file following the format in sample_data/topology.json

  2. Prepare telemetry: Create a pandas DataFrame with the required column format (see sample_data/README.md)

  3. Adapt the example: Modify simple_example.py to load your data files

  4. Run and tune: Experiment with configuration parameters to optimize for your network

See the main README.md for detailed API documentation.


Abilene Network Example

The Abilene example demonstrates CrossCheck on real network telemetry from the Abilene academic backbone network.

Running the Example

From this directory, run:

uv run python3 abilene_example.py

Or if not using uv:

python3 abilene_example.py

Expected output (approximate; results may vary slightly between runs):

======================================================================
CrossCheck: Abilene Network Real Data Example
======================================================================

✓ Loaded 100 snapshots from Abilene network
✓ Loaded topology: 12 nodes, 15 links

Simulating measurement errors...
✓ Perturbed 25% of counters with ±15% error

✓ CrossCheck initialized with 12 nodes

Running CrossCheck repair and validation pipeline...
----------------------------------------------------------------------
CrossCheck processing 100 rows...
Step 1: Repairing data with Democratic Trust Propagation...
Repairing rows: 100%|██████████| 100/100 [00:03<00:00, 30.28it/s]
✓ Repaired 100 rows
Step 2: Validating repaired data...
Validating rows: 100%|██████████| 100/100 [00:00<00:00, 181.79it/s]
✓ Validated 100 rows
✓ CrossCheck pipeline complete! Processed 100 rows
----------------------------------------------------------------------

======================================================================
Results Summary
======================================================================
Network: Abilene (12 nodes, 15 links)
Snapshots processed: 100

Average repair confidence: 0.80-0.82
  (Higher is better; >0.5 is good)
Average validation confidence: 0.98-0.99
  (Fraction of counters passing validation)
Average error: 1.9% → 1.1% (40-42% improvement)
  (Evaluated on 8400 counter values)

Sample results from first snapshot:
----------------------------------------------------------------------
Timestamp: 2004/03/01 00:00 UTC
Repair confidence: 0.775
Validation confidence: 1.000

Example repaired counter (ATLAM5 → ATLAng egress):
  Ground truth: 9.171
  Perturbed:    9.288
  Corrected:    9.263
  Confidence:   0.800

======================================================================
✓ Abilene example complete!
======================================================================

What the Example Does

  1. Loads Abilene data: 100 snapshots from the real Abilene academic backbone network (12 nodes, 15 links) covering March 1-5, 2004

  2. Simulates errors: Perturbs 25% of interface counters with ±15% random scaling to simulate realistic measurement errors

  3. Configures CrossCheck: Uses 30 repair trials (the simple example uses 20) and the same 7% validation tolerance as the simple example

  4. Runs the pipeline: Repairs using DTP with path-based predictions, then validates against demand invariants

  5. Shows results: Reports a validation pass rate of 98-99% and an error reduction of 40-42% (1.9% → 1.1%)
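
The error simulation in step 2 can be sketched as below. This is an assumption about how such a perturbation might be written, not the code in abilene_example.py: the function name `perturb_counters`, the uniform distribution of the scaling factor, and the counter names and values are all hypothetical.

```python
import random


def perturb_counters(counters, fraction=0.25, max_scale=0.15, seed=42):
    """Scale a random subset of counters by a factor in [1 - max_scale, 1 + max_scale]."""
    rng = random.Random(seed)  # fixed seed so the perturbation is reproducible
    names = sorted(counters)
    chosen = set(rng.sample(names, k=round(fraction * len(names))))
    return {
        name: value * (1 + rng.uniform(-max_scale, max_scale)) if name in chosen else value
        for name, value in counters.items()
    }


# Hypothetical interface counters; names and values are made up for illustration.
clean = {f"if{i}": 10.0 + i for i in range(8)}
noisy = perturb_counters(clean)
changed = [name for name in clean if noisy[name] != clean[name]]
print(len(changed), "of", len(clean), "counters perturbed")
```

Keeping the unperturbed values alongside the noisy ones is what makes it possible to report the error before and after repair.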

Why Results Are Better Than Simple Example

The Abilene example reports higher confidence than the simple 3-node example, with error reduction in the same range:

  • Higher repair confidence (0.80-0.82 vs 0.49-0.52): More nodes provide more redundant measurements for voting
  • Higher validation confidence (0.98-0.99 vs 0.69-0.80): Larger network creates more flow conservation constraints
  • More consistent error reduction (40-42% vs 36-49%): More data points make the repair less sensitive to run-to-run variation

This suggests that CrossCheck holds up on real-world topologies and traffic.

About the Abilene Network

The Abilene network was a high-performance backbone network serving the US research and education community. The dataset includes:

  • 12 nodes: Major US cities (Atlanta, Chicago, Denver, Houston, etc.)
  • 15 links: Bidirectional connections with real geographic distances
  • 132 demands: Traffic between all ordered node pairs (12 × 11)
  • Time period: March-September 2004 (subset uses first 100 hours)

This is a well-known dataset used in network research and provides realistic traffic patterns and network structure.
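
The demand count follows from the node count: one demand per ordered source and destination pair. The short check below uses placeholder node names, since the dataset's actual router identifiers are not listed here.

```python
from itertools import permutations

# Placeholder node names; the real dataset uses Abilene router identifiers.
nodes = [f"node{i}" for i in range(12)]

# One demand per ordered (source, destination) pair with source != destination.
demand_pairs = list(permutations(nodes, 2))
print(len(demand_pairs))  # 12 * 11 = 132

# Each of the 15 bidirectional links carries traffic in two directions.
directed_links = 15 * 2
print(directed_links)  # 30
```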