Performance Benchmarks

Comprehensive guide to performance benchmarking and regression testing

Overview

Rivellum ships a benchmarking suite for tracking performance metrics and detecting regressions. The benchmark framework measures throughput (TPS), latency percentiles, PoUW operations, and ZK proof overhead across four canonical scenarios.

Quick Start

Running Benchmarks

# Run all scenarios
cargo run --release -p rivellum-bench run --scenario all

# Run a specific scenario
cargo run --release -p rivellum-bench run --scenario solo-transfers

# CI smoke mode (reduced load)
cargo run --release -p rivellum-bench run --ci-smoke

Comparing Against Baseline

# Compare latest results against baseline
cargo run --release -p rivellum-bench compare \
  --baseline bench-baselines/baseline.json \
  --current bench-results/latest.json \
  --fail-on-regression

Exporting Results

# Export to HTML
cargo run --release -p rivellum-bench export \
  --input bench-results/latest.json \
  --output reports/benchmark.html \
  --format html

# Export to CSV
cargo run --release -p rivellum-bench export \
  --input bench-results/latest.json \
  --output reports/benchmark.csv \
  --format csv

Benchmark Scenarios

1. Solo Transfers

Scenario ID: solo-transfers

Pure transfer transactions with no contracts or PoUW. Measures baseline transaction processing performance.

Expected Performance:

  • TPS: ~500
  • P95 Latency: ~25ms

2. Mixed Contracts

Scenario ID: mixed-contracts

50% transfers, 50% simple contract calls. Measures performance under mixed workload.

Expected Performance:

  • TPS: ~350
  • P95 Latency: ~38ms

3. PoUW Heavy

Scenario ID: pouw-heavy

Transactions with Proof-of-Useful-Work challenges enabled. Measures PoUW verification overhead.

Expected Performance:

  • TPS: ~200
  • P95 Latency: ~65ms
  • PoUW Ops/s: ~100

4. ZK Enabled

Scenario ID: zk-enabled

Transfers with ZK privacy proofs enabled. Measures ZK proof generation and verification overhead.

Expected Performance:

  • TPS: ~150
  • P95 Latency: ~95ms
  • ZK Overhead: ~25%
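The docs do not spell out how the overhead percentage is defined; a common convention, shown here purely as an illustrative assumption, is latency with the feature relative to latency without it:

```rust
// Hypothetical illustration of how a relative overhead figure can be
// derived. The exact definition used by rivellum-bench is not stated
// here; this assumes overhead = (with_feature - baseline) / baseline.
fn overhead_pct(with_feature_ms: f64, baseline_ms: f64) -> f64 {
    (with_feature_ms - baseline_ms) / baseline_ms * 100.0
}

fn main() {
    // e.g. 95 ms P95 with ZK vs 76 ms without -> 25% overhead.
    let o = overhead_pct(95.0, 76.0);
    assert!((o - 25.0).abs() < 0.1);
    println!("{o:.1}%");
}
```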

Micro-Benchmarks

Criterion-based micro-benchmarks are available for low-level operations:

# Run all micro-benchmarks
cargo bench

# Run specific benchmark suite
cargo bench --bench crypto_bench
cargo bench --bench intent_bench
cargo bench --bench execution_bench

Available Micro-Benchmarks

crypto_bench:

  • Keypair generation
  • Signature creation
  • Signature verification
  • Address generation
  • State root hashing (10, 100, 1000 items)

intent_bench:

  • Intent parsing
  • Intent serialization
  • Intent validation
  • Batch intent parsing (10, 50, 100 intents)

execution_bench:

  • Transfer execution
  • Execution trace generation
  • Batch execution (10, 50, 100 transactions)
  • Mock ZK proof generation

Baseline Management

Creating a New Baseline

When performance improvements are validated and merged, update the baseline:

# Run benchmarks
cargo run --release -p rivellum-bench run \
  --scenario all \
  --output bench-results/new-baseline.json

# Review results
cargo run --release -p rivellum-bench export \
  --input bench-results/new-baseline.json \
  --output reports/review.html \
  --format html

# Replace baseline (after review)
cp bench-results/new-baseline.json bench-baselines/baseline.json
git add bench-baselines/baseline.json
git commit -m "chore: update performance baseline"

Baseline Update Guidelines

Update baselines when:

  • Intentional performance optimizations are merged
  • Infrastructure changes affect baseline performance
  • New hardware configurations are adopted

DO NOT update baselines to hide regressions.

Regression Detection

Thresholds

Default regression thresholds:

  • TPS Decrease: -10% (fail if TPS drops more than 10%)
  • P95 Latency Increase: +15% (fail if P95 increases more than 15%)
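As a rough sketch of how these two checks combine, the function below is illustrative only, not the actual compare implementation in rivellum-bench:

```rust
// Illustrative regression check; not the actual rivellum-bench code.
// A run regresses if TPS drops by more than `tps_threshold` percent
// OR P95 latency rises by more than `p95_threshold` percent.
fn is_regression(
    baseline_tps: f64,
    current_tps: f64,
    baseline_p95_ms: f64,
    current_p95_ms: f64,
    tps_threshold: f64, // e.g. 10.0 (default)
    p95_threshold: f64, // e.g. 15.0 (default)
) -> bool {
    let tps_change = (current_tps - baseline_tps) / baseline_tps * 100.0;
    let p95_change = (current_p95_ms - baseline_p95_ms) / baseline_p95_ms * 100.0;
    tps_change < -tps_threshold || p95_change > p95_threshold
}

fn main() {
    // 500 -> 430 TPS is a 14% drop: exceeds the default 10% threshold.
    assert!(is_regression(500.0, 430.0, 25.0, 26.0, 10.0, 15.0));
    // 500 -> 470 TPS (6% drop) with a modest P95 increase passes.
    assert!(!is_regression(500.0, 470.0, 25.0, 27.0, 10.0, 15.0));
    println!("ok");
}
```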

Custom Thresholds

cargo run --release -p rivellum-bench compare \
  --baseline bench-baselines/baseline.json \
  --current bench-results/latest.json \
  --tps-threshold 5.0 \
  --p95-threshold 10.0 \
  --fail-on-regression

CI Integration

Benchmarks run automatically in CI with --ci-smoke mode (reduced load):

# .github/workflows/ci.yml
- name: Run benchmark smoke tests
  run: |
    ./target/release/rivellum-bench run --ci-smoke --output bench-results/ci-run.json

- name: Compare against baselines
  run: |
    ./target/release/rivellum-bench compare \
      --baseline bench-baselines/baseline.json \
      --current bench-results/ci-run.json \
      --fail-on-regression

Performance Dashboard

View real-time benchmark results in the Portal:

http://localhost:3001/performance

The dashboard displays:

  • Current vs. baseline TPS and latency metrics
  • Latency percentile charts (P50, P95, P99)
  • PoUW and ZK overhead metrics
  • Regression indicators

Architecture

Macro-Benchmark Flow

  1. Scenario Selection: Choose from registry or run all
  2. Load Generation: Submit transactions via HTTP to running node
  3. Metrics Collection: Fetch /metrics endpoint for node statistics
  4. Result Calculation: Compute TPS, latency percentiles, overhead
  5. JSON Output: Save structured results for comparison
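Step 4 can be sketched in a few lines (hypothetical helpers, not the actual metrics code): TPS is completed transactions divided by wall-clock seconds, and P95 is the value below which 95% of latency samples fall.

```rust
// Illustrative sketch of result calculation; not the actual
// rivellum-bench metrics code.
fn tps(tx_count: u64, elapsed_secs: f64) -> f64 {
    tx_count as f64 / elapsed_secs
}

// Nearest-rank percentile: sort samples, index at ceil(p/100 * n) - 1.
fn percentile_ms(samples: &mut Vec<f64>, p: f64) -> f64 {
    samples.sort_by(|a, b| a.partial_cmp(b).unwrap());
    let rank = ((p / 100.0) * samples.len() as f64).ceil() as usize;
    samples[rank.saturating_sub(1)]
}

fn main() {
    // 1000 transactions in 2 seconds -> 500 TPS.
    assert_eq!(tps(1000, 2.0), 500.0);
    // With latency samples 1..=100 ms, the nearest-rank P95 is 95 ms.
    let mut lat: Vec<f64> = (1..=100).map(|i| i as f64).collect();
    assert_eq!(percentile_ms(&mut lat, 95.0), 95.0);
    println!("ok");
}
```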

Micro-Benchmark Flow

  1. Criterion Setup: Configure benchmark groups and parameters
  2. Warm-up: Run iterations to stabilize performance
  3. Measurement: Collect timing samples
  4. Analysis: Statistical analysis with outlier detection
  5. HTML Report: Generate detailed criterion reports
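The warm-up/measurement pattern above can be sketched with std::time alone; this is a simplified stand-in, and Criterion's real harness adds statistical analysis and outlier detection on top:

```rust
use std::time::Instant;

// Simplified warm-up + measurement loop in the spirit of Criterion.
// Warm-up iterations run but are discarded; measured iterations
// collect one timing sample each.
fn bench<F: FnMut()>(mut f: F, warmup: u32, samples: u32) -> Vec<u128> {
    for _ in 0..warmup {
        f();
    }
    (0..samples)
        .map(|_| {
            let start = Instant::now();
            f();
            start.elapsed().as_nanos()
        })
        .collect()
}

fn main() {
    let mut acc: u64 = 0;
    // black_box keeps the compiler from optimizing the workload away.
    let timings = bench(
        || acc = acc.wrapping_add(std::hint::black_box(1)),
        100,
        1000,
    );
    assert_eq!(timings.len(), 1000);
    println!("collected {} samples", timings.len());
}
```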

File Structure

rivellum/
ā”œā”€ā”€ crates/
│   └── rivellum-bench/
│       ā”œā”€ā”€ src/
│       │   ā”œā”€ā”€ lib.rs          # Core types and traits
│       │   ā”œā”€ā”€ main.rs         # CLI entry point
│       │   ā”œā”€ā”€ scenarios/      # Macro-benchmark scenarios
│       │   ā”œā”€ā”€ metrics.rs      # Metrics collection
│       │   ā”œā”€ā”€ compare.rs      # Regression detection
│       │   └── export.rs       # Result export (CSV/HTML)
│       └── benches/            # Criterion micro-benchmarks
ā”œā”€ā”€ bench-baselines/
│   └── baseline.json           # Reference baseline
└── bench-results/
    └── latest.json             # Most recent run

Best Practices

Running Benchmarks

  1. Clean Environment: Close unnecessary applications
  2. Consistent Hardware: Use same machine for comparisons
  3. Warm-up: Allow node to stabilize before benchmarking
  4. Multiple Runs: Run 3-5 times and average results for important measurements

Interpreting Results

  • TPS: Higher is better (more transactions per second)
  • Latency: Lower is better (faster response times)
  • P95/P99: Focus on tail latencies for user experience
  • Overhead: Measure cost of features (PoUW, ZK)

Debugging Regressions

If CI detects a regression:

  1. Review the comparison output in CI artifacts
  2. Run benchmarks locally to reproduce
  3. Use git bisect to find the offending commit
  4. Profile the regressed code path
  5. Fix or justify the regression

Command Reference

rivellum-bench CLI

rivellum-bench 0.1.0
Rivellum performance benchmark tool

USAGE:
    rivellum-bench <SUBCOMMAND>

SUBCOMMANDS:
    run        Run benchmark scenarios
    compare    Compare results against baseline
    export     Export results to CSV or HTML
    list       List available scenarios
    help       Print this message or the help of the given subcommand(s)

Run Options

rivellum-bench-run
Run benchmark scenarios

USAGE:
    rivellum-bench run [OPTIONS]

OPTIONS:
    -s, --scenario <SCENARIO>        Scenario to run [default: all]
    -n, --node-url <NODE_URL>        Node URL [default: http://localhost:8080]
    -t, --tx-count <TX_COUNT>        Number of transactions [default: 1000]
    -c, --concurrency <CONCURRENCY>  Concurrent load generators [default: 10]
    -o, --output <OUTPUT>            Output file [default: bench-results/latest.json]
        --ci-smoke                   Enable CI smoke mode (reduced load)

Compare Options

rivellum-bench-compare
Compare results against baseline

USAGE:
    rivellum-bench compare [OPTIONS]

OPTIONS:
    -b, --baseline <BASELINE>          Baseline file [default: bench-baselines/baseline.json]
    -c, --current <CURRENT>            Current results file [default: bench-results/latest.json]
        --tps-threshold <TPS>          TPS regression threshold % [default: 10.0]
        --p95-threshold <P95>          P95 regression threshold % [default: 15.0]
        --fail-on-regression           Exit with error if regressions detected

Troubleshooting

Node Not Running

Error: Failed to fetch metrics: connection refused

Solution: Start a node before running benchmarks:

cargo run --release -p rivellum-node -- --config config/test-config.toml

Low TPS Results

Possible causes:

  • Node running in debug mode (use --release)
  • Insufficient hardware resources
  • Network latency (use localhost)
  • Concurrent processes consuming CPU

Baseline File Missing

Error: Failed to load baseline results: No such file or directory

Solution: Create initial baseline:

cargo run --release -p rivellum-bench run --scenario all --output bench-baselines/baseline.json

Future Enhancements

Planned improvements:

  • Multi-node cluster benchmarks
  • Continuous performance tracking dashboard
  • Flamegraph integration for profiling
  • Historical trend analysis
  • Automated baseline updates on main branch
  • Real ZK proof benchmarks (vs. mock)