ROX-33737: Create dashboards (Phase 0) by ebensh · Pull Request #19604 · stackrox/stackrox

ebensh · 2026-03-25T14:46:14Z

No description provided.


Implements core Grafana dashboard generation using Go structs:
- Dashboard struct with UID, Title, Tags, Links, Rows, Templating
- Panel types: timeseries, stat, gauge, text, table
- Automatic gridPos calculation and panel ID assignment
- Gap annotation support for missing metrics
- Threshold configuration for gauges
- Uses datasource UID PBFA97CFB590B2093 (Prometheus)

All tests passing (11 test cases).

User request: Create a Go package that generates Grafana dashboard JSON from Go structs to avoid hand-writing verbose JSON and make dashboards maintainable and testable.

Note: Code partially generated by AI (Claude Sonnet 4.5)
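For readers without the diff at hand, the automatic gridPos calculation and panel ID assignment described above could look roughly like the sketch below. The type names, field sets, and the `Layout` method are assumptions made for illustration, not the PR's actual definitions; the only external fact relied on is that Grafana lays panels out on a 24-column grid.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// gridWidth is the width of Grafana's dashboard grid in columns.
const gridWidth = 24

// GridPos mirrors the "gridPos" object in Grafana dashboard JSON.
type GridPos struct {
	H int `json:"h"`
	W int `json:"w"`
	X int `json:"x"`
	Y int `json:"y"`
}

// Panel is a hypothetical, trimmed-down panel; callers set Type, Title,
// and the desired width/height, and Layout fills in ID, X, and Y.
type Panel struct {
	ID      int     `json:"id"`
	Type    string  `json:"type"`
	Title   string  `json:"title"`
	GridPos GridPos `json:"gridPos"`
}

// Row groups panels under a titled row header.
type Row struct {
	Title  string
	Panels []Panel
}

// Dashboard is a hypothetical subset of the struct named in the commit.
type Dashboard struct {
	UID   string
	Title string
	Tags  []string
	Rows  []Row
}

// Layout flattens rows into one panel list, assigning sequential IDs and
// placing panels left to right, wrapping when a panel would exceed the grid.
func (d Dashboard) Layout() []Panel {
	var out []Panel
	nextID, y := 1, 0
	for _, row := range d.Rows {
		out = append(out, Panel{ID: nextID, Type: "row", Title: row.Title,
			GridPos: GridPos{H: 1, W: gridWidth, X: 0, Y: y}})
		nextID++
		y++
		x, lineH := 0, 0
		for _, p := range row.Panels {
			if x+p.GridPos.W > gridWidth { // wrap to the next line
				y += lineH
				x, lineH = 0, 0
			}
			p.ID = nextID
			nextID++
			p.GridPos.X, p.GridPos.Y = x, y
			x += p.GridPos.W
			if p.GridPos.H > lineH {
				lineH = p.GridPos.H
			}
			out = append(out, p)
		}
		y += lineH
	}
	return out
}

// JSON renders the dashboard with the computed layout.
func (d Dashboard) JSON() ([]byte, error) {
	return json.MarshalIndent(map[string]any{
		"uid": d.UID, "title": d.Title, "tags": d.Tags, "panels": d.Layout(),
	}, "", "  ")
}

func main() {
	d := Dashboard{UID: "example", Title: "Example", Tags: []string{"demo"},
		Rows: []Row{{Title: "Health", Panels: []Panel{
			{Type: "stat", Title: "Up", GridPos: GridPos{W: 8, H: 4}},
			{Type: "timeseries", Title: "CPU", GridPos: GridPos{W: 16, H: 4}},
		}}}}
	out, err := d.JSON()
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Keeping layout in one function means dashboard definitions only state panel sizes, and tests can assert on positions without parsing JSON.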

Creates generate/main.go with:
- writeJSON helper for writing dashboard JSON files
- main() with configurable output directory
- Placeholder for dashboard generation (to be added in Tasks 2-6)

CLI builds successfully and is ready for dashboard definitions.

User request: Create CLI entrypoint that imports dashboard definitions and writes JSON files.

Note: Code partially generated by AI (Claude Sonnet 4.5)

Implements Task 2: Level 1: Service Map Dashboard (StackRox Overview)

This commit adds the top-level overview dashboard for the StackRox monitoring hierarchy. The dashboard provides a service map view with three main sections:
- Service Health: Central process metrics (up status, CPU, memory, goroutines, version)
- Connected Sensors: Cluster connectivity metrics (sensors, clusters, nodes, vCPUs)
- Database: PostgreSQL health metrics (connection status, size, active connections, available space)

The dashboard includes a drill-down link to the Level 2 'Central Internals' dashboard for detailed component analysis.

Implementation follows test-driven development methodology with comprehensive test coverage verifying dashboard structure, panel configuration, and JSON validity.

Files:
- deploy/charts/monitoring/dashboards/generator/l1_overview.go: Dashboard definition
- deploy/charts/monitoring/dashboards/generator/l1_overview_test.go: Test suite
- deploy/charts/monitoring/dashboards/generator/generate/main.go: CLI integration

User request: Implement Task 2: Level 1: Service Map Dashboard following TDD methodology.

Dashboard uses existing metrics from Central service (standard Go runtime metrics and custom StackRox metrics for sensor connectivity and database health).

Note: Code partially generated with AI assistance (Claude Sonnet 4.5).

Implements the Level 2 dashboard that displays a grid view of all 10 logical regions within Central. Each region includes headline metrics and a link to its corresponding Level 3 detail dashboard.

Dashboard structure:
- UID: central-internals
- 10 rows (one per logical region): Sensor Ingestion, Deployment Processing, Vulnerability Enrichment, Detection & Alerts, Risk Calculation, Background Reprocessing, Pruning & GC, Network Analysis, Report Generation, API & UI
- Each row contains 3-4 metric panels plus a details link to Level 3
- Back-link to Level 1 StackRox Overview dashboard

Notable features:
- Identifies existing metrics gaps with text panels (e.g., no alert generation rate metric, no Central-side report generation metrics)
- Uses histogram_quantile for p95 latency calculations
- Uses rate() for counter metrics to show per-second rates
- Follows StackRox dashboard generator patterns and testing conventions

Testing:
- Comprehensive test coverage in l2_central_test.go
- All 16 tests pass
- Validates metadata, row structure, panel queries, and JSON generation

CLI integration:
- Added L2 dashboard generation to generate/main.go
- Outputs central-internals.json alongside stackrox-overview.json

Implemented using TDD methodology: tests written first, then implementation, following Go best practices and StackRox coding style.

User request: Implement Task 3: Level 2: Central Internals Dashboard

Code partially generated with AI assistance (Claude Code).
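The two query shapes mentioned above (rate() over counters, histogram_quantile for p95) can be produced by small string helpers, sketched below. The helper names, the metric names, and the fixed `5m` window are illustrative assumptions; the actual queries in the PR may aggregate by additional labels or use a different range.

```go
package main

import "fmt"

// ratePerSecond returns a PromQL expression for the summed per-second rate
// of a counter over the given range window.
func ratePerSecond(counter, window string) string {
	return fmt.Sprintf("sum(rate(%s[%s]))", counter, window)
}

// latencyQuantile returns a PromQL expression estimating quantile q
// (for example 0.95) from a Prometheus histogram. It appends the
// conventional "_bucket" suffix and aggregates by the "le" label, which
// histogram_quantile requires.
func latencyQuantile(q float64, histogram, window string) string {
	return fmt.Sprintf("histogram_quantile(%g, sum by (le) (rate(%s_bucket[%s])))",
		q, histogram, window)
}

func main() {
	// Metric names here are placeholders, not real StackRox metrics.
	fmt.Println(ratePerSecond("example_events_total", "5m"))
	fmt.Println(latencyQuantile(0.95, "example_duration_seconds", "5m"))
}
```

Generating expressions from helpers keeps the `sum by (le)` aggregation consistent across panels, which is easy to get wrong when queries are hand-written.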

Implement the highest-value Level 3 detail dashboard for Central's Sensor Ingestion pipeline with comprehensive metrics breakdowns.

Dashboard structure (central-sensor-ingestion):
- Connection Status: sensor connectivity and state transitions
- Deduper: throughput, hit rate, hash store size, and operations
- Worker Queue: event processing, duration, with gap annotations for missing queue depth and in-flight metrics
- Pipeline Processing: resource processing, panics, K8s event latency, with gap annotation for per-fragment metrics
- Messages Not Sent: failed sends to sensor with type and reason

Key features:
- 3 gap annotation panels identify missing observability metrics
- Full metric breakdowns with proper legend formats
- Links back to Level 2 Central Internals dashboard
- Follows TDD: comprehensive test coverage for all rows and panels

Generated files:
- deploy/charts/monitoring/dashboards/central-sensor-ingestion.json

Task: ROX-33737 Phase 0 - Dashboard Hierarchy Prototype

Partially AI-assisted.

Implements Task 5 from the Central metrics dashboard project. This dashboard provides deep visibility into the vulnerability enrichment pipeline with comprehensive metrics for:
- Scan semaphore utilization and queue management
- Image scanning performance (p50/p95/p99 latencies)
- Node scanning metrics
- Image deduplication statistics
- Registry client requests, latency, and timeout tracking

Includes 2 gap annotations identifying missing metrics:
- Enrichment request counter (for failure rate calculation)
- Node scan counter

Test-driven development: comprehensive test suite covering all rows, panels, queries, and JSON generation.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

Implements Task 6: Create stub dashboards for the 8 remaining logical regions in Central. Each dashboard has real panels where metrics exist and prominent gap annotations documenting missing metrics.

Dashboard UIDs:
- central-deployment-processing: Deployment/pod/namespace resource processing
- central-detection-alerts: Detection engine and alert generation
- central-risk-calculation: Risk scoring and reprocessing
- central-background-reprocessing: Background loops and batch processing
- central-pruning-gc: Process pruning and garbage collection
- central-network-analysis: Network flows and endpoint tracking
- central-report-generation: Compliance operator and vuln reports
- central-api-ui: GraphQL, gRPC, and API endpoint metrics

All dashboards:
- Include level-3 tag and back-link to central-internals dashboard
- Mix real metric panels (timeseries/stat) with gap annotation panels
- Document needed metrics for complete observability

Implementation follows TDD methodology:
1. Wrote comprehensive tests first (l3_stubs_test.go)
2. Implemented L3Stubs() function with 8 dashboard specs
3. Integrated into CLI generator (generate/main.go)
4. Verified all tests pass and JSON output is valid

Files:
- Created: deploy/charts/monitoring/dashboards/generator/l3_stubs.go
- Created: deploy/charts/monitoring/dashboards/generator/l3_stubs_test.go
- Modified: deploy/charts/monitoring/dashboards/generator/generate/main.go

Test results: All 77 tests pass.

AI-assisted implementation based on user specification.
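The shared properties of the eight stubs (a level-3 tag and a back-link to central-internals) lend themselves to a table-driven definition. The sketch below is one way to express that; it is not the PR's `L3Stubs()` implementation. The `Stub` and `Link` types, the `stubDashboards` name, the dashboard titles (borrowed from the Level 2 region names), and the `/d/central-internals` link URL are assumptions. Only the UIDs and the tag come from the commit message.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Link mirrors an entry in a Grafana dashboard's "links" array.
type Link struct {
	Title string `json:"title"`
	Type  string `json:"type"`
	URL   string `json:"url"`
}

// Stub is a hypothetical minimal dashboard spec (panels omitted).
type Stub struct {
	UID   string   `json:"uid"`
	Title string   `json:"title"`
	Tags  []string `json:"tags"`
	Links []Link   `json:"links"`
}

// regions lists the eight UIDs from the commit; titles are assumed.
var regions = []struct{ UID, Title string }{
	{"central-deployment-processing", "Deployment Processing"},
	{"central-detection-alerts", "Detection & Alerts"},
	{"central-risk-calculation", "Risk Calculation"},
	{"central-background-reprocessing", "Background Reprocessing"},
	{"central-pruning-gc", "Pruning & GC"},
	{"central-network-analysis", "Network Analysis"},
	{"central-report-generation", "Report Generation"},
	{"central-api-ui", "API & UI"},
}

// stubDashboards builds one stub per region, each tagged level-3 and
// linking back to the Level 2 dashboard.
func stubDashboards() []Stub {
	back := Link{Title: "Central Internals", Type: "link", URL: "/d/central-internals"}
	stubs := make([]Stub, 0, len(regions))
	for _, r := range regions {
		stubs = append(stubs, Stub{
			UID:   r.UID,
			Title: r.Title,
			Tags:  []string{"level-3"},
			Links: []Link{back},
		})
	}
	return stubs
}

func main() {
	out, err := json.MarshalIndent(stubDashboards()[0], "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

With the invariants set in one loop, a single test can check every stub for the tag and back-link rather than repeating assertions per dashboard.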


openshift-ci · 2026-03-25T14:46:20Z

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

openshift-ci · 2026-03-25T14:46:25Z

PR needs rebase.


ebensh and others added 10 commits March 25, 2026 15:05

WIP: brainstorm artifacts and migrator change (unrelated to dashboard work)

19e10c2

feat(monitoring): register new dashboards in Helm chart

8b67b78

feat(monitoring): generate all Phase 0 dashboard JSON files (12 dashboards)

4db38dd

openshift-ci bot added the do-not-merge/work-in-progress label Mar 25, 2026

openshift-ci bot added the needs-rebase label Mar 25, 2026

ebensh wants to merge 10 commits into master from rox-33737-phase0-dashboards
