feat: scoped and global custom metric trackers by parametalol · Pull Request #18302 · stackrox/stackrox

parametalol · 2025-12-18T21:37:50Z

Description

Not all custom metrics need to be scoped (that is, backed by a separate registry for each scrape user ID). This PR adds support for global trackers and makes a couple of existing trackers global.

User-facing documentation

CHANGELOG.md is updated OR update is not needed
documentation PR is created and is linked above OR is not needed

Testing and quality

the change is production ready: the change is GA, or otherwise the functionality is gated by a feature flag
CI results are inspected

Automated testing

How I validated my change

CI. Manual.

Current dependencies on/for this PR:

master
- PR feat: scoped and global custom metric trackers #18302 👈
  - PR feat(idea): api_counter metric #18080
    - PR feat(ui): Prometheus metrics viewer #18285

openshift-ci · 2025-12-18T21:37:54Z

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

rhacs-bot · 2025-12-18T22:12:31Z

Images are ready for the commit at e6ac4e1.

To use with deploy scripts, first export MAIN_IMAGE_TAG=4.11.x-18-ge6ac4e168d.

codecov · 2025-12-18T22:18:01Z

Codecov Report

❌ Patch coverage is 84.44444% with 7 lines in your changes missing coverage. Please review.
✅ Project coverage is 48.94%. Comparing base (d20c8fc) to head (58a6dae).
⚠️ Report is 104 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| central/metrics/custom/runner.go | 73.33% | 3 Missing and 1 partial ⚠️ |
| central/metrics/custom/tracker/tracker_base.go | 85.00% | 2 Missing and 1 partial ⚠️ |

Additional details and impacted files

@@            Coverage Diff             @@
##           master   #18302      +/-   ##
==========================================
+ Coverage   48.92%   48.94%   +0.01%     
==========================================
  Files        2619     2622       +3     
  Lines      197514   197934     +420     
==========================================
+ Hits        96631    96869     +238     
- Misses      93504    93677     +173     
- Partials     7379     7388       +9

| Flag | Coverage Δ | |
|---|---|---|
| go-unit-tests | `48.94% <84.44%> (+0.01%)` | ⬆️ |


parametalol · 2025-12-23T14:36:44Z

central/metrics/custom_registry.go


 var (
 	userRegistries map[string]*customRegistry = make(map[string]*customRegistry)
+	globalRegistry *customRegistry


Note to reviewers:

I hesitated between a dedicated globalRegistry and simply keeping an entry in the userRegistries map under the globalScopeID key (an empty string).

Claude suggested keeping it separate for clarity.
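
For readers weighing the same choice, the dedicated-variable option can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `customRegistry` fields, the mutex, and the `getUserRegistry` / `getGlobalRegistry` helpers are hypothetical names introduced here.

```go
package main

import (
	"fmt"
	"sync"
)

// customRegistry is a stand-in for the real registry type; the scope field
// is hypothetical and exists only to make the example observable.
type customRegistry struct {
	scope string
}

var (
	registriesMux  sync.Mutex
	userRegistries = make(map[string]*customRegistry)
	globalRegistry *customRegistry
)

// getUserRegistry returns the registry dedicated to userID, creating it on
// first use.
func getUserRegistry(userID string) *customRegistry {
	registriesMux.Lock()
	defer registriesMux.Unlock()
	r, ok := userRegistries[userID]
	if !ok {
		r = &customRegistry{scope: "user:" + userID}
		userRegistries[userID] = r
	}
	return r
}

// getGlobalRegistry returns the single registry shared by all scrapers,
// creating it on first use.
func getGlobalRegistry() *customRegistry {
	registriesMux.Lock()
	defer registriesMux.Unlock()
	if globalRegistry == nil {
		globalRegistry = &customRegistry{scope: "global"}
	}
	return globalRegistry
}

func main() {
	fmt.Println(getUserRegistry("alice").scope)
	fmt.Println(getGlobalRegistry().scope)
	// An empty user ID still maps to an ordinary user registry here,
	// because the global registry does not occupy a key in the map.
	fmt.Println(getUserRegistry("") != getGlobalRegistry())
}
```

With a separate variable, no key in userRegistries has to be reserved as a sentinel, at the cost of a second lookup path.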

sourcery-ai

Hey - I've found 1 issue, and left some high level feedback:

- The new tests in `Test_scope` compare full metric outputs as raw strings, which can be brittle due to potential non-deterministic ordering from Prometheus; consider parsing the metrics or comparing via structured data (e.g., decoded metric families) instead of exact string equality.
- In `trackerRunner.ServeHTTP`, you construct a new `promhttp.HandlerFor` on every request; you could improve efficiency by creating a reusable handler (or at least a reusable `prometheus.Gatherers` wrapper) and only varying the underlying registries, as those are already long-lived.

Prompt for AI Agents

Please address the comments from this code review:

## Overall Comments
- The new tests in `Test_scope` compare full metric outputs as raw strings, which can be brittle due to potential non-deterministic ordering from Prometheus; consider parsing the metrics or comparing via structured data (e.g., decoded metric families) instead of exact string equality.
- In `trackerRunner.ServeHTTP`, you construct a new `promhttp.HandlerFor` on every request; you could improve efficiency by creating a reusable handler (or at least a reusable `prometheus.Gatherers` wrapper) and only varying the underlying registries, as those are already long-lived.

## Individual Comments

### Comment 1
<location> `central/metrics/custom/tracker/tracker_base.go:115` </location>
<code_context>
+func makeTrackerBase[F Finding](metricPrefix, description string, scoped bool,
+	getters LazyLabelGetters[F], generator FindingGenerator[F],
+) *TrackerBase[F] {
+	registryFactory := globalRegistryFactory
+	if scoped {
+		registryFactory = metrics.GetCustomRegistry
</code_context>

<issue_to_address>
**issue (complexity):** Consider replacing the scoped flag and magic global ID with an explicit scope ID provider strategy injected via constructors to clarify tracker behavior and simplify Gather.

You can keep the new functionality while making the semantics more explicit and removing the boolean + magic ID coupling by turning “scope” into an explicit strategy on the tracker.

### 1. Replace `scoped` + `globalScopeID` with an explicit scope ID provider

Add a function field that defines how to get the scope ID, instead of storing a boolean and using `""` as a magic value:

```go
type TrackerBase[F Finding] struct {
    metricPrefix string
    description  string
    getters      LazyLabelGetters[F]
    generator    FindingGenerator[F]

    config           *Configuration
    metricsConfigMux sync.RWMutex

    gatherers sync.Map
    cleanupWG sync.WaitGroup

    registryFactory func(userID string) (metrics.CustomRegistry, error)
    scopeIDProvider func(ctx context.Context) (string, bool) // <— new
}
```

Then `Gather` becomes simpler and self-describing:

```go
func (tracker *TrackerBase[Finding]) Gather(ctx context.Context) {
    id, ok := tracker.scopeIDProvider(ctx)
    if !ok {
        return
    }

    cfg := tracker.getConfiguration()
    if cfg == nil {
        return
    }

    gatherer := tracker.getGatherer(id, cfg)
    if gatherer == nil {
        return
    }
    defer tracker.cleanupInactiveGatherers()
    defer gatherer.running.Store(false)

    if cfg.period == 0 || time.Since(gatherer.lastGather) < cfg.period {
        return
    }

    // ...
}
```

This removes `scoped` and the need for `globalScopeID` in `Gather`, and makes the “auth required or not” decision explicit via the `scopeIDProvider`.

### 2. Pass behavior directly from constructors instead of via `scoped` flag

You can still share the common constructor logic, but without a boolean that needs to be re-interpreted later:

```go
func MakeTrackerBase[F Finding](metricPrefix, description string,
    getters LazyLabelGetters[F], generator FindingGenerator[F],
) *TrackerBase[F] {
    return makeTrackerBase(
        metricPrefix,
        description,
        metrics.GetCustomRegistry,
        func(ctx context.Context) (string, bool) {
            userID, err := authn.IdentityFromContext(ctx)
            if err != nil {
                return "", false
            }
            return userID.UID(), true
        },
        getters,
        generator,
    )
}

func MakeGlobalTrackerBase[F Finding](metricPrefix, description string,
    getters LazyLabelGetters[F], generator FindingGenerator[F],
) *TrackerBase[F] {
    return makeTrackerBase(
        metricPrefix,
        description,
        globalRegistryFactory,
        func(ctx context.Context) (string, bool) {
            return "global", true // or keep "" if you prefer, but it's now encapsulated
        },
        getters,
        generator,
    )
}

func makeTrackerBase[F Finding](
    metricPrefix, description string,
    registryFactory func(string) (metrics.CustomRegistry, error),
    scopeIDProvider func(context.Context) (string, bool),
    getters LazyLabelGetters[F],
    generator FindingGenerator[F],
) *TrackerBase[F] {
    return &TrackerBase[F]{
        metricPrefix:    metricPrefix,
        description:     description,
        getters:         getters,
        generator:       generator,
        registryFactory: registryFactory,
        scopeIDProvider: scopeIDProvider,
    }
}
```

This keeps:

- Both public constructors (`MakeTrackerBase` and `MakeGlobalTrackerBase`)
- The global vs. scoped registry behavior
- The global vs. user-scoped `Gather` behavior

But removes:

- The `scoped` field
- The `globalScopeID` sentinel from control flow
- The boolean → factory selection → later reinterpretation chain

The resulting code makes the variant behavior explicit and localized to construction, while `Gather` just uses the injected strategies.
</issue_to_address>


parametalol · 2026-01-13T10:39:37Z

> The new tests in Test_scope compare full metric outputs as raw strings, which can be brittle due to potential non-deterministic ordering from Prometheus; consider parsing the metrics or comparing via structured data (e.g., decoded metric families) instead of exact string equality.

Comparing strings is more visual, and I didn't see any variations in Prometheus output in the tests.
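
If ordering ever did become a concern, a middle ground that keeps the expected output readable as text would be to normalize both strings before comparing them. The sketch below is not what the PR does; `normalizeExposition` and `sameMetrics` are hypothetical helpers, and the metric names in `main` are made up for illustration.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// normalizeExposition splits text-format metrics output into trimmed,
// non-empty lines and sorts them, so that the result no longer depends on
// the order in which lines were emitted.
func normalizeExposition(text string) string {
	var lines []string
	for _, line := range strings.Split(text, "\n") {
		if line = strings.TrimSpace(line); line != "" {
			lines = append(lines, line)
		}
	}
	sort.Strings(lines)
	return strings.Join(lines, "\n")
}

// sameMetrics reports whether two outputs consist of the same lines,
// ignoring their order and surrounding whitespace.
func sameMetrics(a, b string) bool {
	return normalizeExposition(a) == normalizeExposition(b)
}

func main() {
	// Hypothetical metric names, used only for this example.
	expected := "example_total{cluster=\"a\"} 1\nexample_total{cluster=\"b\"} 2\n"
	actual := "example_total{cluster=\"b\"} 2\nexample_total{cluster=\"a\"} 1\n"
	fmt.Println(sameMetrics(expected, actual))
}
```

Sorting lines detaches `# HELP` and `# TYPE` comments from their samples, so this only works as an equality check, not as a way to produce valid exposition output.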

> In trackerRunner.ServeHTTP, you construct a new promhttp.HandlerFor on every request; you could improve efficiency by creating a reusable handler (or at least a reusable prometheus.Gatherers wrapper) and only varying the underlying registries, as those are already long-lived.

A cache of handlers keyed by user ID could indeed be introduced, but that looks like overkill.

stehessel · 2026-02-03T13:17:40Z

central/metrics/custom/tracker/tracker_base.go

+	return makeTrackerBase(metricPrefix, description, true, getters, generator)
+}
+
+// MakeGlobalTrackerBase creates a global, i.e. non-scoped tracker.


Could you please add a comment here that explains why global trackers exist (more or less the first sentence of the PR description)?

openshift-ci bot added the do-not-merge/work-in-progress label Dec 18, 2025

github-actions bot added the area/central label Dec 18, 2025

parametalol commented Dec 23, 2025

parametalol marked this pull request as ready for review December 23, 2025 14:37

openshift-ci bot removed the do-not-merge/work-in-progress label Dec 23, 2025

parametalol requested a review from stehessel January 13, 2026 09:25

parametalol added the ai-review and ai-assisted labels Jan 13, 2026

sourcery-ai bot reviewed Jan 13, 2026

parametalol added 4 commits February 2, 2026 17:20

scoped vs global trackers

8f77091

nits

d0c056e

scoped and global access tests

95b6404

style

e6ac4e1

parametalol force-pushed the michael/scoped-and-global-trackers branch from 58a6dae to e6ac4e1 Compare February 2, 2026 16:47

stehessel approved these changes Feb 3, 2026

parametalol wants to merge 4 commits into master from michael/scoped-and-global-trackers
