Skip to content

ROX-33256: Allow caller to provide requested token audience#19304

Draft
rhybrillou wants to merge 3 commits intomasterfrom
master-yann/ROX-33256/internal-tokens-with-custom-audience
Draft

ROX-33256: Allow caller to provide requested token audience#19304
rhybrillou wants to merge 3 commits intomasterfrom
master-yann/ROX-33256/internal-tokens-with-custom-audience

Conversation

@rhybrillou
Copy link
Contributor

@rhybrillou rhybrillou commented Mar 5, 2026

Description

This lets the caller of the internal token generation service (currently sensor for the OCP plugin) specify the audience it wants for the generated token, in order to better filter audit log events.
If no audience is provided, the service defaults to the https://stackrox.io/jwt-sources#internal-rox-tokens value.

User-facing documentation

Testing and quality

  • the change is production ready: the change is GA, or otherwise the functionality is gated by a feature flag
  • CI results are inspected

Automated testing

  • added unit tests

How I validated my change

Manual run of unit tests

Manual test:

  • deploy a demo cluster with central using the image built for the pull request
  • use a fake sensor client to request a token with a token audience
    Note: the policy validation had to be deactivated in order to get such a token
  • inspect the token content
{
  ...
  "aud": "Custom token audience",
  ...
}
  • use a fake sensor client to request a token without token audience
  • inspect the token content
{
  ...
  "aud": "https://stackrox.io/jwt-sources#internal-rox-tokens",
  ...
}
  • test both issued tokens to retrieve data, and ensure the returned data is limited to the requested scope
    Tested on an OpenShift Demo cluster, scoping to backend and medical. The expected 7 deployments were returned for both tokens.

@openshift-ci
Copy link

openshift-ci bot commented Mar 5, 2026

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've found 5 issues, and left some high level feedback:

  • The serviceImpl struct still has an issuer field and some tests still pass an issuer into createService, but token issuance now goes through issuerManager; consider removing the unused field/parameter to reduce confusion and keep the API minimal.
  • The TestSomething test in issuer_manager_test.go is a no-op sanity check and doesn’t exercise any behavior; consider removing it or replacing it with a focused test that validates a specific aspect of issuerManager.
  • In issuerManager.Start, repeated calls will overwrite m.ticker without resetting the stopper or guarding against multiple goroutines; if Start can be called more than once, add protection (e.g., a sync.Once or an internal started flag) to avoid leaks or inconsistent state.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- The `serviceImpl` struct still has an `issuer` field and some tests still pass an `issuer` into `createService`, but token issuance now goes through `issuerManager`; consider removing the unused field/parameter to reduce confusion and keep the API minimal.
- The `TestSomething` test in `issuer_manager_test.go` is a no-op sanity check and doesn’t exercise any behavior; consider removing it or replacing it with a focused test that validates a specific aspect of `issuerManager`.
- In `issuerManager.Start`, repeated calls will overwrite `m.ticker` without resetting the stopper or guarding against multiple goroutines; if `Start` can be called more than once, add protection (e.g., a `sync.Once` or an internal started flag) to avoid leaks or inconsistent state.

## Individual Comments

### Comment 1
<location path="central/auth/internaltokens/service/service_impl_test.go" line_range="192-193" />
<code_context>
 		mockRoleStore := roleDataStoreMocks.NewMockDataStore(mockCtrl)
 		mockIssuer := tokensMocks.NewMockIssuer(mockCtrl)
-		svc := createService(it, mockIssuer, mockClusterStore, mockRoleStore, permissivePolicy)
+		mockIssuerFactory := tokensMocks.NewMockIssuerFactory(mockCtrl)
+		mockIssuerFactory.EXPECT().CreateIssuer(gomock.Any(), gomock.Any()).Times(1).Return(mockIssuer, nil)
+		svc := createService(it, mockIssuer, mockIssuerFactory, mockClusterStore, mockRoleStore, permissivePolicy)
+		defer svc.issuerManager.Stop()
</code_context>
<issue_to_address>
**suggestion (testing):** Add explicit tests for TargetAudience behavior (default and custom) in token generation

The updated GenerateTokenForPermissionsAndScope now uses req.TargetAudience to choose an issuer (defaulting to internalTokenId), but the tests still use gomock.Any() for the Source argument, so this behavior isn’t asserted.

Please add at least two subtests that verify the Source passed to CreateIssuer:
1) When TargetAudience is empty, Source.ID() == internalTokenId.
2) When TargetAudience is non-empty, Source.ID() == that audience.

You can implement this with a custom gomock.Matcher that inspects tokens.Source.ID(), so the audience-selection behavior is validated at the service level, not only in issuer_manager tests.
</issue_to_address>

### Comment 2
<location path="central/auth/internaltokens/service/issuer_manager_test.go" line_range="21-22" />
<code_context>
+	fakeAudience = "fake audience"
+)
+
+func TestSomething(t *testing.T) {
+	assert.True(t, true)
+}
+
</code_context>
<issue_to_address>
**nitpick (testing):** Remove or replace the trivial TestSomething with a meaningful test

This test only asserts that `true` is `true` and doesn’t exercise `issuerManager` or any related behavior, so it adds noise without meaningful coverage.

Please either delete this test or replace it with one that verifies a real behavior not already covered (e.g., handling multiple Start/Stop calls or behavior when `purgeInterval` is zero/very small).
</issue_to_address>

### Comment 3
<location path="central/auth/internaltokens/service/issuer_manager_test.go" line_range="49-62" />
<code_context>
+}
+
+func TestAddIssuerToCache(t *testing.T) {
+	t.Run("error from issuer factory is propagated", func(it *testing.T) {
+		mockCtrl := gomock.NewController(it)
+		mockIssuerFactory := tokenMocks.NewMockIssuerFactory(mockCtrl)
+		mgr := newIssuerManager(mockIssuerFactory, testPurgeDelay)
+		mockIssuerFactory.EXPECT().
+			CreateIssuer(gomock.Any(), gomock.Any()).
+			Times(1).
+			Return(nil, errDummy)
+		issuer, err := mgr.addIssuerToCache(testAudience, testTokenExpiry)
+		assert.Nil(it, issuer)
+		assert.ErrorIs(it, err, errDummy)
+	})
+	t.Run("cache miss creates a new issuer and stores it in cache", func(it *testing.T) {
</code_context>
<issue_to_address>
**suggestion (testing):** Extend error-path test to assert cache state remains unchanged when issuer creation fails

This subtest checks that addIssuerToCache returns the expected error, but not that the cache remains unchanged on failure. Since addIssuerToCache mutates the cache, it’s important to assert that no entries are written when CreateIssuer fails.

After the call to addIssuerToCache, add an assertion that the cache is unchanged, e.g.:

```go
assert.Empty(it, mgr.cache)
```

(or equivalently assert that the testAudience key is absent). This strengthens the error-path contract and guards against future regressions where the cache might be modified before returning an error.

```suggestion
func TestAddIssuerToCache(t *testing.T) {
	t.Run("error from issuer factory is propagated", func(it *testing.T) {
		mockCtrl := gomock.NewController(it)
		mockIssuerFactory := tokenMocks.NewMockIssuerFactory(mockCtrl)
		mgr := newIssuerManager(mockIssuerFactory, testPurgeDelay)
		mockIssuerFactory.EXPECT().
			CreateIssuer(gomock.Any(), gomock.Any()).
			Times(1).
			Return(nil, errDummy)
		issuer, err := mgr.addIssuerToCache(testAudience, testTokenExpiry)
		assert.Nil(it, issuer)
		assert.ErrorIs(it, err, errDummy)
		assert.Empty(it, mgr.cache)
	})
	t.Run("cache miss creates a new issuer and stores it in cache", func(it *testing.T) {
```
</issue_to_address>

### Comment 4
<location path="central/auth/internaltokens/service/issuer_manager_test.go" line_range="144-146" />
<code_context>
+		assert.Len(it, mgr.cache, 1)
+		validateCacheEntry(it, mgr, testAudience, true, mockIssuer, notExpired)
+	})
+	t.Run("Purge on a cache with multiple expired items removes the expired items", func(it *testing.T) {
+		mockCtrl := gomock.NewController(it)
+		mockIssuerFactory := tokenMocks.NewMockIssuerFactory(mockCtrl)
+		mockIssuer := tokenMocks.NewMockIssuer(mockCtrl)
+		mgr := newIssuerManager(mockIssuerFactory, testPurgeDelay)
+		old := testClock().Add(-1 * time.Minute)
+		veryOld := testClock().Add(-5 * time.Minute)
+		addIssuerToManager(mgr, fakeAudience, nil, mockIssuer, old)
+		addIssuerToManager(mgr, testAudience, nil, mockIssuer, veryOld)
+		assert.Len(it, mgr.cache, 2)
+		validateCacheEntry(it, mgr, fakeAudience, true, mockIssuer, old)
+		validateCacheEntry(it, mgr, testAudience, true, mockIssuer, veryOld)
+		mockIssuerFactory.EXPECT().UnregisterSource(gomock.Any()).Times(2).Return(nil)
+
+		mgr.purgeExpired(testClockTime)
</code_context>
<issue_to_address>
**suggestion (testing):** Add a test for purgeExpired behavior when UnregisterSource returns an error

Current tests only exercise the successful `UnregisterSource` path. Please add a subtest that configures the mock `IssuerFactory` to return a non-nil error from `UnregisterSource`, calls `purgeExpired`, and asserts that the expired entries are still removed from `mgr.cache`. This verifies that cleanup continues even when unregistering a source fails, matching the current implementation behavior.

```suggestion
		assert.Empty(it, mgr.cache)
	})
	t.Run("Purge on a cache with expired items continues cleanup when UnregisterSource fails", func(it *testing.T) {
		mockCtrl := gomock.NewController(it)
		mockIssuerFactory := tokenMocks.NewMockIssuerFactory(mockCtrl)
		mockIssuer := tokenMocks.NewMockIssuer(mockCtrl)
		mgr := newIssuerManager(mockIssuerFactory, testPurgeDelay)
		old := testClock().Add(-1 * time.Minute)
		veryOld := testClock().Add(-5 * time.Minute)
		addIssuerToManager(mgr, fakeAudience, nil, mockIssuer, old)
		addIssuerToManager(mgr, testAudience, nil, mockIssuer, veryOld)
		assert.Len(it, mgr.cache, 2)
		validateCacheEntry(it, mgr, fakeAudience, true, mockIssuer, old)
		validateCacheEntry(it, mgr, testAudience, true, mockIssuer, veryOld)
		mockIssuerFactory.EXPECT().UnregisterSource(gomock.Any()).Times(2).Return(assert.AnError)

		mgr.purgeExpired(testClockTime)
		assert.Empty(it, mgr.cache)
	})
	t.Run("Purge on a cache with a single non-expired item does not touch anything", func(it *testing.T) {
```
</issue_to_address>

### Comment 5
<location path="central/auth/internaltokens/service/issuer_manager.go" line_range="73" />
<code_context>
+	_ = m.stopper.Client().Stopped().Wait()
+}
+
+func (m *issuerManager) getIssuer(audience string, expiresAt time.Time) (tokens.Issuer, error) {
+	issuer, found := m.getIssuerFromCache(audience, expiresAt)
+	if found {
</code_context>
<issue_to_address>
**issue (complexity):** Consider simplifying the cache access, ticker lifecycle, and purge logic to avoid duplicated paths and unnecessary state while keeping behavior unchanged.

You can reduce complexity without changing behavior with a couple of small, focused refactors.

### 1. Simplify `getIssuer` cache access and locking

You don’t need the double-checked pattern via `getIssuerFromCache``addIssuerToCache`. Holding the lock once and doing a single lookup/update-or-create makes the flow easier to reason about and removes duplicated logic for `expiresAt`.

```go
func (m *issuerManager) getIssuer(audience string, expiresAt time.Time) (tokens.Issuer, error) {
    m.cacheMutex.Lock()
    defer m.cacheMutex.Unlock()

    // Single lookup
    if iw, found := m.cache[audience]; found {
        if expiresAt.After(iw.expiresAt) {
            iw.expiresAt = expiresAt
        }
        return iw.issuer, nil
    }

    // Creation path stays the same, just moved here
    name := fmt.Sprintf("internal token source for %s", audience)
    issuerSource := tokenbased.NewTokenAuthProvider(
        audience,
        name,
        internalToken,
        tokenbased.WithRevocationLayer(tokens.NewRevocationLayer()),
    )

    issuer, err := m.factory.CreateIssuer(issuerSource)
    if err != nil {
        return nil, errors.Wrapf(err, "creating issuer for audience %s", audience)
    }

    m.cache[audience] = &issuerWrapper{
        source:    issuerSource,
        issuer:    issuer,
        expiresAt: expiresAt,
    }

    return issuer, nil
}
```

You can then drop `getIssuerFromCache` and `addIssuerToCache` entirely. Behavior is the same: first caller creates, later callers reuse and extend `expiresAt` if needed.

---

### 2. Simplify ticker lifecycle and shared state

You don’t really need to store `ticker` on the struct or mutate it in multiple places. Creating the ticker in `Start`, passing it into `purge`, and stopping it via `defer` in the goroutine removes the `nil` checks and shared mutable ticker state while keeping the `Stopper` usage intact.

```go
func (m *issuerManager) Start() {
    ticker := time.NewTicker(m.purgeInterval)
    go m.purge(ticker)
}

func (m *issuerManager) purge(ticker *time.Ticker) {
    defer m.stopper.Flow().ReportStopped()
    defer ticker.Stop()

    for {
        select {
        case now := <-ticker.C:
            m.purgeExpired(now)
        case <-m.stopper.Flow().StopRequested():
            return
        }
    }
}
```

With this, you can remove the `ticker *time.Ticker` field and the `if m.ticker == nil` guard; `Stop` remains the same.

---

### 3. Simplify `purgeExpired` (single pass, no slice)

You can delete from a map during iteration in Go. The two-phase collect-and-delete pattern adds allocations and complexity but isn’t necessary here.

```go
func (m *issuerManager) purgeExpired(now time.Time) {
    m.cacheMutex.Lock()
    defer m.cacheMutex.Unlock()

    for audience, wrapper := range m.cache {
        if wrapper.expiresAt.Before(now) {
            if err := m.factory.UnregisterSource(wrapper.source); err != nil {
                log.Errorf("Failed to unregister source for audience %s: %v", audience, err)
            }
            delete(m.cache, audience)
        }
    }
}
```

Same behavior, fewer moving parts and no intermediate slice.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Comment on lines +192 to +193
mockIssuerFactory := tokensMocks.NewMockIssuerFactory(mockCtrl)
mockIssuerFactory.EXPECT().CreateIssuer(gomock.Any(), gomock.Any()).Times(1).Return(mockIssuer, nil)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (testing): Add explicit tests for TargetAudience behavior (default and custom) in token generation

The updated GenerateTokenForPermissionsAndScope now uses req.TargetAudience to choose an issuer (defaulting to internalTokenId), but the tests still use gomock.Any() for the Source argument, so this behavior isn’t asserted.

Please add at least two subtests that verify the Source passed to CreateIssuer:

  1. When TargetAudience is empty, Source.ID() == internalTokenId.
  2. When TargetAudience is non-empty, Source.ID() == that audience.

You can implement this with a custom gomock.Matcher that inspects tokens.Source.ID(), so the audience-selection behavior is validated at the service level, not only in issuer_manager tests.

Comment on lines +144 to +146
assert.Empty(it, mgr.cache)
})
t.Run("Purge on a cache with a single non-expired item does not touch anything", func(it *testing.T) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (testing): Add a test for purgeExpired behavior when UnregisterSource returns an error

Current tests only exercise the successful UnregisterSource path. Please add a subtest that configures the mock IssuerFactory to return a non-nil error from UnregisterSource, calls purgeExpired, and asserts that the expired entries are still removed from mgr.cache. This verifies that cleanup continues even when unregistering a source fails, matching the current implementation behavior.

Suggested change
assert.Empty(it, mgr.cache)
})
t.Run("Purge on a cache with a single non-expired item does not touch anything", func(it *testing.T) {
assert.Empty(it, mgr.cache)
})
t.Run("Purge on a cache with expired items continues cleanup when UnregisterSource fails", func(it *testing.T) {
mockCtrl := gomock.NewController(it)
mockIssuerFactory := tokenMocks.NewMockIssuerFactory(mockCtrl)
mockIssuer := tokenMocks.NewMockIssuer(mockCtrl)
mgr := newIssuerManager(mockIssuerFactory, testPurgeDelay)
old := testClock().Add(-1 * time.Minute)
veryOld := testClock().Add(-5 * time.Minute)
addIssuerToManager(mgr, fakeAudience, nil, mockIssuer, old)
addIssuerToManager(mgr, testAudience, nil, mockIssuer, veryOld)
assert.Len(it, mgr.cache, 2)
validateCacheEntry(it, mgr, fakeAudience, true, mockIssuer, old)
validateCacheEntry(it, mgr, testAudience, true, mockIssuer, veryOld)
mockIssuerFactory.EXPECT().UnregisterSource(gomock.Any()).Times(2).Return(assert.AnError)
mgr.purgeExpired(testClockTime)
assert.Empty(it, mgr.cache)
})
t.Run("Purge on a cache with a single non-expired item does not touch anything", func(it *testing.T) {

@rhybrillou rhybrillou force-pushed the master-yann/ROX-33256/internal-tokens-with-custom-audience branch from 2c41c78 to b8ade10 Compare March 5, 2026 10:37
@rhacs-bot
Copy link
Contributor

rhacs-bot commented Mar 5, 2026

Images are ready for the commit at 331116d.

To use with deploy scripts, first export MAIN_IMAGE_TAG=4.11.x-247-g331116deb1.

@codecov
Copy link

codecov bot commented Mar 6, 2026

Codecov Report

❌ Patch coverage is 87.50000% with 10 lines in your changes missing coverage. Please review.
✅ Project coverage is 49.65%. Comparing base (092168e) to head (133aced).
⚠️ Report is 14 commits behind head on master.

Files with missing lines Patch % Lines
...entral/auth/internaltokens/service/service_impl.go 60.00% 3 Missing and 1 partial ⚠️
...tral/auth/internaltokens/service/issuer_manager.go 95.52% 2 Missing and 1 partial ⚠️
central/auth/internaltokens/service/singleton.go 0.00% 3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #19304      +/-   ##
==========================================
+ Coverage   49.62%   49.65%   +0.02%     
==========================================
  Files        2680     2690      +10     
  Lines      202231   202572     +341     
==========================================
+ Hits       100366   100594     +228     
- Misses      94379    94480     +101     
- Partials     7486     7498      +12     
Flag Coverage Δ
go-unit-tests 49.65% <87.50%> (+0.02%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants