ROX-33816: support soft-deletes for deployments #19621

Draft
stehessel wants to merge 22 commits into master from deployment-tombstones

Conversation

@stehessel
Collaborator

Description

change me!

User-facing documentation

Testing and quality

  • the change is production ready: the change is GA, or otherwise the functionality is gated by a feature flag
  • CI results are inspected

Automated testing

  • added unit tests
  • added e2e tests
  • added regression tests
  • added compatibility tests
  • modified existing tests

How I validated my change

change me!

stehessel and others added 15 commits March 25, 2026 22:17
Add storage.Tombstone message, extend Deployment with tombstone field,
add TOMBSTONED to ViolationState enum, add TombstoneRetentionConfig to
PrivateConfig for configurable retention duration.

User request: implement deployment tombstone feature for compliance audit.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…tion constant

Add ROX_DEPLOYMENT_TOMBSTONES feature flag (disabled by default),
TombstoneDeletedAt and TombstoneExpiresAt search field labels, and
DefaultTombstoneRetentionDays=30 constant for the soft-delete feature.

User request: implement deployment tombstone feature for compliance audit.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Fix fragment comment in tombstone.proto, capitalization in deployment.proto,
and add missing next-available-tag comment to TombstoneRetentionConfig.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…g, and constant comment

Make feature description match verbosity of peers, add section comment
before tombstone search field labels, and simplify retention constant comment.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Add tombstone_deletedat and tombstone_expiresat nullable timestamp columns
to the deployments table. NULL values indicate active deployments.
The btree index on tombstone_expiresat supports efficient pruner queries.

User request: implement deployment tombstone feature for compliance audit.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Add AlertAndNotifyTombstoned() to AlertManager interface and implementation.
Add DeploymentTombstoned() to lifecycle Manager interface and implementation.
TOMBSTONED alerts are not notified; they remain queryable until TTL expiry.

User request: implement deployment tombstone feature for compliance audit.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
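
The alert transition described in the commit above can be sketched in isolation. This is a minimal illustration, not the PR's code: the `Alert` struct, the `markTombstoned` helper, and the Go constant names are hypothetical stand-ins, and only the state names (ACTIVE, ATTEMPTED, RESOLVED, TOMBSTONED) come from the commit messages. The helper takes no notifier on purpose, mirroring the statement that TOMBSTONED alerts are not notified.

```go
package main

import "fmt"

// ViolationState is a stand-in for the ViolationState enum named in the PR.
type ViolationState int

const (
	Active ViolationState = iota
	Attempted
	Resolved
	Tombstoned
)

func (s ViolationState) String() string {
	switch s {
	case Active:
		return "ACTIVE"
	case Attempted:
		return "ATTEMPTED"
	case Resolved:
		return "RESOLVED"
	case Tombstoned:
		return "TOMBSTONED"
	default:
		return "UNKNOWN"
	}
}

// Alert is a hypothetical, trimmed-down alert record.
type Alert struct {
	ID           string
	DeploymentID string
	State        ViolationState
}

// markTombstoned moves the ACTIVE and ATTEMPTED alerts of one deployment to
// TOMBSTONED and returns how many alerts changed. Alerts in any other state,
// and alerts of other deployments, are left untouched.
func markTombstoned(alerts []Alert, deploymentID string) int {
	changed := 0
	for i := range alerts {
		a := &alerts[i]
		if a.DeploymentID != deploymentID {
			continue
		}
		if a.State == Active || a.State == Attempted {
			a.State = Tombstoned
			changed++
		}
	}
	return changed
}

func main() {
	alerts := []Alert{
		{ID: "a1", DeploymentID: "dep-1", State: Active},
		{ID: "a2", DeploymentID: "dep-1", State: Attempted},
		{ID: "a3", DeploymentID: "dep-1", State: Resolved},
		{ID: "a4", DeploymentID: "dep-2", State: Active},
	}
	fmt.Println("transitioned:", markTombstoned(alerts, "dep-1"))
	for _, a := range alerts {
		fmt.Println(a.ID, a.State)
	}
}
```

Filtering on state before the transition is what keeps already RESOLVED alerts out of the TOMBSTONED set, which is the property the later test commit asserts at the query layer.
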
Implements soft-deletion via tombstone field upsert. Adds excludeTombstonedFilter()
applied to all user-facing query paths (Search, Count, SearchListDeployments, etc.).
GetDeploymentIDs() bypasses the filter for orphan detection.
Adds SearchTombstonedDeployments() for the pruner.
Adds resurrection logic in upsertDeployment() to clear stale tombstones.
Extracts removeOperationalData() helper to deduplicate cleanup logic.

Also adds stub generated types (storage.Tombstone + Deployment.Tombstone field)
for compilation until make proto-generated-srcs can be run with protoc toolchain.

User request: implement deployment tombstone feature for compliance audit.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
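
To make the datastore behaviour above concrete, here is a small in-memory sketch of soft-deletion, the user-facing filter, the unfiltered ID listing, and resurrection on upsert. It is an assumption-laden illustration rather than the PR's implementation: the `store` type, its map backing, the `Tombstone` field names, and the `epoch`/`day` helpers are all hypothetical; only the method names `TombstoneDeployment`, `Search`, and `GetDeploymentIDs` echo identifiers from the commit message.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// Tombstone loosely mirrors the idea of storage.Tombstone; field names are assumptions.
type Tombstone struct {
	DeletedAt time.Time
	ExpiresAt time.Time
}

// Deployment is a trimmed-down record. A nil Tombstone means the deployment is active.
type Deployment struct {
	ID        string
	Tombstone *Tombstone
}

// store is a hypothetical in-memory stand-in for the deployment datastore.
type store struct {
	byID map[string]*Deployment
}

func newStore() *store { return &store{byID: make(map[string]*Deployment)} }

// Upsert stores a deployment reported as live. Any stale tombstone is cleared,
// which is the "resurrection" case for a deployment that reappears.
func (s *store) Upsert(d *Deployment) {
	d.Tombstone = nil
	s.byID[d.ID] = d
}

// TombstoneDeployment soft-deletes by setting the tombstone instead of removing
// the record. The expiry is computed here, at soft-delete time.
func (s *store) TombstoneDeployment(id string, now time.Time, ttl time.Duration) bool {
	d, ok := s.byID[id]
	if !ok {
		return false
	}
	d.Tombstone = &Tombstone{DeletedAt: now, ExpiresAt: now.Add(ttl)}
	return true
}

// ids returns sorted IDs, optionally including tombstoned deployments.
func (s *store) ids(includeTombstoned bool) []string {
	out := []string{}
	for id, d := range s.byID {
		if includeTombstoned || d.Tombstone == nil {
			out = append(out, id)
		}
	}
	sort.Strings(out)
	return out
}

// Search is a user-facing path, so tombstoned deployments are filtered out.
func (s *store) Search() []string { return s.ids(false) }

// GetDeploymentIDs bypasses the filter so orphan detection still sees every record.
func (s *store) GetDeploymentIDs() []string { return s.ids(true) }

// epoch and day are illustrative helpers for the example only.
var epoch = time.Date(2026, time.March, 25, 0, 0, 0, 0, time.UTC)

const day = 24 * time.Hour

func main() {
	s := newStore()
	s.Upsert(&Deployment{ID: "web"})
	s.Upsert(&Deployment{ID: "db"})
	s.TombstoneDeployment("db", epoch, 30*day)
	fmt.Println("visible:", s.Search(), "all:", s.GetDeploymentIDs())

	s.Upsert(&Deployment{ID: "db"}) // the deployment reappears
	fmt.Println("visible after resurrection:", s.Search())
}
```

Keeping one filtered and one unfiltered listing is the design point: user-facing queries hide tombstoned records, while internal bookkeeping still needs to know they exist.
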
Pre-generation stub to unblock compilation before make proto-generated-srcs runs.
This will be replaced by the actual generated constant when proto generation runs.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…mments

Add read access control to SearchTombstonedDeployments, add comments
explaining cache ordering in TombstoneDeployment and the false parameter
in SearchRawAlerts.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…ection

Exposes soft-deletion timestamp in list responses for UI badge rendering
and "Show deleted" toggle support. Populated from the tombstone_deletedat
database column via the ListDeploymentView projection layer.

Since proto code generation has not been run yet, the new field and its
getter are added as a manual stub to the generated file, following the
same conventions as the existing Created timestamp field.

User request: implement deployment tombstone feature for compliance audit.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Queries expired tombstoned deployments (tombstone_expiresat < now()),
transitions their TOMBSTONED alerts to RESOLVED (without notification),
then hard-deletes the deployment record. Wired into pruneBasedOnConfig()
alongside other retention-based pruning.

The expiry timestamp is pre-computed at soft-delete time so no config
read is needed at prune time. Active deployments are excluded naturally
since NULL timestamps never satisfy the less-than comparison. Skips
entire deployment on any per-record error to avoid partial cleanup.

User request: implement deployment tombstone feature for compliance audit.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
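
The pruning rules in the commit above can be sketched as a pure function over records. This is a hedged illustration under stated assumptions: `Deployment`, `isExpired`, `pruneExpired`, the callback signatures, and the sample data are hypothetical and do not come from `central/pruning/pruning.go`. A nil pointer plays the role of the NULL column, so active deployments never match the expiry check.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Deployment is a trimmed-down record; a nil TombstoneExpiresAt models a NULL column.
type Deployment struct {
	ID                 string
	TombstoneExpiresAt *time.Time
}

// isExpired reports whether the tombstone expiry lies strictly before now.
// Active deployments (nil expiry) are never expired.
func isExpired(d Deployment, now time.Time) bool {
	return d.TombstoneExpiresAt != nil && d.TombstoneExpiresAt.Before(now)
}

// pruneExpired resolves alerts first and hard-deletes second. If either step
// fails, the deployment is skipped and left for a later pruning run.
// It returns the IDs that were fully pruned.
func pruneExpired(deps []Deployment, now time.Time, resolveAlerts, hardDelete func(id string) error) []string {
	var pruned []string
	for _, d := range deps {
		if !isExpired(d, now) {
			continue
		}
		if err := resolveAlerts(d.ID); err != nil {
			continue // do not delete a deployment whose alerts were not resolved
		}
		if err := hardDelete(d.ID); err != nil {
			continue
		}
		pruned = append(pruned, d.ID)
	}
	return pruned
}

// errAlertStore and the sample data below exist only for the example.
var errAlertStore = errors.New("alert store unavailable")

var sampleNow = time.Date(2026, time.April, 25, 0, 0, 0, 0, time.UTC)

func sampleDeployments(now time.Time) []Deployment {
	past := now.Add(-time.Hour)
	future := now.Add(time.Hour)
	return []Deployment{
		{ID: "active"},
		{ID: "expired", TombstoneExpiresAt: &past},
		{ID: "retained", TombstoneExpiresAt: &future},
		{ID: "flaky", TombstoneExpiresAt: &past},
	}
}

func main() {
	resolve := func(id string) error {
		if id == "flaky" {
			return errAlertStore
		}
		return nil
	}
	var deleted []string
	del := func(id string) error {
		deleted = append(deleted, id)
		return nil
	}
	fmt.Println("pruned:", pruneExpired(sampleDeployments(sampleNow), sampleNow, resolve, del))
	fmt.Println("hard-deleted:", deleted)
}
```

Because the expiry is stored on the record, the loop needs only the current time and never consults the retention config.
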
When DeploymentTombstones feature flag is enabled and TTL > 0,
runRemovePipeline calls TombstoneDeployment (soft-delete) instead of
RemoveDeployment (hard-delete). getTombstoneTTL reads the retention
duration from the system private config via configDatastore.Singleton(),
falling back to DefaultTombstoneRetentionDays when the config is absent.

Guards the DeploymentRemoved call in the alerts pipeline so it is
skipped when tombstoning is active; the deployment pipeline already
transitions alerts to TOMBSTONED state via DeploymentTombstoned(),
and calling DeploymentRemoved() would overwrite that with RESOLVED.

Also adds a pre-generation stub for TombstoneRetentionConfig and its
getter to generated/storage/config.pb.go to unblock compilation before
make proto-generated-srcs runs.

User request: implement deployment tombstone feature for compliance audit.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
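
As a rough sketch of the routing decision described above, the snippet below separates the TTL lookup from the choice between soft and hard delete. The names `tombstoneTTL`, `chooseRemoval`, `removalAction`, and the `RetentionDays` field are assumptions made for illustration; only `DefaultTombstoneRetentionDays` (30), `TombstoneRetentionConfig`, `TombstoneDeployment`, and `RemoveDeployment` are named in the PR, and the real `getTombstoneTTL` may be shaped differently.

```go
package main

import (
	"fmt"
	"time"
)

// DefaultTombstoneRetentionDays matches the constant value given in the PR.
const DefaultTombstoneRetentionDays = 30

// TombstoneRetentionConfig is a stand-in for the proto message of the same
// name; the RetentionDays field is a hypothetical field name.
type TombstoneRetentionConfig struct {
	RetentionDays int32
}

// tombstoneTTL returns the configured retention, or the default when the
// config is absent (nil).
func tombstoneTTL(cfg *TombstoneRetentionConfig) time.Duration {
	days := int32(DefaultTombstoneRetentionDays)
	if cfg != nil {
		days = cfg.RetentionDays
	}
	return time.Duration(days) * 24 * time.Hour
}

// removalAction names the datastore call the pipeline would make.
type removalAction string

const (
	actionTombstone removalAction = "TombstoneDeployment"
	actionRemove    removalAction = "RemoveDeployment"
)

// chooseRemoval soft-deletes only when the feature flag is on and the TTL is
// positive; every other combination falls back to a hard delete.
func chooseRemoval(flagEnabled bool, ttl time.Duration) removalAction {
	if flagEnabled && ttl > 0 {
		return actionTombstone
	}
	return actionRemove
}

func main() {
	fmt.Println(chooseRemoval(true, tombstoneTTL(nil)))
	fmt.Println(chooseRemoval(false, tombstoneTTL(nil)))
	fmt.Println(chooseRemoval(true, tombstoneTTL(&TombstoneRetentionConfig{RetentionDays: 0})))
}
```

The same predicate is what the alerts pipeline guard would need to agree with: when the result is the soft-delete path, calling `DeploymentRemoved()` afterwards would overwrite TOMBSTONED with RESOLVED.
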
- Add TombstoneDeletedAt to objects.ToListDeployment() projection
- Log warning when tombstone TTL config read fails in pipeline
- Add comment explaining alerts pipeline feature flag guard
- Add context timeout to tombstone pruner operations

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Tests cover:
- AlertAndNotifyTombstoned transitions ACTIVE/ATTEMPTED alerts to TOMBSTONED without notifications
- DeploymentTombstoned calls alert manager and cleans up observation queue
- runRemovePipeline routes to TombstoneDeployment when flag enabled and TTL > 0
- Alerts pipeline guard skips DeploymentRemoved when tombstoning is active

Refactored pipelineImpl to accept an injected configDatastore.DataStore field
instead of calling configDatastore.Singleton() directly inside getTombstoneTTL.
This avoids triggering a live database connection during unit tests.

User request: implement deployment tombstone feature for compliance audit.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
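
The refactor described above, injecting the config datastore instead of calling a singleton, might look roughly like the following. Everything here is a simplified assumption: `ConfigStore` is a hypothetical narrowed interface and not the real `configDatastore.DataStore`, `retentionDays` is an illustrative method, and falling back to the default on a read error is assumed for the sketch (the PR only states that a warning is logged in that case).

```go
package main

import (
	"errors"
	"fmt"
)

// defaultRetentionDays mirrors DefaultTombstoneRetentionDays=30 from the PR.
const defaultRetentionDays = 30

// ConfigStore is a hypothetical, narrowed view of the config datastore.
type ConfigStore interface {
	RetentionDays() (int32, error)
}

// pipeline receives its config store as a field, so tests can pass a fake
// instead of reaching for a process-wide singleton backed by a database.
type pipeline struct {
	config ConfigStore
}

// retentionDays reads the retention from the injected store. Falling back to
// the default on error is an assumption of this sketch.
func (p *pipeline) retentionDays() int32 {
	days, err := p.config.RetentionDays()
	if err != nil {
		return defaultRetentionDays
	}
	return days
}

// fakeConfig is a test double that needs no database connection.
type fakeConfig struct {
	days int32
	err  error
}

func (f fakeConfig) RetentionDays() (int32, error) { return f.days, f.err }

// errNoConfig is an illustrative error value for the example.
var errNoConfig = errors.New("config not found")

func main() {
	configured := &pipeline{config: fakeConfig{days: 7}}
	broken := &pipeline{config: fakeConfig{err: errNoConfig}}
	fmt.Println(configured.retentionDays(), broken.retentionDays())
}
```

With the dependency held as an interface field, a unit test constructs the pipeline with `fakeConfig` and never touches a live database.
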
…o verify state filter

Inspect the actual query to assert ACTIVE and ATTEMPTED state filters are present,
verifying that RESOLVED alerts would be excluded at the query layer rather than
just testing the no-op behavior on an empty result set.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
@openshift-ci

openshift-ci bot commented Mar 25, 2026

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@rhacs-bot
Contributor

rhacs-bot commented Mar 25, 2026

Images are ready for the commit at 3ccd001.

To use with deploy scripts, first export MAIN_IMAGE_TAG=4.11.x-460-g3ccd001321.

@codecov

codecov bot commented Mar 26, 2026

Codecov Report

❌ Patch coverage is 21.08844% with 232 lines in your changes missing coverage. Please review.
✅ Project coverage is 49.23%. Comparing base (8673b4b) to head (3ccd001).
⚠️ Report is 16 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| central/deployment/datastore/datastore_impl.go | 8.41% | 94 Missing and 4 partials ⚠️ |
| central/pruning/pruning.go | 0.00% | 61 Missing ⚠️ |
| central/graphql/resolvers/generated.go | 11.62% | 38 Missing ⚠️ |
| ...nsor/service/pipeline/deploymentevents/pipeline.go | 48.57% | 13 Missing and 5 partials ⚠️ |
| pkg/search/queryutils.go | 0.00% | 11 Missing ⚠️ |
| ...ntral/detection/alertmanager/alert_manager_impl.go | 86.36% | 2 Missing and 1 partial ⚠️ |
| central/sensor/service/pipeline/alerts/pipeline.go | 25.00% | 2 Missing and 1 partial ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #19621      +/-   ##
==========================================
- Coverage   49.28%   49.23%   -0.05%     
==========================================
  Files        2735     2736       +1     
  Lines      206215   206550     +335     
==========================================
+ Hits       101632   101700      +68     
- Misses      97041    97297     +256     
- Partials     7542     7553      +11     
| Flag | Coverage Δ |
|---|---|
| go-unit-tests | 49.23% <21.08%> (-0.05%) ⬇️ |


stehessel changed the title from "Deployment tombstones" to "ROX-33816: support soft-deletes for deployments" on Mar 26, 2026