Skip to content

ROX-31236: Remove ROX_PREVENT_SENSOR_RESTART_ON_DISCONNECT flag#17236

Merged
vikin91 merged 5 commits intomasterfrom
piotr/ROX-31236-remove-prevent-sensor-restart-flag
Oct 13, 2025
Merged

ROX-31236: Remove ROX_PREVENT_SENSOR_RESTART_ON_DISCONNECT flag#17236
vikin91 merged 5 commits intomasterfrom
piotr/ROX-31236-remove-prevent-sensor-restart-flag

Conversation

@vikin91
Copy link
Contributor

@vikin91 vikin91 commented Oct 10, 2025

Human here

This has been created fully by AI. I have checked the code changes and the PR description.
The description could have been more compact, but I think the goal of this PR is clear.

Description

This PR removes the ROX_PREVENT_SENSOR_RESTART_ON_DISCONNECT feature flag that has been enabled by default since Release 4.2 (December 2023).

📋 Detailed Implementation Plan: REMOVAL_PLAN_PREVENT_SENSOR_RESTART.md

Why is this safe to remove?

  • Default enabled since: Release 4.2 (Dec 2023) - over 22 months in production
  • No runtime override: Flag marked as unchangeableInProd - cannot be disabled in production
  • Stable behavior: Connection retry mechanism has been standard operational mode for 1.5+ years
  • Foundation for other features: This enables subsequent removal of ROX_CAPTURE_INTERMEDIATE_EVENTS and ROX_SENSOR_RECONCILIATION
  • Test coverage preserved: Integration tests continue to validate reconnection scenarios

Changes made:

  • Removed feature flag definition from pkg/features/list.go
  • Simplified Sensor Start() to always use connection retry logic
  • Simplified Sensor Stop() to use retry-aware stop mechanism
  • Removed feature flag setup from 3 integration tests (runtime, reconciliation, alerts)
  • Removed unused communicationWithCentral function

The retry-enabled behavior is now the only operational mode. The legacy "restart-on-disconnect" code path has been completely removed.

PR Stacking Context

This is the first PR of a 4-PR feature flag cleanup sequence:

  1. This PR: ROX_PREVENT_SENSOR_RESTART_ON_DISCONNECT (foundation layer)
  2. Parallel with this: ROX_AUTH_MACHINE_TO_MACHINE (independent)
  3. After this PR merges: ROX_CAPTURE_INTERMEDIATE_EVENTS (depends on this)
  4. After ROX_CAPTURE_INTERMEDIATE_EVENTS: ROX_SENSOR_RECONCILIATION (depends on first and third PRs)

User-facing documentation

  • CHANGELOG.md is updated OR update is not needed - Not needed: internal cleanup, no user-facing impact
  • documentation PR is created and is linked above OR is not needed - Not needed: behavior unchanged

Testing and quality

  • the change is production ready: the change is GA, or otherwise the functionality is gated by a feature flag - Removal of an always-enabled flag
  • CI results are inspected

Automated testing

  • modified existing tests - Simplified test setup by removing flag enablement

How I validated my change

  • Verified that all integration tests that previously enabled this flag now run without it
  • Confirmed linter passes on all modified files
  • Tests continue to validate:
    • Connection retry behavior
    • Sensor reconnection without restart
    • K8s event reconciliation after reconnect
    • Alert transmission after connection restart
  • The removed code paths (non-retry mode) were already unreachable in production due to unchangeableInProd constraint

Suggested Reviewers

  • Sensor team members
  • Anyone familiar with Sensor-Central connection management

This removes the ROX_PREVENT_SENSOR_RESTART_ON_DISCONNECT feature flag that
has been enabled by default since Release 4.2 (Dec 2023).

The feature flag enabled Sensor to maintain connection retries with Central
instead of restarting the entire Sensor process on disconnect. This behavior
is now the standard and only operational mode.

Changes:
- Removed feature flag definition from pkg/features/list.go
- Simplified sensor Start() to always use connection retry logic
- Simplified sensor Stop() to use the retry-aware stop mechanism
- Removed feature flag setup from 3 integration tests (runtime, reconciliation, alerts)

The retry-enabled behavior is preserved as the default and only mode.
Tests continue to validate reconnection scenarios without needing
explicit feature flag enablement.

Partially generated with AI assistance.
@openshift-ci
Copy link

openshift-ci bot commented Oct 10, 2025

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey there - I've reviewed your changes and they look great!


Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

@rhacs-bot
Copy link
Contributor

rhacs-bot commented Oct 10, 2025

Images are ready for the commit at 0915659.

To use with deploy scripts, first export MAIN_IMAGE_TAG=4.9.x-1047-g09156590c7.

The communicationWithCentral function was no longer called after removing
the feature flag conditional logic. This fixes the golangci-lint unused
function warning.
This document provides the detailed analysis and implementation plan for
removing the ROX_PREVENT_SENSOR_RESTART_ON_DISCONNECT feature flag.
This document is for reference only and should not be part of the codebase.
@vikin91 vikin91 changed the title ROX-31236: Remove ROX_PREVENT_SENSOR_RESTART_ON_DISCONNECT flag ROX-31236: Remove ROX_PREVENT_SENSOR_RESTART_ON_DISCONNECT flag Oct 10, 2025
@codecov
Copy link

codecov bot commented Oct 10, 2025

Codecov Report

❌ Patch coverage is 0% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 48.84%. Comparing base (9598e78) to head (0915659).

Files with missing lines Patch % Lines
sensor/common/sensor/sensor.go 0.00% 3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #17236      +/-   ##
==========================================
+ Coverage   48.81%   48.84%   +0.03%     
==========================================
  Files        2717     2717              
  Lines      203219   203182      -37     
==========================================
+ Hits        99200    99251      +51     
+ Misses      96196    96114      -82     
+ Partials     7823     7817       -6     
Flag Coverage Δ
go-unit-tests 48.84% <0.00%> (+0.03%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

vikin91 added a commit that referenced this pull request Oct 10, 2025
This removes the ROX_CAPTURE_INTERMEDIATE_EVENTS feature flag that has been
enabled by default since Release 4.3 (February 2024).

The flag enabled offline v3 behavior where Sensor buffers events during
Central disconnects. This is now the standard operational mode.

Changes:
- Removed feature flag definition from pkg/features/list.go
- Network flow manager: Always use buffered channels and continue enrichment during offline mode
- Process signal pipeline: Always use long-lived contexts that persist across disconnects
- Detector: Always use buffered queues and handle pause/resume on connection events
- Removed EnrichmentReasonEpFeatureDisabled (no longer returned)
- Updated integration tests to remove flag setup
- Removed unused features package imports from 9 files

All event buffering and offline operation functionality remains fully operational.

This PR is stacked on #17236 (ROX_PREVENT_SENSOR_RESTART_ON_DISCONNECT removal).

Partially generated with AI assistance.
vikin91 added a commit that referenced this pull request Oct 10, 2025
This removes the ROX_CAPTURE_INTERMEDIATE_EVENTS feature flag that has been
enabled by default since Release 4.3 (February 2024).

The flag enabled offline v3 behavior where Sensor buffers events during
Central disconnects. This is now the standard operational mode.

Changes:
- Removed feature flag definition from pkg/features/list.go
- Network flow manager: Always use buffered channels and continue enrichment during offline mode
- Process signal pipeline: Always use long-lived contexts that persist across disconnects
- Detector: Always use buffered queues and handle pause/resume on connection events
- Removed EnrichmentReasonEpFeatureDisabled (no longer returned)
- Updated integration tests to remove flag setup
- Removed unused features package imports from 9 files

All event buffering and offline operation functionality remains fully operational.

This PR is stacked on #17236 (ROX_PREVENT_SENSOR_RESTART_ON_DISCONNECT removal).

Partially generated with AI assistance.
vikin91 added a commit that referenced this pull request Oct 10, 2025
This removes the ROX_CAPTURE_INTERMEDIATE_EVENTS feature flag that has been
enabled by default since Release 4.3 (February 2024).

The flag enabled offline v3 behavior where Sensor buffers events during
Central disconnects. This is now the standard operational mode.

Changes:
- Removed feature flag definition from pkg/features/list.go
- Network flow manager: Always use buffered channels and continue enrichment during offline mode
- Process signal pipeline: Always use long-lived contexts that persist across disconnects
- Detector: Always use buffered queues and handle pause/resume on connection events
- Removed EnrichmentReasonEpFeatureDisabled (no longer returned)
- Updated integration tests to remove flag setup
- Removed unused features package imports from 9 files

All event buffering and offline operation functionality remains fully operational.

This PR is stacked on #17236 (ROX_PREVENT_SENSOR_RESTART_ON_DISCONNECT removal).

Partially generated with AI assistance.
vikin91 added a commit that referenced this pull request Oct 10, 2025
This removes the ROX_CAPTURE_INTERMEDIATE_EVENTS feature flag that has been
enabled by default since Release 4.3 (February 2024).

The flag enabled offline v3 behavior where Sensor buffers events during
Central disconnects. This is now the standard operational mode.

Changes:
- Removed feature flag definition from pkg/features/list.go
- Network flow manager: Always use buffered channels and continue enrichment during offline mode
- Process signal pipeline: Always use long-lived contexts that persist across disconnects
- Detector: Always use buffered queues and handle pause/resume on connection events
- Removed EnrichmentReasonEpFeatureDisabled (no longer returned)
- Updated integration tests to remove flag setup
- Removed unused features package imports from 9 files

All event buffering and offline operation functionality remains fully operational.

This PR is stacked on #17236 (ROX_PREVENT_SENSOR_RESTART_ON_DISCONNECT removal).

Partially generated with AI assistance.
vikin91 added a commit that referenced this pull request Oct 10, 2025
This removes the ROX_CAPTURE_INTERMEDIATE_EVENTS feature flag that has been
enabled by default since Release 4.3 (February 2024).

The flag enabled offline v3 behavior where Sensor buffers events during
Central disconnects. This is now the standard operational mode.

Changes:
- Removed feature flag definition from pkg/features/list.go
- Network flow manager: Always use buffered channels and continue enrichment during offline mode
- Process signal pipeline: Always use long-lived contexts that persist across disconnects
- Detector: Always use buffered queues and handle pause/resume on connection events
- Removed EnrichmentReasonEpFeatureDisabled (no longer returned)
- Updated integration tests to remove flag setup
- Removed unused features package imports from 9 files

All event buffering and offline operation functionality remains fully operational.

This PR is stacked on #17236 (ROX_PREVENT_SENSOR_RESTART_ON_DISCONNECT removal).

Partially generated with AI assistance.
@vikin91 vikin91 marked this pull request as ready for review October 10, 2025 14:24
@vikin91 vikin91 requested a review from a team as a code owner October 10, 2025 14:24
Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey there - I've reviewed your changes - here's some feedback:

  • Remove any remaining references to ROX_PREVENT_SENSOR_RESTART_ON_DISCONNECT in documentation, scripts, and config templates to avoid confusion.
  • Since the retry logic is now the only execution path, consider renaming communicationWithCentralWithRetries to a more generic name (e.g., startCentralCommunication) for clarity.
  • Clean up or update any stale log messages and metrics around the old restart-on-disconnect behavior so they accurately reflect the current retry‐only flow.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- Remove any remaining references to ROX_PREVENT_SENSOR_RESTART_ON_DISCONNECT in documentation, scripts, and config templates to avoid confusion.
- Since the retry logic is now the only execution path, consider renaming communicationWithCentralWithRetries to a more generic name (e.g., startCentralCommunication) for clarity.
- Clean up or update any stale log messages and metrics around the old restart-on-disconnect behavior so they accurately reflect the current retry‐only flow.

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

@vikin91
Copy link
Contributor Author

vikin91 commented Oct 10, 2025

@sourcery-ai

  • Remove any remaining references to ROX_PREVENT_SENSOR_RESTART_ON_DISCONNECT in documentation, scripts, and config templates to avoid confusion.

I checked for that and it should be done.

  • Since the retry logic is now the only execution path, consider renaming communicationWithCentralWithRetries to a more generic name (e.g., startCentralCommunication) for clarity.

I would like to keep the changes minimal and don't do the refactoring when only removal of FF was planned. I think the name is still fine.

  • Clean up or update any stale log messages and metrics around the old restart-on-disconnect behavior so they accurately reflect the current retry‐only flow.

I didn't see any. If there is something to clean up, let me know.

@vikin91 vikin91 requested a review from lvalerom October 10, 2025 14:39
@openshift-ci
Copy link

openshift-ci bot commented Oct 10, 2025

@vikin91: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/ocp-4-19-ui-e2e-tests 0915659 link false /test ocp-4-19-ui-e2e-tests

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@vikin91 vikin91 merged commit ba34cfc into master Oct 13, 2025
97 of 99 checks passed
@vikin91 vikin91 deleted the piotr/ROX-31236-remove-prevent-sensor-restart-flag branch October 13, 2025 15:27
vikin91 added a commit that referenced this pull request Oct 13, 2025
This removes the ROX_CAPTURE_INTERMEDIATE_EVENTS feature flag that has been
enabled by default since Release 4.3 (February 2024).

The flag enabled offline v3 behavior where Sensor buffers events during
Central disconnects. This is now the standard operational mode.

Changes:
- Removed feature flag definition from pkg/features/list.go
- Network flow manager: Always use buffered channels and continue enrichment during offline mode
- Process signal pipeline: Always use long-lived contexts that persist across disconnects
- Detector: Always use buffered queues and handle pause/resume on connection events
- Removed EnrichmentReasonEpFeatureDisabled (no longer returned)
- Updated integration tests to remove flag setup
- Removed unused features package imports from 9 files

All event buffering and offline operation functionality remains fully operational.

This PR is stacked on #17236 (ROX_PREVENT_SENSOR_RESTART_ON_DISCONNECT removal).

Partially generated with AI assistance.
vikin91 added a commit that referenced this pull request Oct 14, 2025
This removes the ROX_CAPTURE_INTERMEDIATE_EVENTS feature flag that has been
enabled by default since Release 4.3 (February 2024).

The flag enabled offline v3 behavior where Sensor buffers events during
Central disconnects. This is now the standard operational mode.

Changes:
- Removed feature flag definition from pkg/features/list.go
- Network flow manager: Always use buffered channels and continue enrichment during offline mode
- Process signal pipeline: Always use long-lived contexts that persist across disconnects
- Detector: Always use buffered queues and handle pause/resume on connection events
- Removed EnrichmentReasonEpFeatureDisabled (no longer returned)
- Updated integration tests to remove flag setup
- Removed unused features package imports from 9 files

All event buffering and offline operation functionality remains fully operational.

This PR is stacked on #17236 (ROX_PREVENT_SENSOR_RESTART_ON_DISCONNECT removal).

Partially generated with AI assistance.
vikin91 added a commit that referenced this pull request Oct 16, 2025
This removes the ROX_CAPTURE_INTERMEDIATE_EVENTS feature flag that has been
enabled by default since Release 4.3 (February 2024).

The flag enabled offline v3 behavior where Sensor buffers events during
Central disconnects. This is now the standard operational mode.

Changes:
- Removed feature flag definition from pkg/features/list.go
- Network flow manager: Always use buffered channels and continue enrichment during offline mode
- Process signal pipeline: Always use long-lived contexts that persist across disconnects
- Detector: Always use buffered queues and handle pause/resume on connection events
- Removed EnrichmentReasonEpFeatureDisabled (no longer returned)
- Updated integration tests to remove flag setup
- Removed unused features package imports from 9 files

All event buffering and offline operation functionality remains fully operational.

This PR is stacked on #17236 (ROX_PREVENT_SENSOR_RESTART_ON_DISCONNECT removal).

Partially generated with AI assistance.
vikin91 added a commit that referenced this pull request Oct 16, 2025
This removes the ROX_CAPTURE_INTERMEDIATE_EVENTS feature flag that has been
enabled by default since Release 4.3 (February 2024).

The flag enabled offline v3 behavior where Sensor buffers events during
Central disconnects. This is now the standard operational mode.

Changes:
- Removed feature flag definition from pkg/features/list.go
- Network flow manager: Always use buffered channels and continue enrichment during offline mode
- Process signal pipeline: Always use long-lived contexts that persist across disconnects
- Detector: Always use buffered queues and handle pause/resume on connection events
- Removed EnrichmentReasonEpFeatureDisabled (no longer returned)
- Updated integration tests to remove flag setup
- Removed unused features package imports from 9 files

All event buffering and offline operation functionality remains fully operational.

This PR is stacked on #17236 (ROX_PREVENT_SENSOR_RESTART_ON_DISCONNECT removal).

Partially generated with AI assistance.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants