fix: ODFV output projection in offline retrieval (#6099) #6140

Merged
ntkathole merged 1 commit into feast-dev:master from jyejare:fix/odfv-output-projection-6099
Apr 2, 2026

@jyejare (Collaborator) commented Mar 23, 2026

Summary

Fixes #6099. Offline retrieval now honors ODFV feature projection, matching online retrieval behavior.

Problem

When requesting a subset of features from an OnDemandFeatureView:

  • Online retrieval ✅ Returns only requested features
  • Offline retrieval ❌ Returns ALL ODFV output features (before this fix)

This caused schema mismatches between training and serving pipelines.

Solution

Modified RetrievalJob.to_arrow() in offline_store.py to:

  1. Parse requested features from metadata.features
  2. Build a mapping of ODFV name → requested feature names
  3. Filter ODFV transformation output to only include requested columns
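
The three steps above can be sketched in plain Python. This is an illustration only, not the code merged into RetrievalJob.to_arrow(): the helper names requested_odfv_features and project_odfv_output are hypothetical, columns are modelled as a list of names rather than an Arrow table, and the "view__feature" naming used for full_feature_names=True is an assumption about the column format.

```python
from collections import defaultdict


def requested_odfv_features(feature_refs, odfv_names):
    """Map each ODFV name to the feature names requested from it.

    feature_refs are "view_name:feature_name" strings, as passed to
    get_historical_features. Refs to non-ODFV views are ignored.
    """
    requested = defaultdict(list)
    for ref in feature_refs:
        view_name, _, feature_name = ref.partition(":")
        if view_name in odfv_names and feature_name:
            requested[view_name].append(feature_name)
    return dict(requested)


def project_odfv_output(output_columns, odfv_name, requested,
                        full_feature_names=False):
    """Keep only the requested columns of one ODFV's transformed output."""
    wanted = requested.get(odfv_name)
    if wanted is None:
        # No projection info for this ODFV: fall back to returning everything.
        return list(output_columns)
    # Assumption: full feature names are prefixed as "<view>__<feature>".
    names = {
        f"{odfv_name}__{feature}" if full_feature_names else feature
        for feature in wanted
    }
    return [column for column in output_columns if column in names]
```

For the example in this description, requesting "my_odfv:feature_a" would reduce the ODFV output columns feature_a, feature_b, feature_c to feature_a alone, while an ODFV with no entry in the mapping keeps all of its columns.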

Example

Before this fix:

features = ["my_odfv:feature_a"]
offline_result = store.get_historical_features(features=features, ...)
# Columns: driver_id, event_timestamp, feature_a, feature_b, feature_c ❌

After this fix:

features = ["my_odfv:feature_a"]
offline_result = store.get_historical_features(features=features, ...)
# Columns: driver_id, event_timestamp, feature_a ✅

Changes

Modified: sdk/python/feast/infra/offline_stores/offline_store.py

  • Updated RetrievalJob.to_arrow() method (lines 140-184)
  • Added filtering logic for ODFV output projection
  • Maintains backward compatibility

Added: Test in sdk/python/tests/integration/offline_store/test_universal_historical_retrieval.py

  • test_odfv_projection() - Comprehensive test verifying:
    • Single feature request returns only that feature
    • Multiple feature request returns only requested features
    • Unrequested features are NOT included
    • Offline and online retrieval have consistent behavior
  • Parametrized for both full_feature_names=True and False

Testing

The new test test_odfv_projection verifies:

  1. ✅ Requesting 1 out of 3 ODFV features → returns only that 1 feature
  2. ✅ Requesting 2 out of 3 ODFV features → returns only those 2 features
  3. ✅ Unrequested features are NOT included in the result
  4. ✅ Offline and online retrieval return consistent schemas
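
The checks listed above boil down to two set comparisons on the result schema. The following is a minimal sketch of that idea, not the body of test_odfv_projection: check_projection is a hypothetical helper, and the column names are taken from the example in this description.

```python
def check_projection(result_columns, requested, all_odfv_features):
    """Return (missing, leaked) feature names for one retrieval result.

    missing: requested features absent from the result.
    leaked:  ODFV features present in the result but never requested.
    """
    result = set(result_columns)
    missing = [name for name in requested if name not in result]
    unrequested = set(all_odfv_features) - set(requested)
    leaked = sorted(unrequested & result)
    return missing, leaked


all_features = ["feature_a", "feature_b", "feature_c"]

# Schema described for the fixed behavior: nothing missing, nothing leaked.
fixed = check_projection(
    ["driver_id", "event_timestamp", "feature_a"], ["feature_a"], all_features
)

# Schema described for the old behavior: feature_b and feature_c leak through.
old = check_projection(
    ["driver_id", "event_timestamp", "feature_a", "feature_b", "feature_c"],
    ["feature_a"],
    all_features,
)
```

Running the same helper against the online response columns would give a simple way to assert that both paths expose the same ODFV features.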

Backward Compatibility

  • ✅ Falls back to old behavior if metadata is unavailable
  • ✅ No breaking changes to existing functionality
  • ✅ Only affects ODFV feature projection

Impact

This fix ensures:

  • ✅ Consistent behavior between online and offline retrieval
  • ✅ No schema mismatches in ML pipelines
  • ✅ Leaner results - unrequested ODFV features are no longer returned
  • ✅ Matches user expectations - returns exactly what was requested

@jyejare requested review from a team as code owners March 23, 2026 08:28
@jyejare requested review from dmartinol, ejscribner and shuchu and removed request for a team March 23, 2026 08:28
@jyejare changed the title from "Fix ODFV output projection in offline retrieval (#6099)" to "fix: ODFV output projection in offline retrieval (#6099)" Mar 23, 2026
devin-ai-integration[bot]

This comment was marked as resolved.

@jyejare marked this pull request as draft March 23, 2026 09:14
@jyejare force-pushed the fix/odfv-output-projection-6099 branch from 6dc5107 to a6bbfda March 23, 2026 15:10
@jyejare marked this pull request as ready for review March 23, 2026 15:10
@jyejare force-pushed the fix/odfv-output-projection-6099 branch 6 times, most recently from b114f3b to f0bec1a March 30, 2026 16:05
devin-ai-integration[bot]

This comment was marked as resolved.

@jyejare force-pushed the fix/odfv-output-projection-6099 branch from f0bec1a to 607b640 March 31, 2026 15:09
@ntkathole (Member) commented

@jyejare tests failing

@jyejare force-pushed the fix/odfv-output-projection-6099 branch 4 times, most recently from 97a669e to 469cfbb April 2, 2026 07:05
Changes:
- Modified RetrievalJob.to_arrow() to filter ODFV outputs based on requested
  features from metadata, matching online retrieval behavior
- Added test_odfv_projection to verify the fix and prevent regression

Signed-off-by: Jitendra Yejare <11752425+jyejare@users.noreply.github.com>
@jyejare force-pushed the fix/odfv-output-projection-6099 branch from 469cfbb to f9a2751 April 2, 2026 07:16

@devin-ai-integration bot (Contributor) left a comment

Devin Review found 1 new potential issue.

View 15 additional findings in Devin Review.

🟡 Feature name collision validation bypassed for implicitly-added ODFV dependency refs

_validate_feature_refs at sdk/python/feast/feature_store.py:1476 is called with _feature_refs (the user's original refs), but the provider receives _feature_refs_for_provider which includes additional dependency FV refs added at lines 1454-1461. If full_feature_names=False and an ODFV dependency feature has the same name as a user-requested feature from a different FV (e.g., user requests fv1:conv_rate and the ODFV depends on fv2:conv_rate), the collision is not detected by validation, and both features produce identically-named output columns, leading to data corruption or errors in the offline store's point-in-time join.

(Refers to line 1476)
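
To make the reported gap concrete, here is a minimal sketch of a name-collision check for full_feature_names=False. It is not Feast's _validate_feature_refs: colliding_feature_names is a hypothetical helper, and the two ref lists simply mirror the fv1:conv_rate / fv2:conv_rate example from the finding.

```python
from collections import Counter


def colliding_feature_names(feature_refs):
    """Return feature names that occur under more than one feature view.

    With full_feature_names=False the output column is the bare feature
    name, so two refs sharing that name would map to the same column.
    """
    counts = Counter(ref.partition(":")[2] for ref in feature_refs)
    return sorted(name for name, count in counts.items() if count > 1)


# Refs as the user wrote them: no collision is visible here.
user_refs = ["fv1:conv_rate"]

# Refs after an ODFV dependency on fv2:conv_rate is added implicitly.
provider_refs = ["fv1:conv_rate", "fv2:conv_rate"]

user_collisions = colliding_feature_names(user_refs)
provider_collisions = colliding_feature_names(provider_refs)
```

Under these assumptions, the check only catches the duplicate when it runs on the expanded list handed to the provider, which is the point the finding makes about validating the original refs instead.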

@ntkathole merged commit b3dcde7 into feast-dev:master Apr 2, 2026
31 of 32 checks passed

Development

Successfully merging this pull request may close these issues.

get_historical_features returns all ODFV output columns even when a single ODFV feature is requested

3 participants