
fix: ODFV output projection in offline retrieval (#6099) #6140

Open
jyejare wants to merge 2 commits into feast-dev:master from
jyejare:fix/odfv-output-projection-6099

Conversation

@jyejare
Collaborator

@jyejare jyejare commented Mar 23, 2026

Summary

Fixes #6099 - Ensures offline retrieval honors ODFV feature projection, matching online retrieval behavior.

Problem

When requesting a subset of features from an OnDemandFeatureView:

  • Online retrieval ✅ Returns only requested features
  • Offline retrieval ❌ Returns ALL ODFV output features (before this fix)

This caused schema mismatches between training and serving pipelines.

Solution

Modified RetrievalJob.to_arrow() in offline_store.py to:

  1. Parse requested features from metadata.features
  2. Build a mapping of ODFV name → requested feature names
  3. Filter ODFV transformation output to only include requested columns
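The three steps above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the helper names `group_requested_odfv_features` and `project_odfv_output` are hypothetical, and a plain dict of columns stands in for the Arrow table that `to_arrow()` returns.

```python
from collections import defaultdict


def group_requested_odfv_features(feature_refs, odfv_names):
    """Map each ODFV name to the feature names requested from it.

    feature_refs are "view_name:feature_name" strings, as passed to
    get_historical_features. Hypothetical helper, for illustration only.
    """
    requested = defaultdict(list)
    for ref in feature_refs:
        view_name, _, feature_name = ref.partition(":")
        if view_name in odfv_names:
            requested[view_name].append(feature_name)
    return dict(requested)


def project_odfv_output(columns, odfv_outputs, requested):
    """Drop ODFV output columns that were not requested.

    columns: dict of column name -> values (stand-in for an Arrow table).
    odfv_outputs: dict of ODFV name -> all output feature names.
    """
    dropped = set()
    for odfv_name, outputs in odfv_outputs.items():
        keep = set(requested.get(odfv_name, []))
        dropped.update(name for name in outputs if name not in keep)
    return {name: values for name, values in columns.items() if name not in dropped}


requested = group_requested_odfv_features(["my_odfv:feature_a"], {"my_odfv"})
table = {
    "driver_id": [1001],
    "event_timestamp": ["2026-03-23"],
    "feature_a": [0.1],
    "feature_b": [0.2],
    "feature_c": [0.3],
}
projected = project_odfv_output(
    table, {"my_odfv": ["feature_a", "feature_b", "feature_c"]}, requested
)
print(list(projected))  # ['driver_id', 'event_timestamp', 'feature_a']
```

Entity and timestamp columns pass through untouched because only names listed as ODFV outputs are candidates for removal.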

Example

Before this fix:

features = ["my_odfv:feature_a"]
offline_result = store.get_historical_features(features=features, ...)
# Columns: driver_id, event_timestamp, feature_a, feature_b, feature_c ❌

After this fix:

features = ["my_odfv:feature_a"]
offline_result = store.get_historical_features(features=features, ...)
# Columns: driver_id, event_timestamp, feature_a ✅

Changes

Modified: sdk/python/feast/infra/offline_stores/offline_store.py

  • Updated RetrievalJob.to_arrow() method (lines 140-184)
  • Added filtering logic for ODFV output projection
  • Maintains backward compatibility

Added: Test in sdk/python/tests/integration/offline_store/test_universal_historical_retrieval.py

  • test_odfv_projection() - Comprehensive test verifying:
    • Single feature request returns only that feature
    • Multiple feature request returns only requested features
    • Unrequested features are NOT included
    • Offline and online retrieval have consistent behavior
  • Parametrized for both full_feature_names=True and False
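As a rough sketch of what the parametrized checks amount to, the helper below derives the column names a projection test would expect. It is not the test from this PR: `expected_odfv_columns` and `assert_projection` are hypothetical names, and the `view__feature` naming for `full_feature_names=True` is an assumption about Feast's column naming convention.

```python
def expected_odfv_columns(feature_refs, full_feature_names):
    """Column names expected for "view:feature" refs (hypothetical helper).

    Assumes full_feature_names=True prefixes each column as "view__feature".
    """
    columns = []
    for ref in feature_refs:
        view_name, _, feature_name = ref.partition(":")
        if full_feature_names:
            columns.append(f"{view_name}__{feature_name}")
        else:
            columns.append(feature_name)
    return columns


def assert_projection(actual_columns, requested_refs, unrequested_refs, full_feature_names):
    """Fail if a requested column is missing or an unrequested one is present."""
    for col in expected_odfv_columns(requested_refs, full_feature_names):
        assert col in actual_columns, f"missing requested column {col}"
    for col in expected_odfv_columns(unrequested_refs, full_feature_names):
        assert col not in actual_columns, f"unexpected column {col}"


# Single-feature request, checked under both naming modes.
for full_names in (True, False):
    returned = ["driver_id", "event_timestamp"] + expected_odfv_columns(
        ["my_odfv:feature_a"], full_names
    )
    assert_projection(
        returned,
        requested_refs=["my_odfv:feature_a"],
        unrequested_refs=["my_odfv:feature_b", "my_odfv:feature_c"],
        full_feature_names=full_names,
    )
```

In a real test, `returned` would come from the columns of the offline result rather than being built by hand.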

Testing

The new test test_odfv_projection verifies:

  1. ✅ Requesting 1 out of 3 ODFV features → returns only that 1 feature
  2. ✅ Requesting 2 out of 3 ODFV features → returns only those 2 features
  3. ✅ Unrequested features are NOT included in the result
  4. ✅ Offline and online retrieval return consistent schemas

Backward Compatibility

  • ✅ Falls back to old behavior if metadata is unavailable
  • ✅ No breaking changes to existing functionality
  • ✅ Only affects ODFV feature projection
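The fallback in the first point could look something like the sketch below. It is an illustration under stated assumptions, not the PR's implementation: `columns_to_return` is a hypothetical helper, and `SimpleNamespace` stands in for the retrieval metadata object, of which only a `features` list of "view:feature" strings is assumed.

```python
from types import SimpleNamespace


def columns_to_return(all_columns, odfv_output_columns, metadata):
    """Pick the columns to keep, falling back to all of them without metadata.

    Hypothetical helper: metadata is any object with a `features` list of
    "view:feature" strings, or None when metadata is unavailable.
    """
    features = getattr(metadata, "features", None)
    if not features:
        # No metadata: keep the old behavior and return every column.
        return list(all_columns)
    requested = {ref.partition(":")[2] for ref in features}
    return [
        col
        for col in all_columns
        if col not in odfv_output_columns or col in requested
    ]


all_columns = ["driver_id", "event_timestamp", "feature_a", "feature_b", "feature_c"]
odfv_outputs = {"feature_a", "feature_b", "feature_c"}

print(columns_to_return(all_columns, odfv_outputs, None))
# ['driver_id', 'event_timestamp', 'feature_a', 'feature_b', 'feature_c']
print(columns_to_return(all_columns, odfv_outputs, SimpleNamespace(features=["my_odfv:feature_a"])))
# ['driver_id', 'event_timestamp', 'feature_a']
```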

Impact

This fix ensures:

  • ✅ Consistent behavior between online and offline retrieval
  • ✅ No schema mismatches in ML pipelines
  • ✅ Leaner results - unrequested ODFV features are dropped from the returned table
  • ✅ Matches user expectations - returns exactly what was requested


@jyejare jyejare requested review from a team as code owners March 23, 2026 08:28
@jyejare jyejare requested review from dmartinol, ejscribner and shuchu and removed request for a team March 23, 2026 08:28
@jyejare jyejare changed the title Fix ODFV output projection in offline retrieval (#6099) fix: ODFV output projection in offline retrieval (#6099) Mar 23, 2026
devin-ai-integration[bot]

This comment was marked as resolved.

@jyejare jyejare marked this pull request as draft March 23, 2026 09:14
@jyejare jyejare force-pushed the fix/odfv-output-projection-6099 branch from 6dc5107 to a6bbfda March 23, 2026 15:10
@jyejare jyejare marked this pull request as ready for review March 23, 2026 15:10
@jyejare jyejare force-pushed the fix/odfv-output-projection-6099 branch 6 times, most recently from b114f3b to f0bec1a March 30, 2026 16:05
devin-ai-integration[bot]

This comment was marked as resolved.

@jyejare jyejare force-pushed the fix/odfv-output-projection-6099 branch from f0bec1a to 607b640 March 31, 2026 15:09
@ntkathole
Member

@jyejare tests failing

@jyejare jyejare force-pushed the fix/odfv-output-projection-6099 branch 2 times, most recently from 9789bd1 to 150d2fc April 1, 2026 16:23
Ambient Code Bot added 2 commits April 1, 2026 21:53
Changes:
- Modified RetrievalJob.to_arrow() to filter ODFV outputs based on requested
  features from metadata, matching online retrieval behavior
- Added test_odfv_projection to verify the fix and prevent regression

Before this fix:
- Online: features=['odfv:feature_a'] -> returns feature_a only ✓
- Offline: features=['odfv:feature_a'] -> returns feature_a, feature_b, feature_c ✗

After this fix:
- Both online and offline return only the requested features ✓

This ensures schema consistency between training (offline) and serving (online)
pipelines, preventing downstream issues in ML workflows.

Fixes feast-dev#6099

Signed-off-by: Jitendra Yejare <11752425+jyejare@users.noreply.github.com>
@jyejare jyejare force-pushed the fix/odfv-output-projection-6099 branch from 150d2fc to 97a669e April 1, 2026 16:23
Development

Successfully merging this pull request may close these issues.

get_historical_features returns all ODFV output columns even when a single ODFV feature is requested

3 participants