Description
Expected Behavior
When requesting a single feature from an OnDemandFeatureView, both online and offline retrieval should return only the explicitly requested output feature.
Example:
features=["odfv:feature_a"]
Expected result:
- online returns only feature_a
- offline returns only feature_a
The requested feature list should define the output schema consistently across both retrieval paths.
Current Behavior
For OnDemandFeatureViews, offline retrieval via get_historical_features() returns all output columns defined by the ODFV, even when only a single ODFV feature is requested.
Example:
features=["odfv:feature_a"]
Actual result:
- online returns only feature_a
- offline returns feature_a, feature_b, feature_c, etc. (all ODFV outputs)
This creates a schema inconsistency between online and offline retrieval.
Important clarification: this is not about extra input columns from RequestSource or source FeatureViews being present in offline retrieval.
This issue is specifically about offline retrieval not honoring projection of ODFV output features.
Steps to reproduce
- Define an OnDemandFeatureView with multiple output features, for example:
  - feature_a
  - feature_b
  - feature_c
- Request a single feature from that ODFV in offline retrieval:
  store.get_historical_features(
      entity_df=entity_df,
      features=["odfv:feature_a"],
  )
- Observe that the offline result contains all output columns from the ODFV instead of only feature_a.
- Compare with online retrieval for the same requested feature set and observe that online returns only the requested feature.
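The comparison in the last two steps can be sketched as a check on column names. This is a minimal illustration, not Feast code: `extra_offline_columns` is a hypothetical helper, and the two column lists are made-up stand-ins for the columns of the online and offline results (the `driver_id` and `event_timestamp` names are assumptions).

```python
def extra_offline_columns(online_columns, offline_columns, ignore=()):
    """Return columns that appear offline but not online, in offline order.

    `ignore` lists columns that are expected to differ between the two
    paths (for example a timestamp column carried over from entity_df).
    """
    online = set(online_columns)
    ignored = set(ignore)
    return [c for c in offline_columns if c not in online and c not in ignored]


# Hypothetical column lists standing in for the two retrieval results
# when only "odfv:feature_a" was requested.
online_columns = ["driver_id", "feature_a"]
offline_columns = ["driver_id", "event_timestamp", "feature_a", "feature_b", "feature_c"]

extra = extra_offline_columns(online_columns, offline_columns, ignore=["event_timestamp"])
print(extra)  # ['feature_b', 'feature_c']
```

A non-empty result here is the schema mismatch described above: ODFV outputs that were not requested but still appear in the offline result.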
Specifications
- Version: latest
- Platform: na
- Subsystem: Offline retrieval / OnDemandFeatureView / get_historical_features
Possible Solution
get_historical_features() should honor the requested output feature subset for OnDemandFeatureViews in the same way as online retrieval.
If this is intentional behavior, the documentation should explicitly state that offline retrieval for ODFVs returns the full ODFV output schema even when only a subset of ODFV features is requested.
However, from a user perspective, consistent projection behavior between online and offline retrieval would be much more intuitive and would prevent schema mismatches in downstream training and serving pipelines.
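Until the offline path honors projection, a user-side workaround is to trim the offline result to the requested features. The sketch below is an assumption about how that could look, not Feast's implementation: `columns_to_keep` is a hypothetical helper, and the `<view>__<feature>` naming used for full feature names should be checked against the Feast version in use.

```python
def columns_to_keep(result_columns, entity_columns, feature_refs, full_feature_names=False):
    """Select entity columns plus only the explicitly requested feature columns.

    feature_refs use the "view:feature" form, e.g. "odfv:feature_a".
    """
    requested = set()
    for ref in feature_refs:
        view, sep, feature = ref.partition(":")
        if not sep or not feature:
            raise ValueError(f"expected 'view:feature', got {ref!r}")
        # Assumption: with full feature names, columns are "<view>__<feature>".
        requested.add(f"{view}__{feature}" if full_feature_names else feature)
    keep = set(entity_columns) | requested
    # Preserve the original column order of the result.
    return [c for c in result_columns if c in keep]


# Hypothetical offline result columns for features=["odfv:feature_a"].
offline_columns = ["driver_id", "event_timestamp", "feature_a", "feature_b", "feature_c"]
kept = columns_to_keep(offline_columns, ["driver_id", "event_timestamp"], ["odfv:feature_a"])
print(kept)  # ['driver_id', 'event_timestamp', 'feature_a']
```

With a pandas DataFrame `df` returned by offline retrieval, the same list could be used as `df[columns_to_keep(list(df.columns), entity_columns, features)]`. This only hides the extra columns after the fact; the unrequested ODFV outputs are still computed.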
Related
This issue appears related to, but different from, a few existing Feast topics:
#4479 — OnDemandFeatureViews with RequestSource returns different columns depending on online/offline feature retrieval
That issue discusses extra input columns (for example from RequestSource) appearing in offline retrieval.
My issue is different: even when I request a single ODFV output feature, offline retrieval returns all ODFV output features, while online retrieval returns only the requested one.
#5890 — Provider get_historical_features API cleanup: remove ad-hoc args and formalize ODFV handling
This seems relevant because it explicitly describes historical retrieval for ODFVs as having an unclear contract and mentions that requested ODFVs were historically not passed explicitly through the provider interface. That may be related to why output projection for ODFVs is not handled consistently in the offline path.
#5803 — feat: Pass requested On-Demand Feature Views to Provider get_historical_features
I have not verified whether this fully addresses the problem described here, but it looks closely related because it passes requested ODFVs explicitly into historical retrieval.