Read On-Demand Feature View and deserialization while pushing data #5893

@max36067

Description

A ModuleNotFoundError occurs when calling store.push() to ingest data into the Online Store. The error is raised while Feast lists the feature views stored in the registry and deserializes an On-Demand Feature View.

Because Feast uses dill to serialize and deserialize user-defined functions (UDFs), deserialization fails if the execution environment lacks a Python module (in this case, training) that was importable when the UDF was defined and registered.

🔍 Error Traceback

  File "/usr/local/lib/python3.12/site-packages/data_dataflow/core/io.py", line 201, in process
    self.store.push(push_source_name, df, to=PushMode.ONLINE)  # type: ignore
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/feast/feature_store.py", line 1698, in push
    self.write_to_online_store(
  File "/usr/local/lib/python3.12/site-packages/feast/feature_store.py", line 1946, in write_to_online_store
    feature_view, df = self._get_feature_view_and_df_for_online_write(
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/feast/feature_store.py", line 1904, in _get_feature_view_and_df_for_online_write
    for fv_proto in self.list_all_feature_views(allow_registry_cache)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/feast/feature_store.py", line 294, in list_all_feature_views
    return self._list_all_feature_views(allow_cache, tags=tags)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/feast/feature_store.py", line 269, in _list_all_feature_views
    for fv in self.registry.list_all_feature_views(
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/feast/infra/registry/registry.py", line 647, in list_all_feature_views
    return proto_registry_utils.list_all_feature_views(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/feast/infra/registry/proto_registry_utils.py", line 67, in wrapper
    cache_value = func(registry_proto, project, tags)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/feast/infra/registry/proto_registry_utils.py", line 243, in list_all_feature_views
    + list_on_demand_feature_views(registry_proto, project, tags)
      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/feast/infra/registry/proto_registry_utils.py", line 67, in wrapper
    cache_value = func(registry_proto, project, tags)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/feast/infra/registry/proto_registry_utils.py", line 285, in list_on_demand_feature_views
    OnDemandFeatureView.from_proto(on_demand_feature_view)
  File "/usr/local/lib/python3.12/site-packages/feast/on_demand_feature_view.py", line 400, in from_proto
    transformation = PandasTransformation.from_proto(
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/feast/transformation/pandas_transformation.py", line 150, in from_proto
    udf=dill.loads(user_defined_function_proto.body),
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/dill/_dill.py", line 311, in loads
    return load(file, ignore, **kwds)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/dill/_dill.py", line 297, in load
    return Unpickler(file, ignore=ignore, **kwds).load()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/dill/_dill.py", line 452, in load
    obj = StockUnpickler.load(self)
          ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/dill/_dill.py", line 442, in find_class
    return StockUnpickler.find_class(self, module, name)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ModuleNotFoundError: No module named 'training'

Steps to Reproduce

- Define an OnDemandFeatureView in an environment where the transformation code lives in a local module (here, training).

- The UDF inside this view references a function, class, or constant from the training module.

- Run feast apply to save the definition to the remote registry (e.g., S3, GCS, or SQL).

- Run store.push() from a different environment where the training module is not importable (it is not on that environment's Python path).
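
The steps above can be simulated in a single process without Feast. The sketch below is only an approximation: it uses the standard-library pickle module rather than dill (the traceback shows dill delegating to StockUnpickler.find_class, which is the step that fails), and the names training_demo, scale, and reproduce are hypothetical stand-ins, not part of Feast or of the reporter's code.

```python
import importlib
import pickle
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-in for the reporter's `training` module.
MODULE_NAME = "training_demo"


def reproduce() -> str:
    """Pickle a function while its module is importable, then unpickle without it."""
    with tempfile.TemporaryDirectory() as tmp:
        # "Producer" environment: the module exists and is on sys.path.
        Path(tmp, f"{MODULE_NAME}.py").write_text("def scale(x):\n    return x * 2\n")
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            module = importlib.import_module(MODULE_NAME)
            payload = pickle.dumps(module.scale)
        finally:
            # Switch to the "consumer" environment: forget the module entirely.
            sys.path.remove(tmp)
            sys.modules.pop(MODULE_NAME, None)

    # "Consumer" environment: the payload only names the module, so loading
    # it triggers an import that can no longer succeed.
    try:
        pickle.loads(payload)
    except ModuleNotFoundError as exc:
        return str(exc)
    return "loaded"


if __name__ == "__main__":
    print(reproduce())  # No module named 'training_demo'
```

The error message has the same shape as the one at the bottom of the traceback, with training_demo in place of training.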

Root Cause Analysis

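This is a serialization dependency issue. When dill (the library Feast uses to pickle UDFs) serializes a function, objects that live in an importable module are typically stored as references (module name plus attribute name) rather than as code. As the traceback shows, store.push() calls _get_feature_view_and_df_for_online_write, which lists every feature view in the registry, including On-Demand Feature Views. Deserializing an On-Demand Feature View calls dill.loads on the UDF body, and the unpickler must import each referenced module. If the training module is not importable in the current environment, find_class raises ModuleNotFoundError, so a single On-Demand Feature View that cannot be deserialized aborts the push.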
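
To make the "reference, not code" point concrete, here is a minimal sketch that inspects what a pickled module-level function contains. It assumes standard-library pickle with protocol 4 as a stand-in for dill, and uses json.dumps as an arbitrary importable function; describe_payload is a hypothetical helper. dill can also serialize some functions by value (for example ones defined in __main__), so this illustrates only the by-reference case that the find_class frame in the traceback points to.

```python
import json
import pickle
import pickletools


def describe_payload(func) -> dict:
    """Return the opcode names and string arguments found in a pickled function."""
    payload = pickle.dumps(func, protocol=4)
    ops = list(pickletools.genops(payload))
    return {
        "size": len(payload),
        # Opcode names, e.g. STACK_GLOBAL means "import module, fetch attribute".
        "opcodes": [opcode.name for opcode, _arg, _pos in ops],
        # String arguments embedded in the payload.
        "strings": [arg for _opcode, arg, _pos in ops if isinstance(arg, str)],
    }


if __name__ == "__main__":
    info = describe_payload(json.dumps)
    print(info["strings"])                    # ['json', 'dumps']
    print("STACK_GLOBAL" in info["opcodes"])  # True
```

The payload holds only the module name and the attribute name, so whoever loads it must be able to import that module.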

Environment Context

Python Version: 3.12
Feature Store: Feast
Component: dataflow
