Open
Description
Expected Behavior
When the Spark engine is enabled, it should materialize features to the online store.
Current Behavior
Materialization fails with the following error:
pyspark.errors.exceptions.base.PySparkTypeError: [CANNOT_INFER_SCHEMA_FOR_TYPE] Can not infer schema for type: `ChunkedArray`.
Steps to reproduce
See this repo, which should reproduce the error locally. Ensure JAVA_HOME points to a Java 17 installation, then run make run to execute the workflow. Optionally, inspect the Makefile and run the commands yourself.
The repo contains the default project created by feast init. The workflow runs feast plan, then feast apply, and finally a materialize.py script, which should trigger a Spark materialization job.
Specifications
- Version: 0.53.0
- Platform: Python 3.12, PySpark 3.5.5, Java 17
- Subsystem: macOS 15.6.1
Possible Solution
I believe this happens when an Arrow dataframe is converted to a Spark DataFrame. I am unsure why this happens; it may be my configuration, or it may be a bug. If someone could advise on a solution, that would be great.
Thanks!
danbaron63