feat: Add UUID and TIME_UUID as feature types (#5885) #5951

Merged
ntkathole merged 15 commits into feast-dev:master from soooojinlee:feat/add-uuid-feature-types
Apr 1, 2026

Conversation

@soooojinlee
Contributor

@soooojinlee soooojinlee commented Feb 8, 2026

What this PR does / why we need it:

Adds UUID and TIME_UUID as native Feast feature types, resolving #5885. Currently UUID values must be stored as STRING, which loses type semantics, prevents backend-specific features (e.g. Cassandra timeuuid range queries), and makes PostgreSQL uuid columns infer as STRING. This PR enables users to declare UUID features with Field(name="user_id", dtype=Uuid) and receive uuid.UUID objects from get_online_features().to_dict().

Design Decisions

Why two types (UUID vs TIME_UUID)?
The issue author explicitly requested a distinction between time-based UUIDs (uuid1) and random UUIDs (uuid4). Both serialize identically as strings in the proto, but separate types let feature definitions express intent and enable future backend-specific optimizations.
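
As a rough illustration of that distinction, the sketch below uses only Python's standard `uuid` module. The helper `uuid_kind` and the labels it returns are hypothetical and are not part of Feast; they only show that the two kinds share one string form while differing in version.

```python
import uuid


def uuid_kind(value: uuid.UUID) -> str:
    """Hypothetical helper: label a UUID by its version field."""
    # Version 1 UUIDs embed a timestamp; anything else is treated as a plain UUID here.
    return "TIME_UUID" if value.version == 1 else "UUID"


time_based = uuid.uuid1()    # version 1: timestamp plus node identifier
random_based = uuid.uuid4()  # version 4: random bits

# Both kinds share the same canonical 36-character string form,
# so the string alone does not carry the author's intent.
print(uuid_kind(time_based), len(str(time_based)))
print(uuid_kind(random_based), len(str(random_based)))
```

Because the serialized form is the same, the intent has to live in the declared feature type rather than in the stored value.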

Why dedicated proto fields (uuid_val, time_uuid_val)?
Following the pattern established by SET types (PR #5888) and UNIX_TIMESTAMP (which reuses int64/Int64List), we add dedicated oneof fields that reuse existing proto scalar types (string and StringList). This allows WhichOneof("val") to identify UUID types directly from the proto message, without requiring a side-channel.

Backward compatibility for data stored before this change:
OnlineResponse accepts an optional feature_types dict. When data was previously stored as string_val, this metadata enables feast_value_type_to_python_type() to convert it to uuid.UUID. New materializations use uuid_val/time_uuid_val and are identified automatically.
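
The two decoding paths described above can be sketched in plain Python. This is a simplified stand-in, not the actual `feast_value_type_to_python_type()` implementation: the function `decode_value`, its parameters, and the string type labels are assumptions made for illustration.

```python
import uuid
from typing import Optional, Union

UUID_FIELDS = ("uuid_val", "time_uuid_val")
UUID_TYPES = ("UUID", "TIME_UUID")


def decode_value(
    field: str, raw: str, declared_type: Optional[str] = None
) -> Union[str, uuid.UUID]:
    """Hypothetical decoder mirroring the two paths in the description."""
    # New data: the oneof field name alone identifies the value as a UUID.
    if field in UUID_FIELDS:
        return uuid.UUID(raw)
    # Legacy data: string_val needs the declared feature type as a hint.
    if field == "string_val" and declared_type in UUID_TYPES:
        return uuid.UUID(raw)
    # Anything else stays a plain string.
    return raw


raw = "12345678-1234-5678-1234-567812345678"
print(type(decode_value("uuid_val", raw)).__name__)
print(type(decode_value("string_val", raw)).__name__)
print(type(decode_value("string_val", raw, "UUID")).__name__)
```

The third call corresponds to the `feature_types` metadata case: without the hint, a legacy `string_val` cannot be told apart from an ordinary string.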

Changes

| Layer | Files | Description |
|---|---|---|
| Proto | Value.proto, generated *_pb2.py/*_pb2.pyi | Add UUID=30, TIME_UUID=31, UUID_LIST=32, TIME_UUID_LIST=33 to ValueType.Enum; add uuid_val, time_uuid_val, uuid_list_val, time_uuid_list_val to Value.oneof |
| Type system | value_type.py, types.py | Add UUID, TIME_UUID, UUID_LIST, TIME_UUID_LIST enums and Uuid/TimeUuid aliases |
| Type conversion | type_map.py | Add mappings to ~11 conversion dicts (proto, PyArrow, pandas, PostgreSQL, Couchbase, Snowflake); switch from string_val to uuid_val; add PROTO_VALUE_TO_VALUE_TYPE_MAP entries for UUID fields |
| Online response | online_response.py, online_store.py, feature_store.py, utils.py | Pass feature_types metadata for backward-compatible deserialization |
| ODFV | on_demand_feature_view.py | Add UUID/TIME_UUID sample values for schema inference |

Backward Compatibility

  • Data previously stored as string_val still deserializes correctly via the feature_types side-channel
  • New materializations use dedicated uuid_val/time_uuid_val proto fields
  • feast_value_type_to_python_type(v) without feature_type now returns uuid.UUID for uuid_val fields (previously returned plain string for string_val)
  • PostgreSQL uuid columns now infer as ValueType.UUID (previously ValueType.STRING)
  • Go SDK: proto changes compile without errors; UUID handling logic is not implemented (out of scope)

Tests

  • test_types.py: Uuid/TimeUuid ↔ ValueType bidirectional conversion, Array types
  • test_type_map.py: Proto roundtrip with uuid_val, uuid.UUID object return, backward compatibility for string_val, UUID list roundtrip, PostgreSQL mapping
  • All 78 unit tests passing
  • ruff lint and format checks passing

@soooojinlee soooojinlee requested review from a team as code owners February 8, 2026 14:55
@soooojinlee soooojinlee requested review from ejscribner, robhowley and tokoko and removed request for a team February 8, 2026 14:56
devin-ai-integration[bot]

This comment was marked as resolved.

@soooojinlee soooojinlee force-pushed the feat/add-uuid-feature-types branch from 1d4cd01 to 4a5c932 Compare February 8, 2026 15:00

@nquinn408
Contributor

@soooojinlee , thanks so much for putting this together! Can you rebase to bring this PR up to date?

@ntkathole ntkathole force-pushed the feat/add-uuid-feature-types branch from 2c56521 to 1fd106a Compare February 11, 2026 04:44
@ntkathole ntkathole force-pushed the feat/add-uuid-feature-types branch 3 times, most recently from 26198a6 to 525ac72 Compare February 11, 2026 14:14
@soooojinlee soooojinlee force-pushed the feat/add-uuid-feature-types branch 3 times, most recently from cb6dd44 to 54c2eae Compare February 12, 2026 05:55
@ntkathole ntkathole force-pushed the feat/add-uuid-feature-types branch from 2dd648a to b280364 Compare February 16, 2026 06:44
@nquinn408
Contributor

@soooojinlee , can you also add support for UUID_SET and TIME_UUID_SET?

@nquinn408
Contributor

@soooojinlee , can you also update the docs with the newly supported types?

@soooojinlee soooojinlee force-pushed the feat/add-uuid-feature-types branch 2 times, most recently from 180434b to 41cfe7e Compare February 17, 2026 04:26
@soooojinlee
Contributor Author

> @soooojinlee , can you also add support for UUID_SET and TIME_UUID_SET?

> @soooojinlee , can you also update the docs with the newly supported types?

I added UUID_SET / TIME_UUID_SET support and updated the type system documentation as requested, in 41cfe7e.

devin-ai-integration[bot]

This comment was marked as resolved.


@ntkathole
Member

@soooojinlee please resolve the conflicts

@soooojinlee soooojinlee force-pushed the feat/add-uuid-feature-types branch 2 times, most recently from 07fbde4 to 6aa853a Compare March 27, 2026 14:49
@soooojinlee soooojinlee requested a review from ntkathole March 27, 2026 14:52
@nquinn408
Contributor

@soooojinlee , can we update this and merge it before the VALUE_SET and VALUE_LIST change?

Contributor

@nquinn408 nquinn408 left a comment


Looks good. Thanks!

soooojinlee and others added 15 commits April 1, 2026 18:01
@ntkathole ntkathole force-pushed the feat/add-uuid-feature-types branch from 3fcfead to 7b600cb Compare April 1, 2026 12:31
@ntkathole ntkathole merged commit 5d6e311 into feast-dev:master Apr 1, 2026
28 of 29 checks passed
yuan1j pushed a commit to yuan1j/feast that referenced this pull request Apr 2, 2026
…-dev#5951)

* feat: Add UUID and TIME_UUID as feature types (feast-dev#5885)

Signed-off-by: soojin <soojin@dable.io>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: soojin <soojin@dable.io>

* test: Add unit tests for UUID type support

Signed-off-by: soojin <soojin@dable.io>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: soojin <soojin@dable.io>

* style: Fix ruff lint and formatting issues

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: soojin <soojin@dable.io>

* feat: Add dedicated UUID/TIME_UUID proto fields to Value.proto

Add uuid_val, time_uuid_val, uuid_list_val, time_uuid_list_val as
dedicated oneof fields in the Value proto message, replacing the
previous reuse of string_val/string_list_val. This allows UUID types
to be identified from the proto field alone without requiring a
feature_types side-channel. Backward compatibility is maintained for
data previously stored as string_val.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: soojin <soojin@dable.io>

* fix: Address review feedback for UUID type support

Signed-off-by: soojin <soojin@dable.io>

* fix: Address review feedback for UUID type support

Signed-off-by: soojin <soojin@dable.io>

* fix: Address review feedback

Signed-off-by: soojin <soojin@dable.io>

* fix: Convert uuid.UUID to string for Arrow and JSON serialization

Signed-off-by: soojin <soojin@dable.io>

* feat: Add UUID_SET/TIME_UUID_SET support and update type system docs

Add Set(Uuid) and Set(TimeUuid) as feature types with full roundtrip
support, backward compatibility, and documentation for all UUID types.

Signed-off-by: soojin <soojin@dable.io>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: Preserve PDF_BYTES/IMAGE_BYTES enum values and add missing SET type mappings

Keep PDF_BYTES=30 and IMAGE_BYTES=31 at their upstream values instead of
renumbering them. Shift UUID types to 32-37 in both proto and Python enum.

Also add missing SET type entries in _convert_value_type_str_to_value_type(),
convert_array_column(), and _get_sample_values_by_type() for completeness.

Signed-off-by: soojin <soojin@dable.io>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: Correct misleading comment in Set.__init__

The comment claimed Sets do not support UUID/TimeUuid but the code
intentionally allows them. Updated to reflect actual behavior.

Signed-off-by: soojin <soojin@dable.io>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* refactor: Extract UUID Arrow conversion into helper and move import to top

Signed-off-by: soojin <soojin@dable.io>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: Handle UUID types in _proto_value_to_transport_value for JSON serialization

Return UUID proto fields as plain strings instead of falling through to
feast_value_type_to_python_type which converts them to uuid.UUID objects
that are not JSON-serializable, causing TypeError during HTTP transport.

Signed-off-by: soojin <soojin@dable.io>

* chore: Regenerate protobuf files with UUID type support

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: soojin <soojin@dable.io>

* fix: Fix mypy type ignore comments for UUID collection conversions

Add [misc] error code to type: ignore comments in UUID list/set
proto conversion to satisfy mypy's stricter checking.

Signed-off-by: Soojin Lee <soooojin.lee@gmail.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: soojin <soojin@dable.io>

---------

Signed-off-by: soojin <soojin@dable.io>
Signed-off-by: Soojin Lee <soooojin.lee@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: yuanjun220 <1069645408@qq.com>
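
The JSON serialization fix in the commit list above rests on a standard-library behavior that is easy to reproduce: `json.dumps` raises `TypeError` for `uuid.UUID` objects. The sketch below shows the failure and the string conversion that avoids it. The helper `to_transport_value` is a hypothetical stand-in and does not reproduce Feast's `_proto_value_to_transport_value`.

```python
import json
import uuid


def to_transport_value(value):
    """Hypothetical stand-in: make UUIDs JSON-safe by converting them to strings."""
    if isinstance(value, uuid.UUID):
        return str(value)
    if isinstance(value, (list, tuple)):
        # Convert each element so UUID lists are handled too.
        return [to_transport_value(v) for v in value]
    return value


user_id = uuid.UUID("12345678-1234-5678-1234-567812345678")

# A raw uuid.UUID is rejected by the default JSON encoder.
try:
    json.dumps({"user_id": user_id})
    raised = False
except TypeError:
    raised = True

# Converting to a plain string first makes the payload serializable.
payload = json.dumps({"user_id": to_transport_value(user_id)})
print(raised, payload)
```

Returning strings at the transport boundary keeps the richer `uuid.UUID` objects for in-process callers while leaving HTTP responses encodable.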