45 changes: 43 additions & 2 deletions docs/reference/type-system.md
@@ -3,7 +3,7 @@
## Motivation

Feast uses an internal type system to provide guarantees on training and serving data.
Feast supports primitive types, array types, and map types for feature values.
Feast supports primitive types, array types, set types, and map types for feature values.
Null types are not supported, although the `UNIX_TIMESTAMP` type is nullable.
The type system is controlled by [`Value.proto`](https://github.com/feast-dev/feast/blob/master/protos/feast/types/Value.proto) in protobuf and by [`types.py`](https://github.com/feast-dev/feast/blob/master/sdk/python/feast/types.py) in Python.
Type conversion logic can be found in [`type_map.py`](https://github.com/feast-dev/feast/blob/master/sdk/python/feast/type_map.py).
@@ -40,6 +40,23 @@ All primitive types have corresponding array (list) types:
| `Array(Bool)` | `List[bool]` | List of booleans |
| `Array(UnixTimestamp)` | `List[datetime]` | List of timestamps |

### Set Types

All primitive types have corresponding set types for storing unique values (Map types have no set counterpart):

| Feast Type | Python Type | Description |
|------------|-------------|-------------|
| `Set(Int32)` | `Set[int]` | Set of unique 32-bit integers |
| `Set(Int64)` | `Set[int]` | Set of unique 64-bit integers |
| `Set(Float32)` | `Set[float]` | Set of unique 32-bit floats |
| `Set(Float64)` | `Set[float]` | Set of unique 64-bit floats |
| `Set(String)` | `Set[str]` | Set of unique strings |
| `Set(Bytes)` | `Set[bytes]` | Set of unique binary data |
| `Set(Bool)` | `Set[bool]` | Set of unique booleans |
| `Set(UnixTimestamp)` | `Set[datetime]` | Set of unique timestamps |

**Note:** Set types store each value at most once. Converting a list or another iterable to a set drops duplicate values.
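
What counts as a duplicate is decided by Python's equality and hashing rules before any value reaches Feast. The sketch below uses only the standard library (no Feast calls) to show two cases that can shrink a set unexpectedly. How Feast's own conversion code treats these values is not covered here.

```python
from datetime import datetime, timedelta, timezone

# Plain Python behavior, shown without any Feast API.

# 1 == True == 1.0 and all three hash alike, so a set keeps only one of them.
mixed = {1, True, 1.0}
print(len(mixed))  # 1

# Timezone-aware datetimes compare by instant, not by wall-clock fields.
noon_utc = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
same_instant = datetime(2024, 1, 1, 13, 0, tzinfo=timezone(timedelta(hours=1)))
timestamps = {noon_utc, same_instant}
print(len(timestamps))  # 1
```

Keeping booleans, integers, and floats in separately typed fields avoids the first case. Normalizing timestamps to one timezone makes the second easier to spot.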

### Map Types

Map types allow storing dictionary-like data structures:
@@ -60,7 +77,7 @@ from datetime import timedelta
from feast import Entity, FeatureView, Field, FileSource
from feast.types import (
    Int32, Int64, Float32, Float64, String, Bytes, Bool, UnixTimestamp,
    Array, Map
    Array, Set, Map
)

# Define a data source
@@ -101,6 +118,12 @@ user_features = FeatureView(
        Field(name="notification_settings", dtype=Array(Bool)),
        Field(name="login_timestamps", dtype=Array(UnixTimestamp)),

        # Set types (unique values only)
        Field(name="visited_pages", dtype=Set(String)),
        Field(name="unique_categories", dtype=Set(Int32)),
        Field(name="tag_ids", dtype=Set(Int64)),
        Field(name="preferred_languages", dtype=Set(String)),

        # Map types
        Field(name="user_preferences", dtype=Map),
        Field(name="metadata", dtype=Map),
@@ -110,6 +133,24 @@ user_features = FeatureView(
)
```

### Set Type Usage Examples

Sets store unique values and automatically remove duplicates:

```python
# Simple set: the literal lists "products" twice, but a Python set keeps one copy
visited_pages = {"home", "products", "checkout", "products"}
# Feast will store this as: {"home", "products", "checkout"}

# Integer set: the repeated 1 and 2 are dropped
unique_categories = {1, 2, 3, 2, 1}
# Feast will store this as: {1, 2, 3}

# Converting a list with duplicates to a set
tag_list = [100, 200, 300, 100, 200]
tag_ids = set(tag_list)  # {100, 200, 300}
```
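
Python sets carry no ordering guarantee, so a set cannot preserve the first-seen order of `tag_list`. If upstream code needs a stable order for logging or comparison before handing values to a `Set` field, one option is sketched below using only the standard library. `dedupe_keep_order` is a hypothetical helper name, not part of Feast.

```python
from typing import Hashable, Iterable, List


def dedupe_keep_order(values: Iterable[Hashable]) -> List[Hashable]:
    """Drop duplicates while keeping the first occurrence of each value."""
    # dict preserves insertion order (Python 3.7+), so its keys act as an ordered set.
    return list(dict.fromkeys(values))


tag_list = [100, 200, 300, 100, 200]
ordered_unique = dedupe_keep_order(tag_list)
print(ordered_unique)  # [100, 200, 300]

# Same members as set(tag_list); only the container type differs.
print(set(ordered_unique) == set(tag_list))  # True
```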

### Map Type Usage Examples

Maps can store complex nested data structures:
44 changes: 44 additions & 0 deletions protos/feast/types/Value.proto
@@ -45,6 +45,14 @@ message ValueType {
        NULL = 19;
        MAP = 20;
        MAP_LIST = 21;
        BYTES_SET = 22;
        STRING_SET = 23;
        INT32_SET = 24;
        INT64_SET = 25;
        DOUBLE_SET = 26;
        FLOAT_SET = 27;
        BOOL_SET = 28;
        UNIX_TIMESTAMP_SET = 29;
    }
}

@@ -72,6 +80,14 @@ message Value {
        Null null_val = 19;
        Map map_val = 20;
        MapList map_list_val = 21;
        BytesSet bytes_set_val = 22;
        StringSet string_set_val = 23;
        Int32Set int32_set_val = 24;
        Int64Set int64_set_val = 25;
        DoubleSet double_set_val = 26;
        FloatSet float_set_val = 27;
        BoolSet bool_set_val = 28;
        Int64Set unix_timestamp_set_val = 29;
    }
}
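
`unix_timestamp_set_val` reuses `Int64Set`, so a set of timestamps travels as a set of integers. Below is a rough sketch of that encoding, assuming whole epoch seconds in UTC. The unit is an assumption and should be checked against `type_map.py`. `timestamps_to_int64_values` is a hypothetical helper, not a Feast function.

```python
from datetime import datetime, timezone
from typing import Iterable, List


def timestamps_to_int64_values(timestamps: Iterable[datetime]) -> List[int]:
    """Encode timezone-aware datetimes as unique epoch-second integers.

    Assumption: whole seconds since the Unix epoch. Sorting is only there
    to make the output deterministic.
    """
    return sorted({int(ts.timestamp()) for ts in timestamps})


logins = [
    datetime(1970, 1, 1, 0, 0, 1, tzinfo=timezone.utc),
    datetime(1970, 1, 1, 0, 1, 0, tzinfo=timezone.utc),
    datetime(1970, 1, 1, 0, 0, 1, tzinfo=timezone.utc),  # duplicate of the first
]
print(timestamps_to_int64_values(logins))  # [1, 60]
```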

@@ -107,6 +123,34 @@ message BoolList {
    repeated bool val = 1;
}

message BytesSet {
    repeated bytes val = 1;
}

message StringSet {
    repeated string val = 1;
}

message Int32Set {
    repeated int32 val = 1;
}

message Int64Set {
    repeated int64 val = 1;
}

message DoubleSet {
    repeated double val = 1;
}

message FloatSet {
    repeated float val = 1;
}

message BoolSet {
    repeated bool val = 1;
}

message Map {
    map<string, Value> val = 1;
}
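
A `repeated` field in protobuf does not enforce uniqueness by itself, so the `*Set` messages depend on the code that fills them. Below is a minimal sketch of that idea, using a dataclass as a hypothetical stand-in for the generated `StringSet` class. The real conversion lives in `type_map.py` and may differ.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Set


@dataclass
class StringSetStandIn:
    """Hypothetical stand-in for the generated StringSet message."""

    val: List[str] = field(default_factory=list)  # mirrors `repeated string val = 1;`


def to_string_set(values: Iterable[str]) -> StringSetStandIn:
    # Deduplicate on write; sorting is an assumption made for stable output.
    return StringSetStandIn(val=sorted(set(values)))


def from_string_set(message: StringSetStandIn) -> Set[str]:
    # Deduplicate on read too, in case a writer populated the field directly.
    return set(message.val)


message = to_string_set(["checkout", "home", "checkout"])
print(message.val)  # ['checkout', 'home']
print(from_string_set(StringSetStandIn(val=["home", "home"])))  # {'home'}
```

Deduplicating on both sides keeps readers correct even when a message was produced by a client that skipped the write-side step.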
72 changes: 43 additions & 29 deletions sdk/python/feast/protos/feast/types/Value_pb2.py

Some generated files are not rendered by default.