feat: Optimize DynamoDB online store for improved latency #5889
+148
−74
What this PR does / why we need it:
Optimizes the DynamoDB online store implementation to reduce online feature serving latency.
- O(1) dictionary lookup instead of O(n log n) sorting - `_process_batch_get_response` now uses dictionary-based lookup for response ordering instead of sorting.
- Cached TypeDeserializer - Added class-level cached `TypeDeserializer` instance to avoid per-request object instantiation overhead in async reads.
- Entity ID computation caching - `_to_entity_ids` now caches computed entity IDs within a request to avoid redundant hashing for duplicate entity keys.
- VPC endpoint support for async client - The `endpoint_url` config is now properly passed to the async aiobotocore client, enabling DynamoDB VPC endpoints for reduced network latency.
- Improved default configuration values:
  - `batch_size`: 40 → 100 (max allowed by DynamoDB BatchGetItem)
  - `max_pool_connections`: 10 → 50 (better concurrency)
  - `keepalive_timeout`: 12s → 30s (better connection reuse)
  - `connect_timeout`: 60s → 5s (faster failure detection)
  - `read_timeout`: 60s → 10s (faster failure detection)
  - `total_max_retry_attempts`: None → 3 (bounded retries)
  - `retry_mode`: None → "adaptive" (smart retry with rate limiting)
- Pre-allocated result lists - Response processing now pre-allocates result lists instead of using append-based growth.
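
For reviewers, the dictionary-lookup, pre-allocation, and per-request caching ideas above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `order_responses` and `compute_entity_ids` are hypothetical names, and the response items are assumed to be plain dicts with a string `entity_id` field rather than the typed attribute values DynamoDB actually returns.

```python
from typing import Callable, Dict, List, Optional, Sequence


def order_responses(
    entity_ids: Sequence[str],
    responses: Sequence[Dict[str, str]],
) -> List[Optional[Dict[str, str]]]:
    """Return responses in the order of entity_ids; missing IDs yield None."""
    # Build the lookup table in one pass over the unordered responses.
    # (Assumed item shape: {"entity_id": "<id>", ...}.)
    by_id = {item["entity_id"]: item for item in responses}

    # Pre-allocate the result so each slot is filled by index, not appended.
    ordered: List[Optional[Dict[str, str]]] = [None] * len(entity_ids)
    for i, entity_id in enumerate(entity_ids):
        ordered[i] = by_id.get(entity_id)  # O(1) average per lookup
    return ordered


def compute_entity_ids(
    keys: Sequence[str],
    hash_fn: Callable[[str], str],
) -> List[str]:
    """Hash each distinct key once per call; duplicates reuse the cached ID."""
    cache: Dict[str, str] = {}  # lives only for the duration of this request
    result: List[str] = []
    for key in keys:
        if key not in cache:
            cache[key] = hash_fn(key)
        result.append(cache[key])
    return result
```

In this sketch, entity IDs absent from the response stay `None`, and the cache is scoped to a single call, so nothing is shared across requests.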