Skip to content

fix(spanner): catch recursion and decode errors in proto parsing to p…#16561

Merged
sinhasubham merged 1 commit intomainfrom
proto_parsing_fix
Apr 9, 2026
Merged

fix(spanner): catch recursion and decode errors in proto parsing to p…#16561
sinhasubham merged 1 commit intomainfrom
proto_parsing_fix

Conversation

@sinhasubham
Copy link
Copy Markdown
Contributor

This PR fixes a Persistent Stored Denial of Service (DoS) vulnerability in the google-cloud-spanner Python SDK (Issue 479858035).

The Problem
When the SDK attempts to deserialize a Protobuf-encoded row (via _parse_proto() in _helpers.py) that contains a maliciously crafted "recursion bomb" (e.g., a ListValue nested 1,000+ times), it triggers a DecodeError or RecursionError. This unhandled exception crashes the consumer thread and blocks the entire result set stream ("pipeline blackhole").

The Solution
We modify _parse_proto to wrap the ParseFromString() call in a defensive try...except block:
Catch RecursionError (triggered if Python hits its stack limit first in pure Python implementations).
Catch google.protobuf.message.DecodeError (triggered by the C++ extension's internal limits).

If an error is caught: A warning is logged. The original raw bytes_value is returned as a fallback (consistent with existing behavior when no prototype is found). This allows the stream iterator to continue processing subsequent rows.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces error handling for Protobuf parsing within the _parse_proto helper, ensuring that DecodeError or RecursionError results in returning raw bytes rather than a failure. Review feedback identifies that the log variable is undefined and should be replaced with _LOGGER, along with removing redundant prefixes in the log message. Furthermore, the newly added unit tests require a fix to decode base64 encoded bytes into UTF-8 strings to avoid TypeError when populating Protobuf Value messages.

@sinhasubham sinhasubham marked this pull request as ready for review April 7, 2026 06:28
@sinhasubham sinhasubham requested review from a team as code owners April 7, 2026 06:28
proto_message.ParseFromString(bytes_value)
return proto_message
try:
proto_message.ParseFromString(bytes_value)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a way to enforce nesting level?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The standard Python Protobuf API (ParseFromString) does not expose a parameter to enforce a custom recursion or nesting limit during parsing.
Although, we did consider using sys.setrecursionlimit() to force a lower limit (e.g. 100) right before parsing but had to reject it because sys.setrecursionlimit is process-wide. In a multi-threaded environment, lowering the limit for this one call could cause arbitrary valid code run by other threads to crash.

I couldn't find any other thread-safe way to enforce a custom limit during parsing in Python Protobuf, the most robust solution is to catch the errors the parser naturally throws (DecodeError and RecursionError) and fall back.

In Java, we can explicitly enforce and set a custom nesting level while parsing which is not possible in Python.

@sinhasubham sinhasubham merged commit 70dc6bf into main Apr 9, 2026
31 checks passed
@sinhasubham sinhasubham deleted the proto_parsing_fix branch April 9, 2026 03:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants