Skip to content

fix(sqlalchemy-spanner): handle TOKENLIST and unknown column types in introspection#16622

Open
waiho-gumloop wants to merge 1 commit intogoogleapis:mainfrom
waiho-gumloop:fix/sqlalchemy-spanner-tokenlist-type
Open

fix(sqlalchemy-spanner): handle TOKENLIST and unknown column types in introspection#16622
waiho-gumloop wants to merge 1 commit intogoogleapis:mainfrom
waiho-gumloop:fix/sqlalchemy-spanner-tokenlist-type

Conversation

@waiho-gumloop
Copy link
Copy Markdown
Contributor

@waiho-gumloop waiho-gumloop commented Apr 13, 2026

Description

SpannerDialect._designate_type() crashes with KeyError: 'TOKENLIST' when introspecting tables that have TOKENLIST columns (introduced by Spanner full-text search).

Root cause

  1. _type_map does not include TOKENLIST.
  2. _designate_type() performs a bare dict lookup with no fallback, so any unrecognized type crashes all introspection — not just the one column.

Changes

1. First-class TOKENLIST type (matching the Go client's approach)

Rather than mapping to NullType (which would lose type information and roundtrip as INT64 via _type_map_inv), this PR adds a proper TOKENLIST class extending TypeEngine:

  • Forward map: _type_map["TOKENLIST"] = TOKENLIST
  • Inverse map: _type_map_inv[TOKENLIST] = "TOKENLIST"
  • DDL compilation: SpannerTypeCompiler.visit_TOKENLIST() returns "TOKENLIST"

This ensures correct roundtripping through Alembic --autogenerate and other schema tools. It also allows TOKENLIST columns to be present in reflected models so they can be referenced in SEARCH(), SCORE(), and SNIPPET() queries via SQLAlchemy.

TOKENLIST columns are generated (Computed) and HIDDEN (excluded from SELECT *), but they are readable by name and actively used in full-text search queries. They should be reflected in models, not filtered out.

2. Unknown type fallback in _designate_type()

Added try/except KeyError that returns NullType with a warnings.warn(). Future-proofs against new Spanner types being added before the dialect is updated.

Prior art — Go client fix

The Go spansql package had the identical issue. The fix added TOKENLIST as a first-class TypeBase constant:

This Python PR mirrors that approach within the SQLAlchemy dialect's architecture.

Fixes #16621

Testing

Added tests/unit/test_types.py:

  • TestDesignateType: known types, STRING/BYTES with length, TOKENLIST → TOKENLIST type, unknown type → NullType with warning, ARRAY
  • TestTokenlistType: forward/inverse type map entries, type compiler roundtrip (TOKENLIST()"TOKENLIST"), TypeEngine subclass check

@waiho-gumloop waiho-gumloop requested a review from a team as a code owner April 13, 2026 03:33
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request adds support for the TOKENLIST Spanner type by mapping it to NullType and introduces a fallback mechanism in _designate_type to handle unknown types with a warning. Unit tests were also added to cover various type mapping scenarios. A review comment identified a return type inconsistency in _designate_type, where type classes were being returned instead of instances, which would lead to test failures.

Comment on lines +1231 to +1238
try:
return _type_map[str_repr]
except KeyError:
warnings.warn(
"Did not recognize Spanner type '%s', "
"mapping it to NullType" % str_repr
)
return types.NullType()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

There is an inconsistency in the return types of _designate_type. When a type is found in _type_map, the code returns the type class (e.g., types.Boolean), but the fallback returns an instance (types.NullType()).

Furthermore, the unit tests added in this PR (e.g., test_known_types and test_tokenlist_maps_to_nulltype) use isinstance(result, types.Boolean), which will fail if a class is returned instead of an instance.

To ensure consistency and fix the tests, you should instantiate the type from the map. Additionally, adding stacklevel=2 to the warning is recommended so it points to the caller of the introspection.

Suggested change
try:
return _type_map[str_repr]
except KeyError:
warnings.warn(
"Did not recognize Spanner type '%s', "
"mapping it to NullType" % str_repr
)
return types.NullType()
try:
return _type_map[str_repr]()
except KeyError:
warnings.warn(
"Did not recognize Spanner type '%s', "
"mapping it to NullType" % str_repr,
stacklevel=2
)
return types.NullType()

… in introspection

Spanner full-text search uses TOKENLIST columns, but the dialect does
not recognize this type — _type_map has no entry and _designate_type()
crashes with KeyError on any unrecognized type string.

This change:
1. Adds a first-class TOKENLIST type (TypeEngine subclass) with
   forward mapping (_type_map), inverse mapping (_type_map_inv), and
   DDL compilation (SpannerTypeCompiler.visit_TOKENLIST). This ensures
   schema introspection roundtrips correctly through Alembic and
   other DDL tools, and allows TOKENLIST columns to be referenced
   in SEARCH()/SCORE()/SNIPPET() queries via SQLAlchemy.
2. Adds a fallback in _designate_type() so unrecognized types return
   NullType with a warning instead of crashing. This future-proofs
   against new Spanner types.

The Go client had the same gap and fixed it in
googleapis/google-cloud-go#11522 (released in spanner v1.78.0),
which added TOKENLIST as a first-class TypeBase.

Fixes googleapis#16621
@waiho-gumloop waiho-gumloop force-pushed the fix/sqlalchemy-spanner-tokenlist-type branch 2 times, most recently from 736e001 to c2f5788 Compare April 13, 2026 06:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

sqlalchemy-spanner: _designate_type crashes with KeyError on TOKENLIST columns

1 participant