
feat(bigframes): Support loading avro, orc data #16555

Draft
TrevorBergeron wants to merge 1 commit into main from tbergeron_bf_read_orc_avro

Conversation

@TrevorBergeron
Contributor

Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:

  • Make sure to open an issue as a bug/issue before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
  • Ensure the tests and linter pass
  • Code coverage does not decrease (if any source code was changed)
  • Appropriate docs were updated (if necessary)

Fixes #<issue_number_goes_here> 🦕

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces support for reading ORC and Avro files into BigQuery DataFrames by implementing read_orc and read_avro methods in the Session class and providing corresponding API wrappers. Review feedback identifies a bug in the system tests where to_orc is called on a BigFrames DataFrame instead of a pandas DataFrame. Additionally, several improvements are suggested to maintain alphabetical order in imports and function definitions, along with a minor wording update for an error message to improve clarity.

)
df_write = df_in.reset_index(drop=False)
df_write.index.name = f"ordering_id_{random.randrange(1_000_000)}"
df_write.to_orc(write_path)
Contributor


Severity: high

It appears bigframes.dataframe.DataFrame does not currently implement a to_orc method. Since BigQuery also does not support exporting to ORC, you likely need to convert to a pandas DataFrame first to write the test data to GCS.

Suggested change:
- df_write.to_orc(write_path)
+ df_write.to_pandas().to_orc(write_path)

Comment on lines +464 to +465
read_orc,
read_avro,
Contributor


Severity: medium

The functions should be listed in alphabetical order to maintain consistency with the rest of the file. While the preceding lines are context, swapping these two will at least ensure they are sorted relative to each other.

Suggested change:
- read_orc,
- read_avro,
+ read_avro,
+ read_orc,

Comment on lines +501 to +502
"read_orc",
"read_avro",
Contributor


Severity: medium

The entries in __all__ should be in alphabetical order. Swapping these two improves consistency.

Suggested change:
- "read_orc",
- "read_avro",
+ "read_avro",
+ "read_orc",
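The alphabetical-order convention is easy to enforce mechanically. A hypothetical unit-test sketch (the sample list stands in for the module's real __all__, which a real test would import instead):

```python
# Sketch of a test that enforces alphabetical order in __all__.
# The list below is illustrative; in a real test you would import the
# module and inspect its __all__ attribute directly.
sample_all = [
    "read_avro",
    "read_csv",
    "read_orc",
    "read_parquet",
]


def is_sorted(names: list) -> bool:
    """Return True if the names are already in alphabetical order."""
    return names == sorted(names)


assert is_sorted(sample_all)
assert not is_sorted(["read_orc", "read_avro"])
```

A check like this would have flagged both of the ordering issues raised in this review automatically.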

Comment on lines +592 to +621
def read_orc(
    path: str | IO["bytes"],
    *,
    engine: str = "auto",
    write_engine: constants.WriteEngineType = "default",
) -> bigframes.dataframe.DataFrame:
    return global_session.with_default_session(
        bigframes.session.Session.read_orc,
        path,
        engine=engine,
        write_engine=write_engine,
    )


read_orc.__doc__ = inspect.getdoc(bigframes.session.Session.read_orc)


def read_avro(
    path: str | IO["bytes"],
    *,
    engine: str = "auto",
) -> bigframes.dataframe.DataFrame:
    return global_session.with_default_session(
        bigframes.session.Session.read_avro,
        path,
        engine=engine,
    )


read_avro.__doc__ = inspect.getdoc(bigframes.session.Session.read_avro)
Contributor


Severity: medium

Reorder read_avro and read_orc to maintain alphabetical order. Additionally, these should ideally be placed before read_parquet in the file to follow the alphabetical convention used in this module.

def read_avro(
    path: str | IO["bytes"],
    *,
    engine: str = "auto",
) -> bigframes.dataframe.DataFrame:
    return global_session.with_default_session(
        bigframes.session.Session.read_avro,
        path,
        engine=engine,
    )


read_avro.__doc__ = inspect.getdoc(bigframes.session.Session.read_avro)


def read_orc(
    path: str | IO["bytes"],
    *,
    engine: str = "auto",
    write_engine: constants.WriteEngineType = "default",
) -> bigframes.dataframe.DataFrame:
    return global_session.with_default_session(
        bigframes.session.Session.read_orc,
        path,
        engine=engine,
        write_engine=write_engine,
    )


read_orc.__doc__ = inspect.getdoc(bigframes.session.Session.read_orc)
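The `read_orc.__doc__ = inspect.getdoc(...)` pattern in this diff copies the docstring from the Session method onto the module-level wrapper so the two stay in sync. A self-contained sketch of the same technique (the class and function bodies here are illustrative, not bigframes code):

```python
import inspect


class Session:
    def read_orc(self, path):
        """Read an ORC file into a DataFrame."""
        ...


def read_orc(path):
    # Module-level wrapper; delegates to the default session in the real
    # code. The body is elided in this sketch.
    ...


# Copy the method's docstring onto the wrapper. inspect.getdoc also
# normalizes indentation, unlike reading __doc__ directly.
read_orc.__doc__ = inspect.getdoc(Session.read_orc)

assert read_orc.__doc__ == "Read an ORC file into a DataFrame."
```

Keeping a single source of truth for the docstring avoids the two copies drifting apart as the Session method evolves.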

Comment on lines +1399 to +1400
"please use the 'bigquery' engine by setting `engine='bigquery'` in "
"your configuration."
Contributor


Severity: medium

The phrase 'in your configuration' might be confusing as the engine is typically passed as an argument to the function call. Consider clarifying this to 'in the function call'.

Suggested change:
- "please use the 'bigquery' engine by setting `engine='bigquery'` in "
- "your configuration."
+ "please use the 'bigquery' engine by setting `engine='bigquery'` in "
+ "the function call."
