
feat: add bigframes calling Python UDF code sample #13919

Open
tswast wants to merge 8 commits into GoogleCloudPlatform:main from tswast:tswast-bigframes

Conversation

Contributor

@tswast tswast commented Mar 24, 2026

This sample is intended for https://docs.cloud.google.com/bigquery/docs/user-defined-functions-python#udf_users
See internal issue b/494558638.

Description

Fixes #

Note: Before submitting a pull request, please open an issue for discussion if you are not associated with Google.

Checklist

@tswast tswast requested review from a team and chalmerlowe as code owners March 24, 2026 20:56

snippet-bot bot commented Mar 24, 2026

Here is the summary of changes.

You are about to add 1 region tag.

This comment is generated by snippet-bot.
If you find problems with this result, please file an issue at:
https://github.com/googleapis/repo-automation-bots/issues.
To update this comment, add snippet-bot:force-run label or use the checkbox below:

  • Refresh this comment

@product-auto-label product-auto-label bot added the "samples" (Issues that are directly related to samples.) and "api: bigquery" (Issues related to the BigQuery API.) labels Mar 24, 2026
@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a new, self-contained code sample within the bigquery/bigframes directory. The primary purpose of this sample is to illustrate the process of invoking Python User-Defined Functions (UDFs) through the BigQuery DataFrames library. This addition provides a practical example that will enhance the existing documentation for BigQuery Python UDFs, offering users a clear guide on integrating UDFs with BigFrames for data manipulation.

Highlights

  • New Code Sample: Added a new code sample demonstrating how to call Python User-Defined Functions (UDFs) using BigQuery DataFrames (BigFrames).
  • Documentation Support: This sample is specifically intended to support the official BigQuery Python UDF documentation on cloud.google.com.
  • New Directory Structure: Introduced a new directory bigquery/bigframes to house BigFrames-related code samples and their configurations.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

| Feature | Command | Description |
| --- | --- | --- |
| Code Review | /gemini review | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in pull request comments and review comments. |
| Help | /gemini help | Displays a list of available commands. |

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@tswast tswast requested review from a team as code owners March 24, 2026 20:59
Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces a new bigquery/bigframes directory, adding a README, a test for calling Python UDFs, pytest fixtures, nox configuration, and dependency files. The review comments suggest improving the formatting of multiline XML strings in call_python_udf_test.py for clarity, making the --project_id argument required in the test script to prevent unexpected behavior, and setting enforce_type_hints to True in noxfile_config.py to align with project standards for new samples.
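
For the last point, a minimal sketch of what such an override might look like. Only the file name noxfile_config.py and the enforce_type_hints key come from the review summary above; the TEST_CONFIG_OVERRIDE name is an assumption based on the convention used by other samples in this repository, and the file in this PR may contain additional keys.

```python
# noxfile_config.py (sketch, not the file from this PR)
# Assumption: the shared noxfile reads its overrides from a dict named
# TEST_CONFIG_OVERRIDE. Only "enforce_type_hints" is taken from the review.
TEST_CONFIG_OVERRIDE = {
    # Ask the lint session to require type hints in this sample directory.
    "enforce_type_hints": True,
}
```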

Comment on lines +30 to +51
    xml_series = bpd.Series(
        [
            '''
            <book id="1">
                <title>The Great Gatsby</title>
                <author>F. Scott Fitzgerald</author>
            </book>
            ''',
            '''
            <book id="2">
                <title>1984</title>
                <author>George Orwell</author>
            </book>
            ''',
            '''
            <book id="3">
                <title>Brave New World</title>
                <author>Aldous Huxley</author>
            </book>
            ''',
        ]
    )
Contributor

medium

The multiline string literals for xml_series contain significant leading whitespace and newlines due to Python's indentation. While the XML parser might handle this, it's best practice to provide clean data and avoid ambiguity. This also improves the readability and maintainability of the sample code.

    xml_series = bpd.Series(
        [
            """<book id="1">
    <title>The Great Gatsby</title>
    <author>F. Scott Fitzgerald</author>
</book>""",
            """<book id="2">
    <title>1984</title>
    <author>George Orwell</author>
</book>""",
            """<book id="3">
    <title>Brave New World</title>
    <author>Aldous Huxley</author>
</book>""",
        ]
    )

Contributor Author

I disagree. It looks better with the whitespace.
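
The whitespace question in this thread can be checked locally. The sketch below assumes the standard library's xml.etree.ElementTree as the parser, which may not be the parser the UDF in this PR uses. It parses one indented string and a dedented copy of it, then compares the extracted titles; the title_of helper is hypothetical and exists only for this illustration.

```python
import textwrap
import xml.etree.ElementTree as ET
from typing import Optional

# Indented literal, similar in shape to the strings in the sample.
indented = '''
    <book id="1">
        <title>The Great Gatsby</title>
        <author>F. Scott Fitzgerald</author>
    </book>
'''

# The "clean" variant: common indentation and surrounding newlines removed.
clean = textwrap.dedent(indented).strip()


def title_of(xml_text: str) -> Optional[str]:
    """Parse one <book> document and return the text of its <title> child."""
    root = ET.fromstring(xml_text)
    return root.findtext("title")


indented_title = title_of(indented)
clean_title = title_of(clean)
```

With this parser both variants yield the same title, because whitespace around the root element is allowed and the text inside <title> is identical in both strings. Only the whitespace between elements differs.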

import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--project_id", type=str)
Contributor

medium

When running this test file as a standalone script, the --project_id argument is not marked as required. If a user forgets to provide it, it will default to None, which might lead to unexpected behavior or errors if a default project cannot be inferred from the environment. It's better to fail fast by making it a required argument.

Suggested change
- parser.add_argument("--project_id", type=str)
+ parser.add_argument("--project_id", type=str, required=True)

Contributor Author

None is OK. Then it uses the project from ADC.
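
The two behaviours discussed in this thread can be compared with argparse alone. This is a sketch: build_parser is a hypothetical helper, "my-project" is a placeholder value, and nothing here calls BigQuery or resolves credentials.

```python
import argparse


def build_parser(required: bool) -> argparse.ArgumentParser:
    """Build a parser with --project_id, optionally marked as required."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--project_id", type=str, required=required)
    return parser


# Current behaviour: the flag is optional, so omitting it yields None.
optional_args = build_parser(required=False).parse_args([])

# Passing the flag explicitly works the same way in both variants.
explicit_args = build_parser(required=False).parse_args(
    ["--project_id", "my-project"]
)

# Suggested behaviour: with required=True, omitting the flag makes argparse
# print a usage error and raise SystemExit.
try:
    build_parser(required=True).parse_args([])
    exited = False
except SystemExit:
    exited = True
```

Per the author's reply, a None project ID is then resolved from Application Default Credentials (ADC), so the optional form keeps the script usable in environments where a default project is already configured.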

tswast and others added 6 commits March 24, 2026 21:09
