shobsi

BEGIN_COMMIT_OVERRIDE
feat: limited support of lambdas in Series.apply (#345)
END_COMMIT_OVERRIDE

Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:

  • Make sure to open an issue as a bug/issue before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
  • Ensure the tests and linter pass
  • Code coverage does not decrease (if any source code was changed)
  • Appropriate docs were updated https://screenshot.googleplex.com/6ZEiKXPz8LWMTRf

Partially fixes internal issue 295964341 🦕

@shobsi requested review from a team as code owners January 24, 2024 02:40
@product-auto-label bot added the size: s and api: bigquery labels Jan 24, 2024
@product-auto-label bot added the size: m label and removed the size: s label Jan 25, 2024
Comment on lines 1155 to 1159
There is a limited support of simple functions and lambdas which can be
operated directly (without converting into a `remote_function`) on the
BigQuery DataFrames objects. This approach takes advantage of a nuance
in the way BigQuery DataFrames objects are modeled internally and works
only if the function body contains only arithmetic or logical operators.
Collaborator

I'd rather we rephrase this as "ufunc" emulation support, as defined in the pandas docs. https://pandas.pydata.org/docs/reference/api/pandas.Series.apply.html

Contributor Author

Updated, PTAL.
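
A minimal sketch of the idea in the docstring above: a function built only from arithmetic and logical operators can be called once on the whole object, because each operator is overloaded to work element by element. `ToySeries` is a hypothetical stand-in written for this illustration; it is not the BigQuery DataFrames Series class and says nothing about how that class is implemented.

```python
class ToySeries:
    """Hypothetical stand-in for a Series; not the BigQuery DataFrames class."""

    def __init__(self, values):
        self.values = list(values)

    def _binary(self, other, op):
        # Broadcast a scalar, or pair up with another ToySeries element by element.
        if isinstance(other, ToySeries):
            return ToySeries(op(a, b) for a, b in zip(self.values, other.values))
        return ToySeries(op(a, other) for a in self.values)

    def __add__(self, other):
        return self._binary(other, lambda a, b: a + b)

    def __mul__(self, other):
        return self._binary(other, lambda a, b: a * b)

    def __gt__(self, other):
        return self._binary(other, lambda a, b: a > b)

    def apply(self, func):
        # Call the function once on the whole object instead of once per row.
        return func(self)


s = ToySeries([1, 2, 3])
doubled_plus_one = s.apply(lambda x: x * 2 + 1)
is_large = s.apply(lambda x: x * 2 + 1 > 4)
print(doubled_plus_one.values)  # [3, 5, 7]
print(is_large.values)  # [False, True, True]
```

A function that branches on a value or calls an arbitrary library routine would not work this way, which is the case `remote_function` is meant to cover.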

)

if not hasattr(func, "bigframes_remote_function"):
    return func(self)
Collaborator

Let's catch exceptions here and if there's a "message" attribute, append a suggestion to try remote_function, instead.

Contributor Author

Done, thanks!

# supported on a Series. Let's guide the customer to use a
# remote function instead
if hasattr(ex, "message"):
    ex.message += "\n{_remote_function_recommendation_message}"
Collaborator

This needs to be an f string.

Contributor Author

Thanks for catching, corrected.
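
To illustrate the difference the reviewer points out: without the `f` prefix, the braces and the variable name are kept as literal text instead of being replaced by the variable's value. The message text assigned below is a placeholder assumed for this example.

```python
# Placeholder text; the real message is not shown in this thread.
_remote_function_recommendation_message = "Try remote_function instead."

plain = "\n{_remote_function_recommendation_message}"
formatted = f"\n{_remote_function_recommendation_message}"

print(repr(plain))      # '\n{_remote_function_recommendation_message}'
print(repr(formatted))  # '\nTry remote_function instead.'
```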


if not hasattr(func, "bigframes_remote_function"):
    try:
        return func(self)
Collaborator

Since it may result in incorrect values if this isn't a true vectorized function, let's check for by_row=False. If by_row="compat" (default) then raise and suggest either remote_function or by_row=False.

Contributor Author

Done, PTAL.
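
A sketch of the gate described in this thread, under stated assumptions: `apply_sketch` is a hypothetical free function, the error wording is invented, and the remote function path is left out. It raises `ValueError`, the exception type requested further down in the review. The `by_row` values `False` and `"compat"` follow the parameter of the same name on pandas `Series.apply`.

```python
def apply_sketch(series, func, by_row="compat"):
    # Hypothetical free-function version of the check; not the merged code.
    if hasattr(func, "bigframes_remote_function"):
        raise NotImplementedError("the remote function path is omitted from this sketch")
    if by_row is not False:
        # The default would mean row-by-row semantics, which a direct call cannot promise.
        raise ValueError(
            "Pass a remote_function, or pass by_row=False if the function "
            "operates on the whole Series."  # hypothetical wording
        )
    return func(series)


# sum consumes the whole input at once, so the caller opts in with by_row=False.
result = apply_sketch([1, 2, 3], sum, by_row=False)

try:
    apply_sketch([1, 2, 3], sum)
    error = None
except ValueError as ex:
    error = str(ex)

print(result)
print(error)
```

Requiring an explicit `by_row=False` makes the caller confirm that the function really is vectorized, instead of silently returning values that differ from a per-row application.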

@product-auto-label bot added the size: l label and removed the size: m label Feb 9, 2024
# It is not a remote function
# Then it must be a vectorized function that applies to the Series
# as a whole
assert (
Collaborator

AssertionError is a bit of an odd one to raise. Usually that means some invariant that we never expect to break has been violated. Please raise ValueError instead.

Contributor Author

Done, PTAL, thanks.

Love it, thanks!

@shobsi added the automerge label Feb 12, 2024
@gcf-merge-on-green bot merged commit 208e081 into main Feb 12, 2024
@gcf-merge-on-green bot deleted the shobs-allow-lambdas branch February 12, 2024 23:16
@gcf-merge-on-green bot removed the automerge label Feb 12, 2024
@release-please bot mentioned this pull request Feb 12, 2024