Conversation

tswast

In internal issue 182204971, a customer is encountering a 403 error for missing permissions to set policy tags on a table when trying to append a DataFrame to that table with load_table_from_dataframe. This happens because we fetch the schema from the table and then pass it back to the API. We only need to send the field names (and possibly type + mode) in this case.

Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:

  • Make sure to open an issue as a bug/issue before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
  • Ensure the tests and linter pass
  • Code coverage does not decrease (if any source code was changed)
  • Appropriate docs were updated (if necessary)

Fixes internal bug 182204971 🦕

product-auto-label bot added the api: bigquery label (Issues related to the googleapis/python-bigquery API) on Mar 17, 2021
google-cla bot added the cla: yes label (This human has signed the Contributor License Agreement) on Mar 17, 2021
tswast changed the title from "WIP: fix: don't set policy tags in load job from dataframe" to "fix: avoid policy tags 403 error in load_table_from_dataframe" on Mar 17, 2021
tswast marked this pull request as ready for review on March 18, 2021 14:34
tswast requested review from a team, stephaniewang526, shollyman and plamut, and removed the request for a team, on March 18, 2021 14:34
}
if mode is not None:
self._properties["mode"] = mode.upper()
if description is not _DEFAULT_VALUE:
tswast (author):
@shollyman This is one of the key changes: we no longer set the resource value for "description" if it's not explicitly set.

We already omit policy_tags from the resource if none (though arguably it should get the same treatment so that someone can unset policy tags from Python)
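The diff above shows the sentinel pattern this comment refers to. Here is a minimal, self-contained sketch of it — a simplified stand-in for the library's SchemaField, not its actual internals — showing how a sentinel default distinguishes "argument never passed" from an explicit None:

```python
_DEFAULT_VALUE = object()  # sentinel: "caller did not pass this argument"


class SchemaField:
    """Illustrative stand-in for a BigQuery schema field resource."""

    def __init__(self, name, field_type, mode="NULLABLE", description=_DEFAULT_VALUE):
        self._properties = {"name": name, "type": field_type}
        if mode is not None:
            self._properties["mode"] = mode.upper()
        # Only serialize "description" when the caller set it explicitly,
        # so an untouched field round-trips without the key at all, while
        # an explicit description=None can still unset it server-side.
        if description is not _DEFAULT_VALUE:
            self._properties["description"] = description
```

With this shape, `SchemaField("age", "INTEGER")` produces a resource with no "description" key, while `SchemaField("age", "INTEGER", description=None)` sends an explicit null — which is exactly the distinction that would let policy_tags get the same treatment.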

shollyman (reviewer):

My default inclination would be for special handling for None values to happen at the places where it's significant, like when calling tables.update. It's also the case that schema fields can't be manipulated individually, so perhaps I'm simply just not thinking this through properly.

tswast (author):

I called that out as a possibility in #558, but that'd require updating our field mask logic to support sub-fields, which gets into some hairy string parsing (perhaps not all that hairy, as it could be as simple as split on '.', but is definitely a departure from what we've been doing).

Also, it might mean that we'd have to introduce a field mask to our load job methods. Based on the error message we're seeing, it sounds like it's possible to make updates to fields like policy tags from a load job.
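To make the "sub-fields in the field mask" idea concrete, here is a hedged sketch (names are hypothetical, not the library's actual field-mask code) of how a recursive diff could emit dotted paths like `schema.description` by joining nested keys on '.':

```python
def diff_paths(old, new, prefix=""):
    """Collect dotted field-mask paths where `new` differs from `old`.

    Illustrative only: sketches the sub-field field-mask extension
    discussed above, not the library's existing (flat) mask logic.
    """
    paths = []
    for key, value in new.items():
        if isinstance(value, dict) and isinstance(old.get(key), dict):
            # Recurse into nested resources, extending the dotted prefix.
            paths.extend(diff_paths(old[key], value, prefix + key + "."))
        elif old.get(key) != value:
            paths.append(prefix + key)
    return sorted(paths)
```

As the comment notes, the string handling is not all that hairy — the prefix bookkeeping above is the whole of it — but it is a departure from the flat top-level masks used so far.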

field for field in table.schema if field.name in columns_and_indexes
# Field description and policy tags are not needed to
# serialize a data frame.
SchemaField(
tswast (author):
This is the actual bug fix. Rather than populating every property of each schema field from the table schema, we populate only the minimum needed to convert to Parquet/CSV and upload.
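The idea can be sketched as follows, with schema fields as plain dicts standing in for SchemaField objects (an illustrative sketch, not the library's code):

```python
def minimal_load_schema(table_schema, columns):
    """Keep only the fields present in the DataFrame, stripped down to the
    properties a load job needs (name, type, mode).

    Dropping "description" and "policyTags" means the load job never touches
    the taxonomy, so no policy-tag permissions are required just to append.
    """
    return [
        {"name": f["name"], "type": f["type"], "mode": f.get("mode", "NULLABLE")}
        for f in table_schema
        if f["name"] in columns
    ]
```

A field carrying `policyTags` in the table's schema comes back as just `{"name": ..., "type": ..., "mode": ...}`, which is all the backend needs to match columns in the uploaded file.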

shollyman (reviewer):
We'll need to revisit this for parameterization constraints, but that's a problem for future Tim.

Also, check that the schema we send matches the DataFrame's column order, not the table's order.
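That ordering check can be sketched like so (illustrative helper, not the library's implementation), since the serialized file's columns follow the DataFrame, not the table:

```python
def order_like_dataframe(schema, df_columns):
    """Reorder schema fields (plain dicts here) to follow the DataFrame's
    column order, dropping fields the DataFrame doesn't have."""
    by_name = {f["name"]: f for f in schema}
    return [by_name[name] for name in df_columns if name in by_name]
```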

tswast merged commit 84e646e into googleapis:master on Mar 19, 2021
tswast deleted the b182204971-dataframe-policy-tags branch on March 19, 2021 17:54
tswast:

For posterity, here is the error you get if you include policyTags in the schema, even if they aren't actually changed:

[1] Reason: 403 POST
https://bigquery.googleapis.com/upload/bigquery/v2/projects/YOUR-PROJECT-ID/jobs?uploadType=resumable:
Access Denied: Taxonomy projects/REDACTED/locations/eu/taxonomies/REDACTED: 
User does not have permission to get taxonomy projects/REDACTED/locations/eu/taxonomies/REDACTED
