-
Notifications
You must be signed in to change notification settings - Fork 125
(brwells) Add temporary forecast datasets #8504
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
| @@ -0,0 +1,16 @@ | |||
| friendly_name: Forecasts | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
after syncing w/ @brendanwells-moz, i recommended going ahead and creating a dataset for forecasts vs a temp one. my feeling is that we will end up w/ a dataset for forecasts shared across products + forecasts_restricted (or similar) - if that doesn't pan out we can re-name this dataset, but this seemed preferable to data science. other perspectives are welcome :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is something we should discuss via a proposal to make sure we have all context around forecasting and plans for forecasting at least in the near future.
Renaming is a bit more complicated in that it would require creating a new dataset, moving all the resources into it and then deleting the old ones. For this reason, I think we should consider this a temporary space until we have a more formal decision through a proposal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough! You didn't ask for it explicitly, but out of an abundance of caution I'll prefix both new datasets with tmp_.
…workgroup subgroup (defined in cloudops-infra). This allows the Outerbounds default perimeter service account to write data to tables in this dataset.
|
Okay @brendanwells-moz, I have added a commit to this PR to specify the workgroup subgroup that should receive edit permissions for the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See comment. This looks good to me; I added the lines that grant permissions to the requisite service account.
Nota Bene: I'm deferring to the team that uses and updates forecasting tables to decide how to organize those tables.
This comment has been minimized.
This comment has been minimized.
| - role: roles/bigquery.dataViewer | ||
| members: | ||
| - workgroup:mozilla-confidential | ||
| syndication: {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| syndication: {} |
| dataset_base_acl: view | ||
| user_facing: true | ||
| labels: {} | ||
| default_table_workgroup_access: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think default_table_workgroup_access should be included here. workgroup_access should already define all permissions that will be inherited by all queries / datasets / tables defined within this namespace unless explicitly overwritten in their own metadata files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was auto-generated by the bqetl command, FYI.
I'll hold off on changing anything pending the check below.
| members: | ||
| - workgroup:dataops-managed/external-outerbounds-task-default | ||
|
|
||
| syndication: {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| syndication: {} |
| dataset_base_acl: derived | ||
| user_facing: false | ||
| labels: {} | ||
| default_table_workgroup_access: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@scholtzan if I remember correctly we should not be specifying default_table_workgroup_access, this is something the metadata generation / update command does?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can remove default_table_workgroup_access here:
bigquery-etl/bigquery_etl/metadata/parse_metadata.py
Lines 20 to 23 in 164f88c
| DEFAULT_WORKGROUP_ACCESS = [ | |
| dict(role="roles/bigquery.dataViewer", members=["workgroup:mozilla-confidential"]) | |
| ] | |
| DEFAULT_TABLE_WORKGROUP_ACCESS = DEFAULT_WORKGROUP_ACCESS |
kik-kik
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a few notes, we also need to make sure the tool is given read access to forecasts_derived dataset.
Also, I still think we need to have a wider agreement on how we want forecasting to fit into the bigger picture.
|
Made the changes @kik-kik pointed out, with the exception of the |
This comment has been minimized.
This comment has been minimized.
|
I apologize that |
Integration report for "Remove forecast, which I thought already happened"
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did not mean we necessarily have to call this tmp_ it was more a note for ourselves to keep this in mind when using it.
| dataset_base_acl: view | ||
| user_facing: true | ||
| labels: {} | ||
| default_table_workgroup_access: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can remove default_table_workgroup_access here:
bigquery-etl/bigquery_etl/metadata/parse_metadata.py
Lines 20 to 23 in 164f88c
| DEFAULT_WORKGROUP_ACCESS = [ | |
| dict(role="roles/bigquery.dataViewer", members=["workgroup:mozilla-confidential"]) | |
| ] | |
| DEFAULT_TABLE_WORKGROUP_ACCESS = DEFAULT_WORKGROUP_ACCESS |
kik-kik
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
r+wc
Description
The PR adds datasets for a forecasting table and a view. Officially, these are temporary datasets pending a conversation between the PDS and AE teams about naming conventions for forecasting output tables. However, we are naming them generically in the hope that we have guessed the final naming convention correctly and these won't have to be migrated.
There is no ticket for this PR, but it relates to the discussion today in the PDS x AE meeting, documented here: https://docs.google.com/document/d/1DL_Nr1e-7YC05F2ssdplcH2cmsjmtzzFWyPKbS2rgqs/edit?tab=t.0#heading=h.4itj0ra86mg6 . The original project proposal is found here: https://docs.google.com/document/d/15RNNhlcE9oj3GLlCWns6DOUWHKjZpnaEbz7OiAQZi_E/edit?tab=t.0 .
Reviewer, please follow this checklist