(brwells) Add temporary forecast datasets #8504

brendanwells-moz · 2025-11-25T21:49:55Z

Description

This PR adds two datasets: one for a forecasting table and one for a view. Officially, these are temporary datasets pending a conversation between the PDS and AE teams about naming conventions for forecasting output tables. However, we are naming them generically in the hope that we have guessed the final naming convention correctly and that they won't have to be migrated.

There is no ticket for this PR, but it relates to the discussion today in the PDS x AE meeting, documented here: https://docs.google.com/document/d/1DL_Nr1e-7YC05F2ssdplcH2cmsjmtzzFWyPKbS2rgqs/edit?tab=t.0#heading=h.4itj0ra86mg6 . The original project proposal is found here: https://docs.google.com/document/d/15RNNhlcE9oj3GLlCWns6DOUWHKjZpnaEbz7OiAQZi_E/edit?tab=t.0 .

Reviewer, please follow this checklist

chelseybeck · 2025-11-25T22:07:54Z

sql/moz-fx-data-shared-prod/forecasts/dataset_metadata.yaml

@@ -0,0 +1,16 @@
+friendly_name: Forecasts


after syncing w/ @brendanwells-moz, i recommended going ahead and creating a dataset for forecasts vs a temp one. my feeling is that we will end up w/ a dataset for forecasts shared across products + forecasts_restricted (or similar) - if that doesn't pan out we can re-name this dataset, but this seemed preferable to data science. other perspectives are welcome :)

I think this is something we should discuss via a proposal to make sure we have all context around forecasting and plans for forecasting at least in the near future.

Renaming is a bit more complicated in that it would require creating a new dataset, moving all the resources into it and then deleting the old ones. For this reason, I think we should consider this a temporary space until we have a more formal decision through a proposal.
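As a rough illustration of why a rename is costly, the steps described above can be sketched as an ordered plan. The `plan_dataset_rename` helper is hypothetical and only lists the steps; it does not call BigQuery. The `mozaic_daily` resource name is used purely as an example.

```python
def plan_dataset_rename(old: str, new: str, resources: list) -> list:
    """Hypothetical planner: list the steps a dataset 'rename' would involve."""
    # 1. The new dataset has to exist before anything can be moved into it.
    steps = [f"create dataset {new}"]
    # 2. Every table or view is copied (or redeployed) under the new name.
    steps += [f"copy {old}.{name} to {new}.{name}" for name in resources]
    # 3. Only then can the old resources and the old dataset be deleted.
    steps += [f"delete {old}.{name}" for name in resources]
    steps.append(f"delete dataset {old}")
    return steps


plan = plan_dataset_rename("tmp_forecasts_derived", "forecasts_derived", ["mozaic_daily"])
for step in plan:
    print(step)
```

The number of steps grows with the number of resources in the dataset, which is the argument for settling the name before many tables land in it.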

Fair enough! You didn't ask for it explicitly, but out of an abundance of caution I'll prefix both new datasets with tmp_.

chelseatroy · 2025-11-25T22:27:40Z

Okay @brendanwells-moz, I have added a commit to this PR to specify the workgroup subgroup that should receive edit permissions for the forecasts_derived dataset.

chelseatroy

See comment. This looks good to me; I added the lines that grant permissions to the requisite service account.

Nota Bene: I'm deferring to the team that uses and updates forecasting tables to decide how to organize those tables.

sql/moz-fx-data-shared-prod/tmp_forecasts_derived/dataset_metadata.yaml

kik-kik · 2025-11-26T10:18:18Z

sql/moz-fx-data-shared-prod/forecasts/dataset_metadata.yaml

+- role: roles/bigquery.dataViewer
+  members:
+  - workgroup:mozilla-confidential
+syndication: {}


Suggested change

syndication: {}

kik-kik · 2025-11-26T10:20:24Z

sql/moz-fx-data-shared-prod/forecasts/dataset_metadata.yaml

+dataset_base_acl: view
+user_facing: true
+labels: {}
+default_table_workgroup_access:


I don't think default_table_workgroup_access should be included here. workgroup_access should already define all permissions that will be inherited by all queries / datasets / tables defined within this namespace unless explicitly overridden in their own metadata files.

This was auto-generated by the bqetl command, FYI.

I'll hold off on changing anything pending the check below.

sql/moz-fx-data-shared-prod/forecasts/dataset_metadata.yaml

sql/moz-fx-data-shared-prod/tmp_forecasts_derived/dataset_metadata.yaml

kik-kik · 2025-11-26T10:26:47Z

sql/moz-fx-data-shared-prod/forecasts_derived/dataset_metadata.yaml

+  members:
+  - workgroup:dataops-managed/external-outerbounds-task-default
+
+syndication: {}


Suggested change

syndication: {}

kik-kik · 2025-11-26T10:27:27Z

sql/moz-fx-data-shared-prod/tmp_forecasts_derived/dataset_metadata.yaml

+dataset_base_acl: derived
+user_facing: false
+labels: {}
+default_table_workgroup_access:


@scholtzan if I remember correctly, we should not be specifying default_table_workgroup_access; isn't this something the metadata generation / update command does?

We can remove default_table_workgroup_access here:

bigquery-etl/bigquery_etl/metadata/parse_metadata.py

Lines 20 to 23 in 164f88c

DEFAULT_WORKGROUP_ACCESS = [
    dict(role="roles/bigquery.dataViewer", members=["workgroup:mozilla-confidential"])
]
DEFAULT_TABLE_WORKGROUP_ACCESS = DEFAULT_WORKGROUP_ACCESS
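
A minimal sketch of why the key is redundant, assuming the metadata loader falls back to this constant when the key is absent. The `table_access_for` helper is hypothetical and is not the actual bigquery-etl parsing code; only the two constants come from the quoted file.

```python
# Constants copied from the parse_metadata.py lines quoted in this thread.
DEFAULT_WORKGROUP_ACCESS = [
    dict(role="roles/bigquery.dataViewer", members=["workgroup:mozilla-confidential"])
]
DEFAULT_TABLE_WORKGROUP_ACCESS = DEFAULT_WORKGROUP_ACCESS


def table_access_for(dataset_metadata: dict) -> list:
    """Hypothetical helper: use the explicit value if present, else the default."""
    return dataset_metadata.get(
        "default_table_workgroup_access", DEFAULT_TABLE_WORKGROUP_ACCESS
    )


# Metadata as generated in this PR, with the key spelled out explicitly.
explicit = {
    "dataset_base_acl": "derived",
    "default_table_workgroup_access": [
        {
            "role": "roles/bigquery.dataViewer",
            "members": ["workgroup:mozilla-confidential"],
        }
    ],
}
# The same metadata with the key removed.
omitted = {"dataset_base_acl": "derived"}

same = table_access_for(explicit) == table_access_for(omitted)
print(same)
```

Under that assumption both variants resolve to the same access list, so dropping the key from dataset_metadata.yaml would not change the effective permissions.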

kik-kik

Left a few notes. We also need to make sure the tool is given read access to the forecasts_derived dataset.

Also, I still think we need to have a wider agreement on how we want forecasting to fit into the bigger picture.

chelseatroy · 2025-11-26T16:22:59Z

Made the changes @kik-kik pointed out, with the exception of the default_table_workgroup_access one (I know we're still waiting for confirmation on removing that).

brendanwells-moz · 2025-11-26T22:11:50Z

I apologize that forecasts didn't get properly renamed to tmp_forecasts, and instead git deleted the original and created the new one. I got a little lost in yaml linter "fun" and didn't properly update my commit.

dataops-ci-bot · 2025-11-26T22:19:50Z

Integration report for "Remove forecast, which I thought already happened"

`sql.diff`

Only in /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod: tmp_forecasts
Only in /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod: tmp_forecasts_derived
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/tmp_forecasts/dataset_metadata.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/tmp_forecasts/dataset_metadata.yaml
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/tmp_forecasts/dataset_metadata.yaml	1970-01-01 00:00:00.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/tmp_forecasts/dataset_metadata.yaml	2025-11-26 22:10:54.000000000 +0000
@@ -0,0 +1,16 @@
+friendly_name: Temporary Forecasts
+description: |-
+  Temporary dataset pending a conversation on naming conventions. Use at your own risk.
+  Views on forecasts based on Firefox data and external models. Forecast views may also live in other datasets related to the products they support.
+dataset_base_acl: view
+user_facing: true
+labels: {}
+default_table_workgroup_access:
+- role: roles/bigquery.dataViewer
+  members:
+  - workgroup:mozilla-confidential
+default_table_expiration_ms: null
+workgroup_access:
+- role: roles/bigquery.dataViewer
+  members:
+  - workgroup:mozilla-confidential
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/tmp_forecasts_derived/dataset_metadata.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/tmp_forecasts_derived/dataset_metadata.yaml
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/tmp_forecasts_derived/dataset_metadata.yaml	1970-01-01 00:00:00.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/tmp_forecasts_derived/dataset_metadata.yaml	2025-11-26 22:10:54.000000000 +0000
@@ -0,0 +1,21 @@
+friendly_name: Temporary Forecasts Derived
+description: |-
+  Temporary dataset pending a conversation on naming conventions. Use at your own risk.
+  Forecasts based on Firefox data and external models, plus related secondary tables. Writable by the Outerbounds default perimeter.
+  This dataset combines forecasts across products; forecasts may also live in other datasets related to the products they support.
+dataset_base_acl: derived
+user_facing: false
+labels: {}
+default_table_workgroup_access:
+- role: roles/bigquery.dataViewer
+  members:
+  - workgroup:mozilla-confidential
+default_table_expiration_ms: null
+workgroup_access:
+- role: roles/bigquery.dataViewer
+  members:
+  - workgroup:mozilla-confidential
+  - workgroup:dataops-managed/external-outerbounds-task-default
+- role: roles/bigquery.dataEditor
+  members:
+  - workgroup:dataops-managed/external-outerbounds-task-default
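
The access block for tmp_forecasts_derived in the diff above can be sanity-checked with a short sketch. The `roles_for` helper is a hypothetical illustration, and the inline list simply mirrors the `workgroup_access` entries from the diff rather than parsing the YAML file.

```python
# Mirrors the workgroup_access entries for tmp_forecasts_derived in the diff.
workgroup_access = [
    {
        "role": "roles/bigquery.dataViewer",
        "members": [
            "workgroup:mozilla-confidential",
            "workgroup:dataops-managed/external-outerbounds-task-default",
        ],
    },
    {
        "role": "roles/bigquery.dataEditor",
        "members": ["workgroup:dataops-managed/external-outerbounds-task-default"],
    },
]


def roles_for(access: list, member: str) -> set:
    """Hypothetical helper: collect every role granted to `member`."""
    return {entry["role"] for entry in access if member in entry["members"]}


outerbounds = "workgroup:dataops-managed/external-outerbounds-task-default"
outerbounds_roles = roles_for(workgroup_access, outerbounds)
confidential_roles = roles_for(workgroup_access, "workgroup:mozilla-confidential")
print(sorted(outerbounds_roles))
print(sorted(confidential_roles))
```

The Outerbounds workgroup ends up with both the viewer and editor roles, while mozilla-confidential stays read-only, which matches the read and write access requested in the review.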

kik-kik · 2025-11-27T11:05:07Z

sql/moz-fx-data-shared-prod/tmp_forecasts_derived/dataset_metadata.yaml

I did not mean we necessarily have to call this tmp_; it was more a note for ourselves to keep in mind when using it.

kik-kik · 2025-11-27T11:07:02Z

sql/moz-fx-data-shared-prod/tmp_forecasts/dataset_metadata.yaml

+dataset_base_acl: view
+user_facing: true
+labels: {}
+default_table_workgroup_access:


We can remove default_table_workgroup_access here:

bigquery-etl/bigquery_etl/metadata/parse_metadata.py

Lines 20 to 23 in 164f88c

DEFAULT_WORKGROUP_ACCESS = [
    dict(role="roles/bigquery.dataViewer", members=["workgroup:mozilla-confidential"])
]
DEFAULT_TABLE_WORKGROUP_ACCESS = DEFAULT_WORKGROUP_ACCESS

kik-kik

r+wc

brendanwells-moz added 2 commits November 25, 2025 13:33

Add forecast and forecast_derived datasets, mozaic_daily table and view

731a607

Remove table and view (too early for them). Add proposal link to desc…

8b38e37

…ription

brendanwells-moz requested a review from a team as a code owner November 25, 2025 21:49

Merge branch 'main' into brwells-add-temp-forecast-datasets

bac29f7

brendanwells-moz requested review from chelseatroy, chelseybeck and kik-kik November 25, 2025 22:04

brendanwells-moz self-assigned this Nov 25, 2025

chelseybeck reviewed Nov 25, 2025

Add dataEditor permissions for the external-outerbounds-task-default …

0ee4248

…workgroup subgroup (defined in cloudops-infra). This allows the Outerbounds default perimeter service account to write data to tables in this dataset.

chelseatroy approved these changes Nov 25, 2025

Merge branch 'main' into brwells-add-temp-forecast-datasets

a46010d

chelseybeck reviewed Nov 25, 2025

sql/moz-fx-data-shared-prod/tmp_forecasts_derived/dataset_metadata.yaml

kik-kik reviewed Nov 26, 2025

sql/moz-fx-data-shared-prod/forecasts/dataset_metadata.yaml

kik-kik reviewed Nov 26, 2025

sql/moz-fx-data-shared-prod/tmp_forecasts_derived/dataset_metadata.yaml

kik-kik reviewed Nov 26, 2025

kik-kik requested changes Nov 26, 2025

chelseatroy and others added 2 commits November 26, 2025 10:20

Add read access for OB SA

bcfac5b

Merge branch 'main' into brwells-add-temp-forecast-datasets

3abf8c6

brendanwells-moz added 2 commits November 26, 2025 13:00

Merge branch 'main' into brwells-add-temp-forecast-datasets

1bb8666

Change datasets to tmp versions

2eba891

Remove forecast, which I thought already happened

90cfb22

kik-kik reviewed Nov 27, 2025

kik-kik approved these changes Nov 27, 2025

6 participants