[ci] Updating `retrieve_bucket_info` to pull artifacts from correct bucket based on run ID #2357

geomin12 · 2025-12-01T16:43:22Z

User reported issue when trying example from https://github.com/ROCm/TheRock/blob/main/RELEASES.md#installing-tarballs-using-install_rocm_from_artifactspy due to old run ID no longer accessing new s3 bucket

Tested locally:

Newer run id: https://github.com/ROCm/TheRock/actions/runs/19688861356/job/56406062109

Retrieved bucket info:
  external_repo:
  bucket       : therock-ci-artifacts
Retrieving S3 artifacts for 19688861356 in 'therock-ci-artifacts' at '19688861356-linux'

Filtered artifacts to download:
  therock-ci-artifacts:19688861356-linux/amd-llvm_dbg_generic.tar.xz
  therock-ci-artifacts:19688861356-linux/amd-llvm_dev_generic.tar.xz
  therock-ci-artifacts:19688861356-linux/amd-llvm_doc_generic.tar.xz
  therock-ci-artifacts:19688861356-linux/amd-llvm_lib_generic.tar.xz

Older run id: https://github.com/ROCm/TheRock/actions/runs/18740768335

Retrieved bucket info:
  external_repo:
  bucket       : therock-artifacts
Retrieving S3 artifacts for 18740768335 in 'therock-artifacts' at '18740768335-linux'

Filtered artifacts to download:
  therock-artifacts:18740768335-linux/amd-llvm_lib_generic.tar.xz
  therock-artifacts:18740768335-linux/amd-llvm_run_generic.tar.xz
  therock-artifacts:18740768335-linux/base_lib_generic.tar.xz

Closes #2295

kiran-thumma · 2025-12-02T14:20:59Z

build_tools/github_actions/tests/github_actions_utils_test.py

+    )
+    def test_retrieve_newer_bucket_info(self):
+        # https://github.com/ROCm/TheRock/actions/runs/19680190301
+        external_repo, bucket = retrieve_bucket_info("ROCm/TheRock", "19680190301")


Can you explain why are we adding a hardcoded runid. Can you clarify ?

those are unit tests :)

So for Unit tests do we have to hardcode the runid of an action ?

yes, in this case, we are pulling GH data with "retrieve_bucket_info" and running asserts to ensure it works

For the changes I made, we need to check whether it's an old bucket or new bucket, and this is done via run ID

We could also mock the network requests, in which case the run IDs could be arbitrary numbers. I started on that in https://github.com/ScottTodd/TheRock/tree/gha-test-mock.

I do like at least having the ability to test against the real github API with actual data though.

Yes, we have to use something dynamic instead of hardcode run IDs. Please check if there any methods to pull the run ID dynamically.

For pulling run IDs dynamically, it's fairly easy to do this for new jobs. for old jobs, this would be quite an effort on GH APIs and for unit tests.

Since these run IDs are known and usable numbers, I'm going to keep these static for unit testing

Can we add a TODO comment on that.

kiran-thumma · 2025-12-02T14:23:24Z

build_tools/github_actions/tests/github_actions_utils_test.py

        "GITHUB_TOKEN not set, skipping test that requires GitHub API access",
    )
    def test_retrieve_bucket_info_from_fork(self):
        # https://github.com/ROCm/TheRock/actions/runs/18023442478?pr=1596


If we can remove this commented lines and add some docs as comments for This function will be helpful to understand.

there are docs in retrieve_bucket_info, while this function is a unit test. let me know if more clarification is needed

Can you add comments where is the retreive_bucket_info. That might help who are looking at the code.

Since this is a unit test suite for github_action_utils and it is testing various functions from github_action_utils, I hope that is clear enough? similar to gha_query_workflow_run_information function

although if still confusing, i can add a comment if needed

kiran-thumma · 2025-12-02T14:24:30Z

build_tools/github_actions/tests/github_actions_utils_test.py

+        os.getenv("GITHUB_TOKEN"),
+        "GITHUB_TOKEN not set, skipping test that requires GitHub API access",
+    )
+    def test_retrieve_newer_bucket_info_from_rocm_libraries(self):


Same as above can you add some docs in comments explaining the function.

there are docs in retrieve_bucket_info, while this function is a unit test. let me know if more clarification is needed

kiran-thumma · 2025-12-02T16:54:20Z

build_tools/github_actions/tests/github_actions_utils_test.py

+    )
+    def test_retrieve_newer_bucket_info(self):
+        # https://github.com/ROCm/TheRock/actions/runs/19680190301
+        external_repo, bucket = retrieve_bucket_info("ROCm/TheRock", "19680190301")


Yes, we have to use something dynamic instead of hardcode run IDs. Please check if there any methods to pull the run ID dynamically.

kiran-thumma · 2025-12-02T16:55:38Z

build_tools/github_actions/tests/github_actions_utils_test.py

        external_repo, bucket = retrieve_bucket_info("ROCm/TheRock", "18023442478")
        self.assertEqual(external_repo, "ROCm-TheRock/")
-        self.assertEqual(bucket, "therock-ci-artifacts-external")
+        self.assertEqual(bucket, "therock-artifacts-external")


Can we add this bucket dynamically by using other variables instead of hardcode the bucket name.

the "dynamic retrieval" is actually the bucket variable. the "therock-artifacts-external" string is what the variable is comparing too for unit tests correctness!

Adding fix for older run ids

d70cb92

geomin12 requested review from ScottTodd, idass1990, jayhawk-commits and jharryma December 1, 2025 16:43

github-project-automation bot added this to TheRock Triage Dec 1, 2025

github-project-automation bot moved this to TODO in TheRock Triage Dec 1, 2025

geomin12 removed the request for review from ScottTodd December 1, 2025 16:43

geomin12 added 2 commits December 1, 2025 09:01

fixing unit tests

7840e31

Merge branch 'main' into users/geomin12/date-check-run-id

402723e

kiran-thumma reviewed Dec 2, 2025

View reviewed changes

geomin12 requested a review from kiran-thumma December 2, 2025 14:34

kiran-thumma reviewed Dec 2, 2025

View reviewed changes

geomin12 requested a review from kiran-thumma December 2, 2025 17:22

[ci] Updating retrieve_bucket_info to pull artifacts from correct bucket based on run ID #2357

Are you sure you want to change the base?

[ci] Updating retrieve_bucket_info to pull artifacts from correct bucket based on run ID #2357

Conversation

geomin12 commented Dec 1, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

[ci] Updating `retrieve_bucket_info` to pull artifacts from correct bucket based on run ID #2357

[ci] Updating `retrieve_bucket_info` to pull artifacts from correct bucket based on run ID #2357