Skip to content

Conversation

@rileykarson
Copy link
Member

@rileykarson rileykarson commented Dec 22, 2025

Fixes hashicorp/terraform-provider-google#25343

Sorry- this ended up a fairly substantial refactoring! The logic in this method has ended up pretty complex and this attempts to prune things a bit. Please read the code a few times over, it's chunky. I'm debating a bit if I should change more to make things clearer, but cut things off around the 60-70% mark to minimise (but not remove entirely) externalities like specific error or short circuit conditions.

The core issue was that Anywhere Caches require additional permissions not needed previously to interact with the GCS bucket or objects. When we first added the Anywhere Cache check we'd replicated a behaviour with objects where we'd listed the objects in the bucket all the time & failed if any were present but force_destroy was false. This made all deletes require storage.anywhereCaches.list. After #13730 this was fixed so that non-force deletes would succeed, but a principal missing that permission would short-circuit out of the force-destroy code altogether.

This left an issue when the principal did not have storage.anywhereCaches.list, there were no caches on the bucket, but there were objects present, as we'd short-circuit before destroying any objects and fail the final delete call due to objects being present. After this change we'll process both kinds (objects, caches) separately and should succeed in that case.

Old logic:

  1. list objects (1 page), short circuit on failure
  2. list anywhere caches (1 page), short circuit on failure
  3. if both empty, short circuit
  4. if any objects in page have not met the retention policy, error
  5. if force_destroy is false, error (there are implicitly either caches or objects present)
  6. delete anywhere caches
  7. delete objects
  8. go back to 1 (typically exiting due to condition 3)

New logic:

  1. list anywhere caches (1 page)
  2. if caches present and force_destroy is false, error
  3. delete anywhere caches
  4. list objects (1 page), short circuit on failure
  5. if objects empty, short circuit
  6. if any objects in page have not met the retention policy, error
  7. if force_destroy is false, error (there are implicitly objects present)
  8. delete objects
  9. go back to 4 (typically exiting due to condition 5)

This does mean we won't walk additional pages of anywhere caches- I'm not sure how needed that is, and it wouldn't happen unless there were multiple pages of objects before.

There are a few paths to take further refactoring- like failing as early as possible (check all objects, all retention policies, etc. and only proceed at all if we think we will succeed) or as late as possible (just iterate serially until we fail, no single page preflight checks for things). Declining to do them for now, but open to thoughts/suggestions.

Release Note Template for Downstream PRs (will be copied)

See Write release notes for guidance.

storage: fixed `google_storage_bucket` behavior when `force_destroy` is set to `true`. Previously, failing to list anywhere caches would prevent destroying objects on the bucket. Now, both objects and caches are processed independently.

@modular-magician
Copy link
Collaborator

Hi there, I'm the Modular magician. I've detected the following information about your changes:

Diff report

Your PR generated some diffs in downstreams - here they are.

google provider: Diff ( 2 files changed, 58 insertions(+), 61 deletions(-))
google-beta provider: Diff ( 2 files changed, 58 insertions(+), 61 deletions(-))

@modular-magician
Copy link
Collaborator

Tests analytics

Total tests: 128
Passed tests: 118
Skipped tests: 10
Affected tests: 0

Click here to see the affected service packages
  • storage

🟢 All tests passed!

View the build log

Copy link
Member

@c2thorn c2thorn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's messy to follow, but I think the logic looks good.
I think what you have now is a nice medium between failing early/late as possible, and don't think it's necessary to go more either way.

Maybe later we could extract out the anywhere cache deletions and object deletions from the delete function to have the logic be simplified a bit, but I think it's not needed here for this bug fix.

I know testing this scenario via acceptance or unit tests is probably overly painful, but were you able to manually verify any scenarios here?

@rileykarson
Copy link
Member Author

Yep! Just walled on Anywhere Cache creation times a bit when I sent this, here's a gist: https://gist.github.com/rileykarson/de7d37fd6f3ba5e2db1c1e28b99a80fe

In all cases, bucket has an object and my runner SA isn't permissioned to interact with anywhere caches:

  • Bucket has an anywhere cache
    • Before this change I fail to interact with the bucket and get back ("The bucket you tried to delete is not empty.") due to not deleting any objects
    • After this change I clear the objects and fail on "The bucket you tried to delete has Anywhere Caches."
  • Bucket has no anywhere cache
    • Before this change I fail to interact with the bucket and get back ("The bucket you tried to delete is not empty.") due to not deleting any objects
    • After this change I successfully clear the objects and delete the bucket

@rileykarson
Copy link
Member Author

Note, waiting on service review from @kautikdk to merge. It's a weird schedule due to the holidays so I'll wait a bit longer than the standard time and do on (or after) noon PST Dec 29 with no ack.

@modular-magician
Copy link
Collaborator

Hi there, I'm the Modular magician. I've detected the following information about your changes:

Diff report

Your PR generated some diffs in downstreams - here they are.

google provider: Diff ( 2 files changed, 64 insertions(+), 61 deletions(-))
google-beta provider: Diff ( 2 files changed, 64 insertions(+), 61 deletions(-))

@rileykarson
Copy link
Member Author

^ thought I'd amended that bit in, #16017 (comment) assumed it was present

@modular-magician
Copy link
Collaborator

Tests analytics

Total tests: 128
Passed tests: 118
Skipped tests: 10
Affected tests: 0

Click here to see the affected service packages
  • storage

🟢 All tests passed!

View the build log

@kautikdk
Copy link
Member

Hi @rileykarson, I am aligned with your proposed solution, which implements a best-effort cleanup—neither too harsh nor too lenient. I recall force_destroy had complications previously, and we are still awaiting a decision on other conditions like retention, object locks/holds, etc. (Reference: yaqs/2092514663680966656).

For now, I believe we can proceed with this solution as it restores the ability to delete objects even if Anywhere Cache permissions are missing, keeping the cleanups independent.

FYI: @googlyrahman

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Failure to force_destroy bucket unless there is storage.anywhereCaches.list permission

4 participants