Issue
Context: I have two buckets, Bucket A and Bucket B. All of Bucket A's contents were copied into Bucket B via the aws s3 sync CLI command.
Problem: I want to delete all the items in bucket B that also exist in Bucket A, without deleting anything in Bucket A.
E.g.
Bucket A (Source):
- File R
- File G
- File C
Bucket B (Destination):
- File A
- File R
- File G
- File C
- File O
^^ I need to delete all files in the destination that also exist in the source, so only files R, G, and C need to be deleted from Bucket B.
Attempted Solution: The aws s3 sync CLI command includes the --delete flag. However, this flag only ensures that any files in the destination that aren't in the source are deleted.
Is there any way to do this using aws s3 sync?
Solution
I ended up solving this via S3 bucket lifecycle rules, where I specified a filter that matched the necessary files in both buckets. Note that lifecycle rule filters match on key prefix or object tags rather than on arbitrary regex patterns.
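As a rough illustration of the lifecycle approach, the sketch below builds an expiration rule and applies it with boto3's put_bucket_lifecycle_configuration. The prefix "synced/", the bucket name "bucket-b", the rule ID, and the helper names are all hypothetical; the original answer does not say which filter was used. Note that this call replaces any lifecycle configuration already on the bucket, and it expires every object under the prefix, so it only fits if the files to delete share a prefix that the files to keep do not.

```python
def build_expiration_rule(prefix, days=1, rule_id="expire-synced-copies"):
    """Return a lifecycle configuration that expires objects under `prefix`.

    `prefix` and `rule_id` are illustrative values, not taken from the answer.
    """
    return {
        "Rules": [
            {
                "ID": rule_id,
                "Filter": {"Prefix": prefix},  # lifecycle filters use prefixes/tags
                "Status": "Enabled",
                "Expiration": {"Days": days},  # must be a positive integer
            }
        ]
    }


def apply_expiration_rule(s3_client, bucket, prefix, days=1):
    """Replace the bucket's lifecycle configuration with one expiration rule.

    `s3_client` is expected to be a boto3 S3 client (or anything exposing
    the same put_bucket_lifecycle_configuration method).
    """
    config = build_expiration_rule(prefix, days)
    s3_client.put_bucket_lifecycle_configuration(
        Bucket=bucket, LifecycleConfiguration=config
    )
    return config
```

With boto3 installed and credentials configured, this could be called as apply_expiration_rule(boto3.client("s3"), "bucket-b", "synced/").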
Using S3 lifecycle rules allows any number of files (even millions) to be deleted at midnight UTC and does not incur a cost, unlike the aws s3 CLI, which needs to list objects in order to delete them programmatically (in batches of up to 1000 at a time).
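For comparison, the list-and-delete route mentioned above might look like the following sketch, which is not what the answer used. It assumes a boto3 S3 client is passed in, and the helper names (list_keys, chunked, delete_common_keys) are hypothetical. It matches objects by key name only, so an object in Bucket B with the same key but different content would still be deleted.

```python
def list_keys(s3_client, bucket):
    """Return the set of all object keys in `bucket`, following pagination."""
    keys = set()
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket):
        # "Contents" is absent from the response when a page has no objects
        for obj in page.get("Contents", []):
            keys.add(obj["Key"])
    return keys


def chunked(items, size=1000):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def delete_common_keys(s3_client, source_bucket, dest_bucket):
    """Delete from `dest_bucket` every key that also exists in `source_bucket`.

    Nothing is ever deleted from `source_bucket`. Returns the sorted list
    of keys that were submitted for deletion.
    """
    common = sorted(
        list_keys(s3_client, source_bucket) & list_keys(s3_client, dest_bucket)
    )
    # delete_objects accepts at most 1000 keys per request
    for batch in chunked(common, 1000):
        s3_client.delete_objects(
            Bucket=dest_bucket,
            Delete={"Objects": [{"Key": key} for key in batch], "Quiet": True},
        )
    return common
```

In the example from the question, calling delete_common_keys with Bucket A as the source and Bucket B as the destination would submit File R, File G, and File C for deletion and leave File A and File O in place.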
Answered By - Prithvi Boinpally Answer Checked By - Marie Seifert (WPSolving Admin)