AWS S3 sync --delete, removed new files in local
Introduction
AWS S3 (Simple Storage Service) is a storage service that lets users store and retrieve any amount of data at any time from anywhere on the web. One of its powerful features, available through the AWS CLI, is the ability to synchronize files between a local filesystem and a remote S3 bucket using the `aws s3 sync` command.
However, one option, `--delete`, can lead to unexpected deletions of files, especially when used without fully understanding how it works. This article explores how `aws s3 sync --delete` operates, the implications for local file systems, and how to avoid unintentional data loss.
AWS S3 Sync Basics
The `aws s3 sync` command synchronizes directories and S3 prefixes. It recursively copies new and updated files from the source to the destination, making it an efficient way to keep local and S3 storage in step. By default, `sync` only copies files that differ between the source and destination.
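As a minimal sketch of the two directions a sync can run in, the functions below wrap the command for a local-to-S3 upload and an S3-to-local download. The bucket URI `s3://example-bucket/backup` and the directory `./data` are hypothetical placeholders, not values from this article.

```shell
#!/bin/sh
# Hypothetical placeholders; substitute your own bucket prefix and directory.
BUCKET_URI="s3://example-bucket/backup"
LOCAL_DIR="./data"

# Local -> S3: copy new or changed local files up to the bucket prefix.
upload_changes() {
    aws s3 sync "$LOCAL_DIR" "$BUCKET_URI"
}

# S3 -> local: copy new or changed objects down to the local directory.
download_changes() {
    aws s3 sync "$BUCKET_URI" "$LOCAL_DIR"
}
```

Neither function passes `--delete`, so both only add or update files at the destination and never remove anything.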
Basic Syntax
`aws s3 sync <source> <destination> [options]`
- `<source>`: The directory or S3 prefix to sync from.
- `<destination>`: The directory or S3 prefix to sync to.

`sync` compares the source with the destination and applies the following rules:
- File Existence: If a file exists in the source but not in the destination, it gets copied.
- File Changes: If a file in the source has a different size or a more recent last-modified time than the corresponding file in the destination, it gets copied.
- **File Deletion with `--delete`**: If a file exists in the destination but not in the source, it gets deleted.

Using `--delete` has several implications:
- Full Backups: When maintaining a complete backup, `--delete` can be beneficial for cleaning up files that are no longer needed.
- Selective Synchronization: Without careful filtering (using `--exclude` or `--include`), `--delete` might remove files that should remain.
- Order Sensitivity: Sync direction matters. Syncing from S3 to local with `--delete` removes local files that are absent from the bucket, including newly created files that were never uploaded. Syncing from local to S3 removes objects from the bucket instead.

To avoid unintentional data loss:
- **Test with `--dryrun`**: Always start with `--dryrun` to see what changes would be made without actually performing them.
- Backup Critical Data: Before running the sync command, ensure important data is backed up.
- Understand the Source-Destination Perspective: Be clear on which direction the sync is running and what the source contains.
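
The dry-run advice can be turned into a small guard. The sketch below is one possible approach, not an official AWS pattern: `planned_deletes` and `guarded_sync` are hypothetical helper names, and the code assumes that the dry-run output marks removals with lines beginning `(dryrun) delete:`.

```shell
#!/bin/sh
# Sketch: preview deletions before running a destructive sync.
# Assumption: dry-run output marks removals with lines that start
# with "(dryrun) delete:".

# Print what `aws s3 sync --delete` would remove, without changing anything.
planned_deletes() {
    src="$1"
    dst="$2"
    aws s3 sync "$src" "$dst" --delete --dryrun \
        | grep '^(dryrun) delete:' || true
}

# Run the real sync only when the dry run shows no deletions,
# or when the caller passes "force" as the third argument.
guarded_sync() {
    src="$1"
    dst="$2"
    mode="${3:-}"
    deletes="$(planned_deletes "$src" "$dst")"
    if [ -n "$deletes" ] && [ "$mode" != "force" ]; then
        echo "Refusing to sync; these files would be deleted:" >&2
        echo "$deletes" >&2
        return 1
    fi
    aws s3 sync "$src" "$dst" --delete
}
```

For example, `guarded_sync s3://example-bucket/backup ./data` (placeholder names) would stop and list any local files missing from the bucket, which is the case where newly created local files would otherwise be removed. Passing `force` as a third argument runs the sync once the list has been reviewed.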

