How to Zip Files in an Amazon S3 Bucket and Get a Download URL
Introduction
Amazon S3 stores objects, but it does not provide a built-in API that takes several existing objects and returns a new zip archive for you. The usual pattern is to read the objects in application code, create the zip file in a compute environment such as Lambda, EC2, or a local script, upload the archive back to S3, and then generate a presigned URL for download.
The Basic Workflow
A typical implementation has four steps:
- List the S3 objects you want to include.
- Download each object stream.
- Write those streams into a zip archive.
- Upload the archive and generate a presigned URL.
In Python with boto3, that looks like this:
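A minimal sketch of those four steps, assuming the caller passes in a client created with `boto3.client("s3")`. The function name and the bucket, prefix, and key values are illustrative and not part of any AWS API.

```python
import io
import zipfile


def zip_prefix_and_presign(s3, bucket, prefix, zip_key, expires_in=3600):
    """Zip every object under `prefix`, upload the archive, return a presigned URL.

    `s3` is assumed to be a boto3 S3 client, e.g. boto3.client("s3").
    """
    buffer = io.BytesIO()
    paginator = s3.get_paginator("list_objects_v2")

    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        # Step 1: list the objects (paginated, since one page holds at most 1000 keys).
        for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
            for obj in page.get("Contents", []):
                key = obj["Key"]
                # Skip folder placeholder keys and the archive itself.
                if key.endswith("/") or key == zip_key:
                    continue
                # Step 2: download the object body.
                body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
                # Step 3: write it into the archive, with the prefix stripped from the name.
                archive.writestr(key[len(prefix):] or key, body)

    # Step 4: upload the finished archive, then sign a GET for that key.
    buffer.seek(0)
    s3.upload_fileobj(
        buffer, bucket, zip_key, ExtraArgs={"ContentType": "application/zip"}
    )
    return s3.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": zip_key},
        ExpiresIn=expires_in,  # seconds; 3600 = one hour
    )
```

A call might look like `zip_prefix_and_presign(boto3.client("s3"), "my-bucket", "reports/", "archives/reports.zip")`, where all three names are placeholders.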
That creates the zip in memory, uploads it, and returns a download URL valid for one hour.
Choosing Where to Run the Zip Step
The algorithm is simple, but the runtime environment matters.
- Use AWS Lambda for small or medium archives and event-driven workflows.
- Use ECS, EC2, or a container job for large archives or long-running jobs.
- Use a local or backend server process if the archive is part of an existing application flow.
Lambda is convenient, but it has memory, runtime, and temporary-storage limits. Very large archives often fit better in a containerized worker.
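One way to inform that choice is to total the source object sizes before zipping. The sketch below assumes `s3` is a boto3 S3 client; the 256 MiB threshold and both function names are hypothetical, so pick a limit that matches your own runtime's configured memory and storage.

```python
# Hypothetical cut-off for building the archive in memory; tune it to your runtime.
IN_MEMORY_LIMIT_BYTES = 256 * 1024 * 1024


def total_size_bytes(s3, bucket, prefix):
    """Sum the sizes of all objects under `prefix` (a rough estimate of archive input)."""
    paginator = s3.get_paginator("list_objects_v2")
    total = 0
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        total += sum(obj["Size"] for obj in page.get("Contents", []))
    return total


def suggest_strategy(total_bytes, in_memory_limit=IN_MEMORY_LIMIT_BYTES):
    """Return a coarse hint: small inputs can be zipped in memory, larger ones should not."""
    if total_bytes <= in_memory_limit:
        return "in-memory"
    return "temp-file-or-container"
```

The total counts uncompressed bytes, so the finished archive is often smaller, but it is a safer number to plan against than a guess at the compression ratio.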
Presigned URL Details
The presigned URL is only a signed temporary permission to download the uploaded zip object. It does not create the file for you. That means you must upload the archive first, then sign that object key.
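Because signing is computed locally and does not check that the key exists, a URL for a missing object is generated without complaint and only fails when someone tries to download it. A small sketch of guarding against that, assuming `s3` is a boto3 S3 client and using a hypothetical helper name:

```python
def presign_existing_object(s3, bucket, key, expires_in=3600):
    """Confirm the object exists, then return (presigned_url, size_in_bytes)."""
    # head_object raises an error (botocore's ClientError with real boto3)
    # when the key is missing, so we never sign a URL for a nonexistent archive.
    head = s3.head_object(Bucket=bucket, Key=key)
    url = s3.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_in,
    )
    return url, head["ContentLength"]
```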
If you want a friendlier download name, set ContentDisposition when uploading:
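A minimal sketch, assuming `s3` is a boto3 S3 client and `fileobj` is a binary file-like object positioned at the start of the archive. The helper name and the `download_name` parameter are illustrative.

```python
def upload_zip_with_download_name(s3, fileobj, bucket, zip_key, download_name):
    """Upload a zip archive with metadata that suggests a file name, then sign it."""
    s3.upload_fileobj(
        fileobj,
        bucket,
        zip_key,
        ExtraArgs={
            "ContentType": "application/zip",
            # Stored as object metadata and returned as a response header on download.
            "ContentDisposition": 'attachment; filename="%s"' % download_name,
        },
    )
    # Sign only after the upload has completed.
    return s3.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": zip_key},
        ExpiresIn=3600,
    )
```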
That hints to browsers how the file should be downloaded.
Streaming vs In-Memory Archives
The in-memory BytesIO approach is convenient, but it is not ideal for very large datasets. If the archive may become large, write to a temporary file instead of holding the whole zip in memory.
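The temporary-file variant can be sketched as follows, assuming `s3` is a boto3 S3 client and the runtime has enough local disk for the finished archive. The function name is illustrative; each object is copied into the archive in chunks rather than read whole.

```python
import shutil
import tempfile
import zipfile


def zip_prefix_via_tempfile(s3, bucket, prefix, zip_key):
    """Build the archive in a temporary file on disk, then upload it to `zip_key`."""
    paginator = s3.get_paginator("list_objects_v2")
    with tempfile.TemporaryFile() as tmp:
        with zipfile.ZipFile(tmp, "w", zipfile.ZIP_DEFLATED) as archive:
            for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
                for obj in page.get("Contents", []):
                    key = obj["Key"]
                    # Skip folder placeholder keys and the archive itself.
                    if key.endswith("/") or key == zip_key:
                        continue
                    body = s3.get_object(Bucket=bucket, Key=key)["Body"]
                    # force_zip64 allows entries larger than 2 GiB of unknown size.
                    with archive.open(key, "w", force_zip64=True) as entry:
                        # Copy in 1 MiB chunks so no object is held fully in memory.
                        shutil.copyfileobj(body, entry, 1024 * 1024)
        # The ZipFile is closed here, so the central directory has been written.
        tmp.seek(0)
        s3.upload_fileobj(
            tmp, bucket, zip_key, ExtraArgs={"ContentType": "application/zip"}
        )
    return zip_key
```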
That pattern is more memory-efficient and often a better fit for server-side jobs.
Common Pitfalls
- Assuming S3 can zip existing objects by itself without any compute layer.
- Generating a presigned URL before the archive has actually been uploaded to the expected key.
- Pulling large source objects into memory when the job should really use temporary files or streamed processing.
- Forgetting to set the right content type or content disposition for the uploaded archive.
- Choosing Lambda for archives that are too large for its memory, storage, or runtime limits.
Summary
- S3 does not natively zip multiple existing objects into one archive.
- Read the objects in code, create the zip, upload it, then generate a presigned URL.
- boto3 plus Python's zipfile module is a straightforward implementation path.
- Use in-memory archives only for smaller payloads.
- For large archives, choose a compute environment and storage strategy that matches the file sizes.

