How to Zip Files in an Amazon S3 Bucket and Get Its URL

Introduction

Amazon S3 stores objects, but it does not provide a built-in API that takes several existing objects and returns a new zip archive for you. The usual pattern is to read the objects in application code, create the zip file in a compute environment such as Lambda, EC2, or a local script, upload the archive back to S3, and then generate a presigned URL for download.

The Basic Workflow

A typical implementation has four steps:

List the S3 objects you want to include.
Download each object stream.
Write those streams into a zip archive.
Upload the archive and generate a presigned URL.

In Python with boto3, that looks like this:

python

import io
import zipfile

import boto3

s3 = boto3.client("s3")

source_bucket = "my-source-bucket"
keys = ["reports/january.csv", "reports/february.csv"]
zip_key = "archives/reports-2026-03.zip"

# Build the archive in memory; suitable only for small payloads.
buffer = io.BytesIO()

with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as zf:
    for key in keys:
        response = s3.get_object(Bucket=source_bucket, Key=key)
        data = response["Body"].read()
        # Use the basename as the entry name. Keys that share a basename
        # would end up as duplicate entries in the archive.
        filename = key.split("/")[-1]
        zf.writestr(filename, data)

# Upload only after the "with" block has closed, because closing the
# ZipFile is what writes the archive's central directory.
s3.put_object(
    Bucket=source_bucket,
    Key=zip_key,
    Body=buffer.getvalue(),
    ContentType="application/zip",
)

# Sign a temporary GET request for the uploaded object (3600 s = 1 hour).
url = s3.generate_presigned_url(
    "get_object",
    Params={"Bucket": source_bucket, "Key": zip_key},
    ExpiresIn=3600,
)

print(url)

That creates the zip in memory, uploads it, and returns a download URL valid for one hour.
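
The example hard-codes `keys`. The first workflow step, listing the objects, could be sketched with the `list_objects_v2` paginator as below. The helper names `list_keys` and `archive_name` are hypothetical, and `s3` is assumed to be a boto3 S3 client passed in by the caller. `archive_name` keeps the path below the prefix so that two keys with the same basename do not collide inside the archive.

```python
def list_keys(s3, bucket, prefix):
    """Return every object key under `prefix`, following pagination."""
    keys = []
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # "Contents" is absent from a page when nothing matches.
        for obj in page.get("Contents", []):
            # Skip "folder" placeholder keys that end with a slash.
            if not obj["Key"].endswith("/"):
                keys.append(obj["Key"])
    return keys


def archive_name(key, prefix):
    """Name a key relative to the directory part of `prefix`."""
    # Drop everything up to and including the last slash of the prefix.
    cut = prefix.rfind("/") + 1
    return key[cut:] if key.startswith(prefix) else key
```

With these helpers, `keys = list_keys(s3, source_bucket, "reports/")` would replace the hard-coded list, and `zf.writestr(archive_name(key, "reports/"), data)` would replace the basename logic.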

Choosing Where to Run the Zip Step

The algorithm is simple, but the runtime environment matters.

Use AWS Lambda for small or medium archives and event-driven workflows.
Use ECS, EC2, or a container job for large archives or long-running jobs.
Use a local or backend server process if the archive is part of an existing application flow.

Lambda is convenient, but it caps execution time at 15 minutes and memory at 10,240 MB, and its /tmp ephemeral storage defaults to 512 MB (configurable up to 10,240 MB). Very large archives often fit better in a containerized worker.
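
One way to inform that choice is to add up the source object sizes before downloading anything. The sketch below uses `head_object`, which returns `ContentLength` without transferring the body. The function names and the `safety_factor` heuristic are assumptions made for illustration, not AWS guidance, and `s3` is assumed to be a boto3 S3 client.

```python
def total_source_bytes(s3, bucket, keys):
    """Sum the sizes of the source objects without downloading them."""
    return sum(
        s3.head_object(Bucket=bucket, Key=key)["ContentLength"] for key in keys
    )


def fits_in_memory(total_bytes, memory_limit_bytes, safety_factor=3):
    """Rough check: leave headroom for the object, the zip, and its copy.

    The factor of 3 is an illustrative assumption; tune it for your data.
    """
    return total_bytes * safety_factor <= memory_limit_bytes
```

If `fits_in_memory` returns `False` for the memory configured on the function or container, a temporary-file or streamed approach is the safer default.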

Presigned URL Details

The presigned URL is only a signed temporary permission to download the uploaded zip object. It does not create the file for you. That means you must upload the archive first, then sign that object key.
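
Because signing is a local operation that does not contact S3, a URL can be generated for a key that does not exist yet. A minimal sketch of a guard, assuming a boto3 S3 client is passed in as `s3` and using the hypothetical helper name `presign_if_exists`:

```python
def presign_if_exists(s3, bucket, key, expires_in=3600):
    """Return a presigned GET URL, or None when the object is missing."""
    try:
        # HEAD is cheap and confirms the upload landed at this key.
        s3.head_object(Bucket=bucket, Key=key)
    except s3.exceptions.ClientError as err:
        if err.response["Error"]["Code"] in ("404", "NoSuchKey", "NotFound"):
            return None
        # Anything else (for example 403 from missing permissions) is
        # re-raised so the caller can see it.
        raise
    return s3.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_in,
    )
```

Note that S3 can answer 403 instead of 404 for a missing key when the caller lacks list permission on the bucket, so this sketch treats only an explicit not-found response as "missing".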

If you want a friendlier download name, set ContentDisposition when uploading:

python

s3.put_object(
    Bucket=source_bucket,
    Key=zip_key,
    Body=buffer.getvalue(),
    ContentType="application/zip",
    ContentDisposition='attachment; filename="reports-2026-03.zip"',
)

That hints to browsers how the file should be downloaded.
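
If the archive is already uploaded, the download name can also be set at signing time through the `ResponseContentDisposition` parameter of `get_object`, which overrides the stored header for that signed request only. A sketch, assuming a boto3 S3 client is passed in as `s3`, a plain ASCII file name without quotes, and the hypothetical helper name `presign_with_filename`:

```python
def presign_with_filename(s3, bucket, key, download_name, expires_in=3600):
    """Presign a GET whose response carries a Content-Disposition header."""
    return s3.generate_presigned_url(
        "get_object",
        Params={
            "Bucket": bucket,
            "Key": key,
            # Applies to this signed URL only; the stored object is unchanged.
            "ResponseContentDisposition": f'attachment; filename="{download_name}"',
        },
        ExpiresIn=expires_in,
    )
```

This is handy when the same object should be offered under different names to different users.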

Streaming vs In-Memory Archives

The in-memory BytesIO approach is convenient, but it is not ideal for very large datasets: the buffer holds the entire archive, and getvalue() returns a second copy of it as bytes. If the archive may become large, write to a temporary file instead of holding the whole zip in memory.

python

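import os
import shutil
import tempfile
import zipfile

import boto3

s3 = boto3.client("s3")

# Reserve a path on disk; ZipFile reopens the file by name below.
with tempfile.NamedTemporaryFile(delete=False, suffix=".zip") as temp_file:
    temp_path = temp_file.name

try:
    with zipfile.ZipFile(temp_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        response = s3.get_object(Bucket="my-source-bucket", Key="reports/january.csv")
        # Copy the body in chunks instead of read()-ing the whole object.
        # force_zip64 allows entries whose final size exceeds 2 GiB.
        with zf.open("january.csv", "w", force_zip64=True) as entry:
            shutil.copyfileobj(response["Body"], entry)

    with open(temp_path, "rb") as archive:
        s3.upload_fileobj(
            archive,
            "my-source-bucket",
            "archives/reports.zip",
            ExtraArgs={"ContentType": "application/zip"},
        )
finally:
    # Remove the temporary file even if the download or upload fails.
    os.remove(temp_path)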

That pattern is more memory-efficient and often a better fit for server-side jobs.

Common Pitfalls

Assuming S3 can zip existing objects by itself without any compute layer.
Generating a presigned URL before the archive has actually been uploaded to the expected key.
Pulling large source objects into memory when the job should really use temporary files or streamed processing.
Forgetting to set the right content type or content disposition for the uploaded archive.
Choosing Lambda for archives that are too large for its memory, storage, or runtime limits.

Summary

S3 does not natively zip multiple existing objects into one archive.
Read the objects in code, create the zip, upload it, then generate a presigned URL.
boto3 plus Python's zipfile module is a straightforward implementation path.
Use in-memory archives only for smaller payloads.
For large archives, choose a compute environment and storage strategy that matches the file sizes.