Download file from AWS S3 using Python
Introduction
The standard way to download files from Amazon S3 using Python is the Boto3 library, the official AWS SDK for Python. Boto3 provides two interfaces: a low-level client API and a higher-level, object-oriented resource API. Both expose download_file() and download_fileobj(). For most use cases, s3.download_file(bucket, key, local_path) on a client is the simplest approach. For objects above the multipart threshold (8 MB by default), Boto3 automatically splits the download into parts fetched in parallel to speed up the transfer.
Prerequisites
Install Boto3 with pip install boto3 and store credentials locally with aws configure. Credentials can also be set via environment variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY) or supplied by an IAM role attached to an EC2 instance.
Basic Download with Client
download_file() handles the entire transfer, including multipart downloads for large files. The Key is the full S3 object path (not including the bucket name).
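A minimal sketch of the client-based download described above. The helper name download_object and the bucket, key, and path values in the usage note are placeholders, not part of Boto3; only client.download_file(Bucket, Key, Filename) is the library call.

```python
def download_object(bucket, key, local_path, s3_client=None):
    """Download one S3 object to local_path with the low-level client API."""
    if s3_client is None:
        # Imported lazily so a stub client can be passed in for testing.
        import boto3
        s3_client = boto3.client("s3")
    # Positional arguments map to Bucket, Key, Filename.
    # key is the object path inside the bucket, without the bucket name.
    s3_client.download_file(bucket, key, local_path)
    return local_path
```

Assuming a bucket named my-bucket exists and credentials are configured, download_object("my-bucket", "data/file.csv", "file.csv") would write the object to file.csv in the current directory.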
Download with Resource API
The resource API provides a more object-oriented interface. Both approaches call the same underlying S3 API.
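The same download through the resource API might look like the sketch below. download_via_resource is a hypothetical helper name; Bucket(name).download_file(Key, Filename) is the resource-level call it wraps.

```python
def download_via_resource(bucket, key, local_path, s3_resource=None):
    """Download one S3 object using the object-oriented resource API."""
    if s3_resource is None:
        # Imported lazily so a stub resource can be passed in for testing.
        import boto3
        s3_resource = boto3.resource("s3")
    # The bucket is an object here, so only Key and Filename are passed.
    s3_resource.Bucket(bucket).download_file(key, local_path)
    return local_path
```

The bucket name moves from an argument into the Bucket object, which is the main visible difference from the client version.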
Download to Memory (BytesIO)
download_fileobj() writes to any file-like object. get_object() returns a streaming response body that you read directly.
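Both in-memory approaches can be sketched as follows. The function names download_to_memory and read_object_bytes are illustrative; the Boto3 calls used are download_fileobj(Bucket, Key, Fileobj) and get_object(Bucket=..., Key=...).

```python
import io


def _default_client():
    # Imported lazily so a stub client can be passed in for testing.
    import boto3
    return boto3.client("s3")


def download_to_memory(bucket, key, s3_client=None):
    """Return the object's contents as a BytesIO positioned at the start."""
    s3_client = s3_client or _default_client()
    buffer = io.BytesIO()
    s3_client.download_fileobj(bucket, key, buffer)
    buffer.seek(0)  # rewind so callers can read from the beginning
    return buffer


def read_object_bytes(bucket, key, s3_client=None):
    """Return the object's contents as bytes via get_object()."""
    s3_client = s3_client or _default_client()
    response = s3_client.get_object(Bucket=bucket, Key=key)
    # response["Body"] is a streaming body; read() pulls it all into memory.
    return response["Body"].read()
```

Either result can be handed to a parser that accepts bytes or a file-like object. Both hold the whole object in RAM, so they suit small to medium files.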
Download with Progress Callback
The Callback parameter is called with the number of bytes transferred in each chunk, allowing you to display progress.
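One way to turn those per-chunk byte counts into a percentage is a small callable class, sketched below. ProgressTracker and download_with_progress are hypothetical names. The total size is fetched with head_object, and a lock is used because the callback may be invoked from several transfer threads.

```python
import threading


class ProgressTracker:
    """Callable that accumulates transferred bytes and prints progress."""

    def __init__(self, total_bytes):
        self.total_bytes = total_bytes
        self.seen = 0
        self._lock = threading.Lock()

    def __call__(self, bytes_amount):
        # bytes_amount is the size of the chunk just transferred.
        with self._lock:
            self.seen += bytes_amount
            pct = 100 * self.seen / self.total_bytes if self.total_bytes else 100.0
            print(f"\r{self.seen}/{self.total_bytes} bytes ({pct:.1f}%)", end="", flush=True)


def download_with_progress(bucket, key, local_path, s3_client=None):
    """Download an object while printing progress; returns bytes transferred."""
    if s3_client is None:
        # Imported lazily so a stub client can be passed in for testing.
        import boto3
        s3_client = boto3.client("s3")
    total = s3_client.head_object(Bucket=bucket, Key=key)["ContentLength"]
    tracker = ProgressTracker(total)
    s3_client.download_file(bucket, key, local_path, Callback=tracker)
    return tracker.seen
```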
Download Multiple Files
Use list_objects_v2 with a paginator to handle prefixes with more than 1,000 objects.
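A possible shape for a prefix download is sketched here. download_prefix is an illustrative helper, and mirroring the key structure under dest_dir is a design assumption rather than something Boto3 does for you.

```python
import os


def download_prefix(bucket, prefix, dest_dir, s3_client=None):
    """Download every object under prefix into dest_dir, mirroring key paths."""
    if s3_client is None:
        # Imported lazily so a stub client can be passed in for testing.
        import boto3
        s3_client = boto3.client("s3")
    paginator = s3_client.get_paginator("list_objects_v2")
    downloaded = []
    # Each page holds at most 1,000 keys; the paginator follows the
    # continuation tokens until the listing is exhausted.
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            key = obj["Key"]
            if key.endswith("/"):
                continue  # skip zero-byte "folder" placeholder objects
            local_path = os.path.join(dest_dir, *key.split("/"))
            os.makedirs(os.path.dirname(local_path), exist_ok=True)
            s3_client.download_file(bucket, key, local_path)
            downloaded.append(local_path)
    return downloaded
```

This sketch trusts the key names. If keys can contain ".." segments, validate the resolved path before writing.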
Download with Custom Configuration
TransferConfig controls multipart thresholds and concurrency. Increasing max_concurrency speeds up large file downloads on fast networks.
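The sketch below shows how such a configuration could be passed to download_file(). The values in TRANSFER_SETTINGS are arbitrary illustrations, not recommendations, and download_tuned is a hypothetical helper. TransferConfig is imported from boto3.s3.transfer.

```python
MB = 1024 * 1024

# Illustrative tuning values only; measure on your own network before adopting.
TRANSFER_SETTINGS = {
    "multipart_threshold": 16 * MB,  # objects above this size are split into parts
    "multipart_chunksize": 16 * MB,  # size of each part
    "max_concurrency": 20,           # maximum number of transfer threads
    "use_threads": True,
}


def download_tuned(bucket, key, local_path, s3_client=None, settings=None):
    """Download an object with a custom TransferConfig."""
    # Imported lazily so the settings above can be inspected without boto3.
    from boto3.s3.transfer import TransferConfig

    if s3_client is None:
        import boto3
        s3_client = boto3.client("s3")
    config = TransferConfig(**(settings or TRANSFER_SETTINGS))
    s3_client.download_file(bucket, key, local_path, Config=config)
    return local_path
```

Higher concurrency mainly helps when bandwidth, not CPU or disk, is the bottleneck. On slow links it can add overhead without speeding anything up.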
Generate Presigned URL (No SDK Needed on Client)
Presigned URLs let users download files without AWS credentials. Useful for sharing files temporarily or serving downloads from a web application.
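A rough sketch of both sides of that flow: the server signs a URL with generate_presigned_url(), and the recipient fetches it with nothing but the standard library. The helper names presigned_download_url and fetch_url, and the one-hour default expiry, are assumptions for illustration.

```python
import shutil
import urllib.request


def presigned_download_url(bucket, key, expires_in=3600, s3_client=None):
    """Return a time-limited GET URL for the object (expiry in seconds)."""
    if s3_client is None:
        # Imported lazily so a stub client can be passed in for testing.
        import boto3
        s3_client = boto3.client("s3")
    return s3_client.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_in,
    )


def fetch_url(url, local_path):
    """Client side: stream any URL to disk without the AWS SDK."""
    with urllib.request.urlopen(url) as response, open(local_path, "wb") as out:
        shutil.copyfileobj(response, out)
    return local_path
```

Anyone holding the URL can download the object until it expires, so keep expiry times short and treat the URL like a temporary secret.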
Common Pitfalls
- Forgetting to create local directories: download_file() does not create parent directories. If the local path is /tmp/data/report.csv and /tmp/data/ does not exist, the call raises FileNotFoundError. Use os.makedirs(os.path.dirname(path), exist_ok=True).
- Confusing Key with the full S3 URI: The Key parameter is just the object path within the bucket (e.g., data/file.csv), not s3://bucket/data/file.csv. Do not include the bucket name or the s3:// prefix in the key.
- Not handling ClientError for missing objects: Downloading a non-existent key raises botocore.exceptions.ClientError with error code 404. Wrap downloads in a try/except block or check existence with head_object first.
- Using get_object for large files: Calling .read() on the get_object() response body loads the whole object into memory. For files larger than available RAM, use download_file() or download_fileobj(), which stream the data in chunks.
- Assuming credentials are always in ~/.aws/credentials: On EC2 instances, ECS tasks, or Lambda functions, credentials come from IAM roles, so no credentials file exists. Boto3 checks the credential chain automatically (env vars, config file, IAM role, etc.).
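The first three pitfalls can be guarded against with a small wrapper, sketched below under a few assumptions: split_s3_uri and safe_download are hypothetical helpers, and the error is caught through client.exceptions.ClientError, which refers to the same class as botocore.exceptions.ClientError. The sketch also treats NoSuchKey as a missing object, since that is the code get_object reports.

```python
import os


def split_s3_uri(uri):
    """Split 's3://bucket/path/to/key' into (bucket, key)."""
    scheme = "s3://"
    if not uri.startswith(scheme):
        raise ValueError(f"not an S3 URI: {uri!r}")
    bucket, _, key = uri[len(scheme):].partition("/")
    if not bucket or not key:
        raise ValueError(f"S3 URI needs both bucket and key: {uri!r}")
    return bucket, key


def safe_download(bucket, key, local_path, s3_client=None):
    """Download an object; return False if it does not exist, True otherwise."""
    if s3_client is None:
        # Imported lazily so a stub client can be passed in for testing.
        import boto3
        s3_client = boto3.client("s3")
    parent = os.path.dirname(local_path)
    if parent:
        os.makedirs(parent, exist_ok=True)  # download_file() will not create it
    try:
        s3_client.download_file(bucket, key, local_path)
    except s3_client.exceptions.ClientError as err:
        if err.response["Error"]["Code"] in ("404", "NoSuchKey"):
            return False  # missing object
        raise  # permissions, throttling, etc. should still surface
    return True
```

With these helpers, bucket, key = split_s3_uri("s3://bucket/data/file.csv") yields a key that is safe to pass as Key, and safe_download(bucket, key, "/tmp/data/report.csv") creates /tmp/data/ if needed.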
Summary
- Use s3.download_file(bucket, key, local_path) for the simplest file download
- Use download_fileobj() or get_object() to download into memory or a file-like object
- Boto3 automatically handles multipart downloads for large files; customize with TransferConfig
- Use a paginator with list_objects_v2 to download all files under a prefix
- Generate presigned URLs for temporary, credential-free download links
- Always handle missing objects with try/except and create local directories before downloading

