How to create an S3 object with metadata using Boto3

Introduction

When uploading an object to Amazon S3 with Boto3, you can attach custom metadata as key-value pairs. The important detail is that metadata must be provided at upload time or during a copy operation, because S3 has no operation that edits an existing object's metadata in place.

Use put_object With Metadata

The most direct way to create an object with metadata is put_object.

python
import boto3

s3 = boto3.client("s3")

# Metadata is a dict of string keys to string values.
response = s3.put_object(
    Bucket="my-bucket",
    Key="reports/example.txt",
    Body=b"hello from boto3",
    Metadata={
        "source": "batch-job",
        "owner": "analytics",
    },
    ContentType="text/plain",
)

# The ETag in the response confirms the object was written.
print(response["ETag"])

The Metadata dictionary contains user-defined metadata. S3 stores those values as strings.

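If your application logic starts with numbers, booleans, or timestamps, convert them to strings before upload. Boto3 validates the Metadata argument and rejects values that are not strings, and an explicit conversion also makes the intended format clear when you read the values back later.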
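The conversion can be kept in one small helper. The following is a minimal sketch, not part of Boto3: the function name to_metadata and the example keys are hypothetical, and the chosen string formats (lowercase booleans, ISO 8601 timestamps) are just one reasonable convention.

```python
from datetime import datetime, timezone


def to_metadata(values):
    """Convert a dict of Python values into a str-to-str dict for Metadata."""
    result = {}
    for key, value in values.items():
        if isinstance(value, bool):
            # Handle bool first so True becomes "true" rather than "True".
            result[key] = "true" if value else "false"
        elif isinstance(value, datetime):
            # ISO 8601 keeps timestamps unambiguous and easy to parse back.
            result[key] = value.isoformat()
        else:
            result[key] = str(value)
    return result


metadata = to_metadata({
    "row-count": 1250,
    "validated": True,
    "exported-at": datetime(2024, 1, 15, 9, 30, tzinfo=timezone.utc),
})
print(metadata)
```

The resulting dictionary can be passed directly as Metadata=metadata to put_object. Whatever format you pick, the reader of the metadata has to apply the reverse conversion, so it helps to keep both directions in the same module.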

Use upload_file for Files on Disk

If you are uploading a local file, the higher-level upload_file method is often more convenient. Metadata is passed through ExtraArgs.

python
import boto3

s3 = boto3.client("s3")

# upload_file takes per-object options, including Metadata, via ExtraArgs.
s3.upload_file(
    Filename="example.txt",
    Bucket="my-bucket",
    Key="uploads/example.txt",
    ExtraArgs={
        "Metadata": {
            "source": "desktop-tool",
            "team": "platform",
        },
        "ContentType": "text/plain",
    },
)

This is usually the easiest API when your data already exists as a file on disk.

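Keep in mind that metadata and object tags are different S3 features. Metadata travels with the object as headers (user-defined keys are sent with the x-amz-meta- prefix) and is fixed when the object is written, while tags are managed as a separate tag set that can be changed on an existing object.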
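To make the difference concrete, here is a small sketch of the two argument shapes. The helper to_tag_set and the tag names are hypothetical; the S3 calls appear only in comments so the snippet runs without AWS credentials.

```python
def to_tag_set(tags):
    """Build the TagSet list used by put_object_tagging from a plain dict."""
    return [{"Key": key, "Value": value} for key, value in tags.items()]


# Metadata is a flat str-to-str dict supplied when the object is written.
metadata = {"source": "desktop-tool", "team": "platform"}

# Tags use a list of Key/Value pairs wrapped in a TagSet.
tagging = {"TagSet": to_tag_set({"retention": "30d", "project": "alpha"})}

# Metadata goes in with the upload:
#   s3.upload_file(Filename="example.txt", Bucket="my-bucket",
#                  Key="uploads/example.txt", ExtraArgs={"Metadata": metadata})
# Tags can be set later on the existing object with a separate call:
#   s3.put_object_tagging(Bucket="my-bucket", Key="uploads/example.txt",
#                         Tagging=tagging)
print(tagging)
```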

Read the Metadata Back

To inspect object metadata without downloading the whole object, use head_object.

python
import boto3

s3 = boto3.client("s3")

# head_object returns headers and metadata only, never the object body.
info = s3.head_object(Bucket="my-bucket", Key="uploads/example.txt")
print(info["Metadata"])
print(info["ContentType"])

That is useful for validation, debugging, and object-processing pipelines that route work based on metadata.

For example, an ingestion worker may inspect the source or document-type metadata first and then decide which parsing path to use without downloading the full object body.
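A routing step like that might look as follows. This is a sketch under assumptions: the document-type key, the parser names, and the functions choose_parser and route_object are all hypothetical. The client is passed in as a parameter so the routing logic can be exercised without touching S3.

```python
# Hypothetical mapping from a document-type metadata value to a parser name.
PARSERS = {
    "invoice": "parse_invoice",
    "report": "parse_report",
}


def choose_parser(metadata, default="parse_generic"):
    """Pick a parser name from the Metadata dict returned by head_object."""
    doc_type = metadata.get("document-type", "").lower()
    return PARSERS.get(doc_type, default)


def route_object(s3_client, bucket, key):
    """Inspect metadata with head_object and return the chosen parser name."""
    info = s3_client.head_object(Bucket=bucket, Key=key)
    return choose_parser(info["Metadata"])


# The decision logic works on a plain dict, so it is easy to try locally.
print(choose_parser({"document-type": "invoice", "source": "batch-job"}))
print(choose_parser({"source": "desktop-tool"}))
```

In a real worker you would call route_object(boto3.client("s3"), bucket, key) and only fetch the body once the parser is known.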

Metadata Update Means Copy

One detail that surprises many people is that S3 metadata is not edited in place. To change metadata, you typically copy the object onto itself and replace the metadata during that copy.

python
import boto3

s3 = boto3.client("s3")

# Copy the object onto itself; REPLACE makes S3 use the metadata given here.
s3.copy_object(
    Bucket="my-bucket",
    Key="uploads/example.txt",
    CopySource={"Bucket": "my-bucket", "Key": "uploads/example.txt"},
    Metadata={"source": "reprocessed", "team": "platform"},
    MetadataDirective="REPLACE",
    # Restate ContentType so it is not lost when metadata is replaced.
    ContentType="text/plain",
)

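Without MetadataDirective="REPLACE", S3 falls back to the default COPY directive, which keeps the source object's metadata and ignores the Metadata you pass. With REPLACE, the supplied metadata replaces the old set entirely rather than merging with it, so any key you leave out is dropped.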
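Because the copy rewrites the whole metadata set, a common approach is to read the current values first and merge your changes into them. The sketch below assumes that pattern; build_copy_args and update_metadata are hypothetical helpers, and the list of headers carried over is illustrative rather than exhaustive.

```python
# Headers returned by head_object that copy_object also accepts.
# Illustrative list; extend it to cover whatever your objects rely on.
PRESERVED_HEADERS = (
    "ContentType",
    "CacheControl",
    "ContentEncoding",
    "ContentDisposition",
    "ContentLanguage",
)


def build_copy_args(bucket, key, head, updates):
    """Build copy_object arguments that merge updates into existing metadata."""
    merged = {**head.get("Metadata", {}), **updates}
    args = {
        "Bucket": bucket,
        "Key": key,
        "CopySource": {"Bucket": bucket, "Key": key},
        "Metadata": merged,
        "MetadataDirective": "REPLACE",
    }
    # Carry over the headers that are present on the current object.
    for header in PRESERVED_HEADERS:
        if header in head:
            args[header] = head[header]
    return args


def update_metadata(s3_client, bucket, key, updates):
    """Read current metadata, then copy the object onto itself with changes."""
    head = s3_client.head_object(Bucket=bucket, Key=key)
    return s3_client.copy_object(**build_copy_args(bucket, key, head, updates))


# Try the merge logic with a dict shaped like a head_object response.
sample_head = {
    "Metadata": {"source": "desktop-tool", "team": "platform"},
    "ContentType": "text/plain",
}
copy_args = build_copy_args(
    "my-bucket", "uploads/example.txt", sample_head, {"source": "reprocessed"}
)
print(copy_args["Metadata"])
```

The read and the copy are two separate requests, so another writer could change the object in between. If that matters for your workload, treat this helper as a starting point only.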

Keep Metadata Small and Intentional

Metadata is useful for lightweight object context, such as source, owner, document type, or processing state. S3 limits user-defined metadata to 2 KB per object, and it is not a substitute for a database record or a complex search index.

Good metadata is:

  • short,
  • string-based,
  • easy to interpret consistently,
  • not sensitive unless your design explicitly allows it.

If you need richer querying, object tags or an external catalog may be a better fit.

It is also worth deciding on key naming conventions early. Consistent keys such as source, owner, and document-type are much easier to maintain across multiple uploaders than ad hoc per-script naming.
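These guidelines can be checked before an upload. The following sketch assumes a project convention of lowercase keys (S3 stores user-defined metadata keys in lowercase anyway) and uses the documented 2 KB limit, measured as the UTF-8 byte length of all keys and values. The function names are hypothetical, and the size check is an approximation of what S3 enforces, not a guarantee.

```python
MAX_METADATA_BYTES = 2 * 1024  # S3 limit for user-defined metadata


def metadata_size(metadata):
    """Sum the UTF-8 byte lengths of every key and value."""
    return sum(
        len(key.encode("utf-8")) + len(value.encode("utf-8"))
        for key, value in metadata.items()
    )


def check_metadata(metadata):
    """Return a list of problems; an empty list means the dict looks fine."""
    problems = []
    for key, value in metadata.items():
        if not isinstance(value, str):
            problems.append(
                f"{key}: value must be str, got {type(value).__name__}"
            )
        if key != key.lower():
            problems.append(f"{key}: use lowercase keys")
    # Only measure size once every value is known to be a string.
    if not problems and metadata_size(metadata) > MAX_METADATA_BYTES:
        problems.append(
            f"metadata is {metadata_size(metadata)} bytes, "
            f"limit is {MAX_METADATA_BYTES}"
        )
    return problems


print(check_metadata({"source": "batch-job", "owner": "analytics"}))
print(check_metadata({"Owner": "analytics", "row-count": 1250}))
```

Running a check like this in every uploader is one way to keep key names and value formats consistent across scripts.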

Common Pitfalls

  • Forgetting to pass metadata at upload time and assuming it can be edited later without a copy.
  • Storing non-string values and expecting S3 to preserve their original types.
  • Replacing metadata on copy and accidentally omitting headers such as ContentType.
  • Using metadata for large or sensitive payloads that belong elsewhere.
  • Downloading the object with get_object when head_object would be enough to inspect metadata.

Summary

  • Use Metadata={...} with put_object or ExtraArgs["Metadata"] with upload_file.
  • Read metadata back with head_object.
  • Updating S3 metadata usually requires copying the object with MetadataDirective="REPLACE".
  • Keep metadata small, string-based, and purposeful.
  • Preserve related headers such as ContentType when rewriting metadata.
