Getting image dimensions without reading the entire file

Introduction

You usually do not need to decode an entire image just to learn its width and height. Most image formats store dimensions in a header near the start of the file, so a program can read a small number of bytes, extract the metadata, and avoid unnecessary I/O and memory use.

How image formats store dimensions

The exact location depends on the format:

PNG stores width and height in the IHDR chunk near the start of the file
GIF stores them in the logical screen descriptor
JPEG stores them in a Start Of Frame segment, which may appear after several metadata segments

That means PNG and GIF are trivial to inspect, while JPEG requires scanning markers until the right segment is found.
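
To illustrate how simple the fixed-offset case is, here is a minimal sketch for GIF. It assumes the standard layout: a 6-byte signature followed by the logical screen descriptor, whose first four bytes are the width and height as little-endian 16-bit integers. The function name `gif_size` is hypothetical and not part of any library.

```python
import struct


def gif_size(path):
    """Return (width, height) of a GIF by reading its first 10 bytes."""
    with open(path, "rb") as f:
        header = f.read(10)

    # Bytes 0-5: signature, either "GIF87a" or "GIF89a".
    if len(header) < 10 or header[:6] not in (b"GIF87a", b"GIF89a"):
        raise ValueError("not a GIF file")

    # Bytes 6-9: logical screen width and height, little-endian uint16.
    width, height = struct.unpack("<HH", header[6:10])
    return width, height
```

Note that this returns the logical screen (canvas) size; individual frames inside an animated GIF may be smaller.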

A small Python parser for PNG and JPEG

The following example reads only as much of the file as it needs to find the dimensions of a PNG or JPEG:

python

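import struct

# JPEG Start Of Frame markers; 0xC4 (DHT), 0xC8 (JPG) and 0xCC (DAC) are not frames.
SOF_MARKERS = {0xC0, 0xC1, 0xC2, 0xC3, 0xC5, 0xC6, 0xC7,
               0xC9, 0xCA, 0xCB, 0xCD, 0xCE, 0xCF}
# Markers that stand alone and carry no length field: TEM and RST0-RST7.
STANDALONE_MARKERS = {0x01} | set(range(0xD0, 0xD8))


def read_exact(f, n):
    """Read exactly n bytes or raise ValueError on a truncated file."""
    data = f.read(n)
    if len(data) != n:
        raise ValueError("truncated file")
    return data


def image_size(path):
    """Return (width, height) by reading only the header bytes."""
    with open(path, "rb") as f:
        signature = f.read(24)

        # PNG: 8-byte signature, then the IHDR chunk (length, type, width, height).
        if signature.startswith(b"\x89PNG\r\n\x1a\n"):
            if len(signature) < 24 or signature[12:16] != b"IHDR":
                raise ValueError("invalid PNG")
            width, height = struct.unpack(">II", signature[16:24])
            return width, height

        # JPEG: walk the marker segments after SOI until a Start Of Frame.
        if signature[:2] == b"\xff\xd8":
            f.seek(2)
            while True:
                if read_exact(f, 1) != b"\xff":
                    raise ValueError("invalid JPEG")

                marker = read_exact(f, 1)[0]
                while marker == 0xFF:  # skip fill bytes before the marker code
                    marker = read_exact(f, 1)[0]

                if marker in STANDALONE_MARKERS:
                    continue
                if marker in (0xD9, 0xDA):  # EOI or SOS reached without a frame header
                    raise ValueError("JPEG has no Start Of Frame segment")

                length = struct.unpack(">H", read_exact(f, 2))[0]
                if length < 2:
                    raise ValueError("invalid JPEG segment length")

                if marker in SOF_MARKERS:
                    # Segment body: precision (1 byte), height (2), width (2).
                    _precision, height, width = struct.unpack(">BHH", read_exact(f, 5))
                    return width, height

                f.seek(length - 2, 1)  # skip the rest of this segment

        raise ValueError("unsupported format")


if __name__ == "__main__":
    print(image_size("photo.png"))
    print(image_size("photo.jpg"))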

This is much cheaper than loading pixel data into memory, especially when you only need layout information or validation.

Using a library can still be efficient

If you want fewer format details in your application code, Pillow is often enough. Image.open is lazy: it identifies the file and reads its header, but the pixel data is not decoded until you process the image or call load().

python

from PIL import Image

with Image.open("photo.jpg") as img:
    print(img.size)

This is a good tradeoff when you support several formats and care more about maintainability than about writing your own parser.

Why header-only reads matter

This optimization is useful in:

upload validation, where you reject images that exceed dimension limits
web servers that plan thumbnail sizes or emit responsive layout metadata
batch scanners that need to inspect many files quickly
mobile and serverless environments where memory pressure matters

The gain comes from avoiding full decode work, not just from saving a few disk reads. For large JPEGs, decoding can be far more expensive than reading the header.
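
As a rough sketch of the upload-validation case, the check below reads only the first 24 bytes of a binary stream and compares the PNG dimensions against a limit. The function name `png_within_limits` and the 4096-pixel limits are assumptions chosen for illustration, and the sketch handles PNG only.

```python
import io
import struct

MAX_WIDTH = 4096   # hypothetical limit for this example
MAX_HEIGHT = 4096  # hypothetical limit for this example


def png_within_limits(stream, max_width=MAX_WIDTH, max_height=MAX_HEIGHT):
    """Check PNG dimensions using only the first 24 bytes of a binary stream."""
    header = stream.read(24)
    if (len(header) < 24
            or not header.startswith(b"\x89PNG\r\n\x1a\n")
            or header[12:16] != b"IHDR"):
        raise ValueError("not a valid PNG header")

    width, height = struct.unpack(">II", header[16:24])
    # A width or height of zero is invalid in PNG, so reject it as well.
    return 0 < width <= max_width and 0 < height <= max_height


# An in-memory stand-in for an uploaded file: just enough bytes to hold IHDR.
fake_upload = io.BytesIO(
    b"\x89PNG\r\n\x1a\n" + struct.pack(">I", 13) + b"IHDR"
    + struct.pack(">II", 8000, 6000)
)
print(png_within_limits(fake_upload))  # False: 8000 exceeds the 4096 limit
```

Because the check needs only 24 bytes, it can run before the rest of the upload is buffered or decoded.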

Format-specific caveats

JPEG is the tricky case because the width and height are not always at a fixed offset. You must scan segments safely until you reach a Start Of Frame marker that contains the dimensions.
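
To make the variable offset concrete, the sketch below builds two synthetic JPEG prefixes in memory that differ only in the size of an APP1 metadata segment, then locates the SOF0 marker in each by walking segment lengths. The helper names `build_jpeg_header` and `sof0_offset` are hypothetical, the generated bytes are not a decodable image, and the walker handles only length-bearing segments.

```python
import struct


def build_jpeg_header(app1_payload_len, width, height):
    """Build a synthetic JPEG prefix: SOI, one APP1 segment, one SOF0 segment."""
    soi = b"\xff\xd8"
    # The 2-byte segment length counts itself but not the marker.
    app1 = (b"\xff\xe1" + struct.pack(">H", app1_payload_len + 2)
            + bytes(app1_payload_len))
    # SOF0 body: precision, height, width, component count (0 here for brevity).
    sof_body = struct.pack(">BHHB", 8, height, width, 0)
    sof0 = b"\xff\xc0" + struct.pack(">H", len(sof_body) + 2) + sof_body
    return soi + app1 + sof0


def sof0_offset(data):
    """Return the byte offset of the SOF0 marker by walking segment lengths."""
    pos = 2  # skip the SOI marker
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError("invalid JPEG")
        marker = data[pos + 1]
        if marker == 0xC0:
            return pos
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        pos += 2 + length  # marker bytes plus the segment itself
    raise ValueError("no SOF0 segment found")


small = build_jpeg_header(16, 640, 480)
large = build_jpeg_header(5000, 640, 480)
print(sof0_offset(small), sof0_offset(large))  # 22 5006
```

Both inputs describe the same 640x480 frame, yet the dimensions sit thousands of bytes apart, which is why a fixed-offset read cannot work for JPEG.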

TIFF and some modern formats are more complex because metadata can be nested or stored farther into the file. For those, a well-tested library is often safer than a hand-rolled parser unless you control the exact input set.

Common Pitfalls

Assuming every format stores dimensions at a fixed byte offset. JPEG does not.
Reading the whole file into memory first, which defeats the point of the optimization.
Confusing metadata parsing with full image decoding. They are not the same cost.
Writing a minimal parser and then silently accepting malformed files.
Ignoring image orientation metadata when your application cares about displayed dimensions rather than raw pixel dimensions.
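
The orientation pitfall can be sketched as a small adjustment step. EXIF Orientation values 5 through 8 involve a 90-degree rotation, so the displayed width and height are swapped relative to the stored pixel dimensions. The function name `displayed_size` is hypothetical, and the sketch assumes the orientation value has already been read from the file's EXIF data by other means.

```python
def displayed_size(width, height, orientation=1):
    """Return the size as displayed, given an EXIF Orientation value (1-8)."""
    # Values 5-8 involve a 90-degree rotation, so width and height swap.
    if orientation in (5, 6, 7, 8):
        return height, width
    return width, height


print(displayed_size(4000, 3000, orientation=6))  # (3000, 4000)
print(displayed_size(4000, 3000, orientation=1))  # (4000, 3000)
```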

Summary

Most image formats store dimensions in a header near the start of the file, so full decoding is unnecessary for width and height.
PNG and GIF are easy to inspect; JPEG needs marker scanning.
A small binary parser works well for known formats and controlled inputs.
Libraries such as Pillow often give you the same optimization with less maintenance.
Header-only reads are valuable in upload validation, batch processing, and memory-sensitive systems.