Getting file size in Python?

Introduction

Getting a file's size in Python is straightforward. The main choice is between pathlib's Path.stat() and the older os.path.getsize(), and it mostly depends on which style the surrounding code already uses. In both cases, Python asks the operating system for the file's metadata and reports the size in bytes.

The Most Common Approaches

If you are already using pathlib, Path.stat().st_size is the cleanest option:

python
from pathlib import Path

path = Path("example.txt")
size_in_bytes = path.stat().st_size
print(size_in_bytes)

If you are working with plain path strings, os.path.getsize works just as well:

python
import os

size_in_bytes = os.path.getsize("example.txt")
print(size_in_bytes)

Both return the size in bytes. os.path.getsize is a thin wrapper that returns os.stat(path).st_size, so the two approaches read the same underlying file metadata.
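To make the equivalence concrete, here is a minimal sketch that writes a small temporary file and reads its size three ways. The helper name sizes_three_ways and the temporary file are assumptions made for the demo, not part of the examples above.

```python
import os
import tempfile
from pathlib import Path


def sizes_three_ways(path: Path) -> tuple[int, int, int]:
    """Return the size of `path` via pathlib, os.path, and os.stat."""
    return (
        path.stat().st_size,    # pathlib
        os.path.getsize(path),  # os.path (accepts str or Path)
        os.stat(path).st_size,  # the lower-level call behind both
    )


# Demo on a throwaway file so the snippet does not depend on example.txt.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "example.txt"
    demo.write_bytes(b"hello world")  # exactly 11 bytes
    print(sizes_three_ways(demo))     # (11, 11, 11)
```

Because all three values come from the same lookup, choosing between them is a question of style rather than correctness.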

What stat() Gives You

stat() returns more than file size. It also provides timestamps, permission bits, and other metadata:

python
from pathlib import Path

path = Path("report.csv")
info = path.stat()  # one lookup; reuse the result below

print("size:", info.st_size)       # bytes
print("modified:", info.st_mtime)  # seconds since the Unix epoch
print("mode:", info.st_mode)       # file type and permission bits

If you need several pieces of metadata, calling stat() once and reusing the result is cleaner than several separate filesystem lookups.
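The raw st_mtime and st_mode values are not very readable on their own. As a sketch of reusing one stat() result, the hypothetical describe helper below converts them with the standard library's datetime and stat modules. The dictionary keys are illustrative choices.

```python
import stat
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def describe(path: Path) -> dict:
    """Collect several metadata fields from a single stat() call."""
    info = path.stat()  # one filesystem lookup, reused for every field
    return {
        "size_bytes": info.st_size,
        # st_mtime is seconds since the Unix epoch; convert it for display.
        "modified_utc": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc),
        # stat.filemode renders st_mode in `ls -l` style, such as "-rw-r--r--".
        "mode": stat.filemode(info.st_mode),
        "is_regular_file": stat.S_ISREG(info.st_mode),
    }


# Demo on a throwaway file so the snippet does not depend on report.csv.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "report.csv"
    demo.write_text("a,b\n1,2\n")
    for key, value in describe(demo).items():
        print(f"{key}: {value}")
```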

Handle Missing Files Explicitly

If the path does not exist, the metadata lookup raises FileNotFoundError:

python
from pathlib import Path

path = Path("missing.txt")

try:
    size_in_bytes = path.stat().st_size
except FileNotFoundError:
    print("file does not exist")
else:
    print(size_in_bytes)

That is better than assuming the file is present, especially when the path comes from user input or another system.
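When many call sites need the same check, the try/except can be wrapped once. The sketch below uses a hypothetical size_or_none helper. Returning None is only one possible policy, and it also treats PermissionError as "no size available", which is an assumption that may not suit every program.

```python
from pathlib import Path
from typing import Optional


def size_or_none(path: Path) -> Optional[int]:
    """Return the file size in bytes, or None if it cannot be read.

    Hypothetical helper for illustration; re-raising the exception may be
    the better choice when a missing file is a real error.
    """
    try:
        return path.stat().st_size
    except FileNotFoundError:
        return None  # the path does not exist
    except PermissionError:
        return None  # the path exists but its metadata is not accessible


size = size_or_none(Path("missing.txt"))
print("file does not exist" if size is None else size)
```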

Convert Bytes to Human-Readable Units

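Users often want to see KB, MB, or GB instead of raw bytes. A helper function can do the conversion. The one below divides by 1024 at each step, so its units are binary multiples (strictly KiB, MiB, GiB) rather than powers of 1000: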

python
from pathlib import Path

def format_size(num_bytes: int) -> str:
    """Format a byte count using 1024-based units, for display only."""
    units = ["B", "KB", "MB", "GB", "TB"]
    value = float(num_bytes)

    for unit in units:
        # Stop at the first unit that fits, or at the largest unit available.
        if value < 1024 or unit == units[-1]:
            return f"{value:.2f} {unit}"
        value /= 1024

path = Path("video.mp4")
print(format_size(path.stat().st_size))

This is display logic only. Keep the raw byte count for comparisons and limits.
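As a small illustration of keeping the raw byte count for limits, the sketch below compares st_size directly against a threshold. MAX_UPLOAD_BYTES and within_limit are hypothetical names, and the 10 MiB figure is an arbitrary example.

```python
import tempfile
from pathlib import Path

# Hypothetical limit for illustration: 10 MiB expressed in raw bytes.
MAX_UPLOAD_BYTES = 10 * 1024 * 1024


def within_limit(path: Path, limit: int = MAX_UPLOAD_BYTES) -> bool:
    """Compare the exact byte count against the limit, not a formatted string."""
    return path.stat().st_size <= limit


# Demo on a throwaway 4096-byte file.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "upload.bin"
    demo.write_bytes(b"\0" * 4096)
    print(within_limit(demo))             # True: 4096 bytes is under 10 MiB
    print(within_limit(demo, limit=100))  # False: 4096 bytes exceeds 100
```

Comparing integers avoids the rounding that a formatted string introduces. A file one byte over the limit still displays as the same rounded value as a file exactly at the limit.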

Files and Directories Are Different Problems

st_size on a directory does not tell you the total size of the files inside that directory tree. It describes only the directory entry itself, and the value is platform-dependent.

If you want the size of everything inside a directory, you must walk it:

python
from pathlib import Path

def directory_size(root: Path) -> int:
    """Return the total size in bytes of all files under root."""
    total = 0
    for path in root.rglob("*"):  # every entry below root, recursively
        if path.is_file():        # skip directories
            total += path.stat().st_size
    return total

print(directory_size(Path(".")))

That distinction matters in cleanup tools, storage analyzers, and backup scripts.
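For tools like those, a walk usually also has to decide what to do about symbolic links and files that disappear mid-scan. The following is one possible sketch built on os.walk. The name directory_size_no_links and the policy of skipping links and unreadable entries are assumptions, not the only reasonable choice.

```python
import os
import tempfile
from pathlib import Path


def directory_size_no_links(root: Path) -> int:
    """Sum file sizes under `root` without following symbolic links."""
    total = 0
    # os.walk does not descend into symlinked directories by default.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            entry = Path(dirpath) / name
            if entry.is_symlink():
                continue  # skip links so their targets are not counted
            try:
                total += entry.stat().st_size
            except OSError:
                continue  # file vanished or is unreadable; skip it
    return total


# Demo on a throwaway tree: 10 bytes at the top level, 5 in a subdirectory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "sub").mkdir()
    (root / "a.bin").write_bytes(b"x" * 10)
    (root / "sub" / "b.bin").write_bytes(b"y" * 5)
    print(directory_size_no_links(root))  # 15
```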

pathlib vs os.path

Both are fine, but pathlib tends to produce more readable code when you are already doing other path operations:

python
from pathlib import Path

path = Path("logs") / "app.log"
print(path.exists())
print(path.stat().st_size)

If you are maintaining older code that already uses os.path, there is no need to rewrite everything only for file-size lookup.

For ordinary files, these APIs behave exactly as expected. Symbolic links and special files are less obvious, because you have to decide whether you want the link's own metadata or the target's. Path.stat() and os.path.getsize follow a symbolic link and report on its target, while Path.lstat() reports on the link itself. That matters more in low-level tooling than in everyday scripts, but it is worth remembering when results look odd.
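The difference is easy to see on a throwaway symbolic link. This is a minimal sketch: link_and_target_sizes is a hypothetical helper, and creating symlinks may require extra privileges on some systems (notably Windows), so the demo falls back to a message if it cannot create one.

```python
import tempfile
from pathlib import Path


def link_and_target_sizes(path: Path) -> tuple[int, int]:
    """Return (size of the entry itself, size of what it resolves to)."""
    own = path.lstat().st_size      # does not follow a symbolic link
    resolved = path.stat().st_size  # follows a symbolic link to its target
    return own, resolved


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "data.bin"
    target.write_bytes(b"x" * 1000)
    link = Path(tmp) / "data-link"
    try:
        link.symlink_to(target)
    except (OSError, NotImplementedError):
        print("symlinks are not available on this system")
    else:
        # The second value is 1000; the first is platform-dependent.
        print(link_and_target_sizes(link))
```

For a regular file the two values are identical, so the distinction only shows up when the path is a link.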

Common Pitfalls

  • Forgetting that file size is returned in bytes.
  • Calling st_size on a directory and expecting the total size of everything under it.
  • Ignoring FileNotFoundError when the file may not exist.
  • Repeating stat() calls unnecessarily when one lookup would be enough.
  • Confusing display formatting with the actual stored numeric size.

Summary

  • Use Path.stat().st_size or os.path.getsize to read file size in bytes.
  • pathlib is often the most readable choice in modern Python code.
  • Handle missing files explicitly instead of assuming the path exists.
  • Convert bytes to human-readable units only for display.
  • Directory tree size is a separate problem that requires walking the files recursively.
