Folder size calculation
calculate folder size
measure folder size
folder size estimation
directory size calculation

How can I calculate the size of a folder?

Master System Design with Codemia

Enhance your system design skills with over 120 practice problems, detailed solutions, and hands-on exercises.

Introduction

Calculating the size of a folder usually means summing the sizes of all files inside it, including files in subdirectories. The tricky part is deciding whether you want the logical file size or the actual disk usage, because those are not always the same thing. The answer also depends on whether you are doing it interactively in the shell or programmatically in code.

Logical Size Versus Disk Usage

Before writing code or running a command, it helps to separate two concepts:

  • logical size: the sum of file byte lengths
  • disk usage: how much space the file system actually allocates

These can differ because file systems store data in blocks. A file containing 1 byte might still consume 4096 bytes on disk if the allocation unit is 4 KB.

That is why a folder’s “Size” and “Size on disk” can disagree in a GUI. Both are valid, but they answer slightly different questions.

Calculate Folder Size in Python

If you want the logical size in Python, walk the directory tree and sum file sizes:

python
1from pathlib import Path
2
3def folder_size_bytes(path: str) -> int:
4    total = 0
5    root = Path(path)
6
7    for file_path in root.rglob("*"):
8        if file_path.is_file():
9            total += file_path.stat().st_size
10
11    return total
12
13size = folder_size_bytes("/tmp/example")
14print(size)

This approach is straightforward and works well for scripts and automation. st_size is the logical file size in bytes.

If you want a human-readable result:

python
1def format_bytes(num_bytes: int) -> str:
2    units = ["B", "KB", "MB", "GB", "TB"]
3    value = float(num_bytes)
4
5    for unit in units:
6        if value < 1024 or unit == units[-1]:
7            return f"{value:.2f} {unit}"
8        value /= 1024
9
10print(format_bytes(size))

Shell Command on Linux and macOS

On Unix-like systems, du is usually the best tool:

bash
du -sh /path/to/folder

Useful flags:

  • '-s gives a summary instead of listing every subdirectory'
  • '-h prints human-readable units'

If you want the value in bytes on many Linux systems:

bash
du -sb /path/to/folder

du reports disk usage, not just the sum of logical file sizes. That is often exactly what you want when checking real storage consumption.

PowerShell on Windows

In PowerShell, you can sum file lengths like this:

powershell
$size = (Get-ChildItem "C:\Data" -Recurse -File | Measure-Object -Property Length -Sum).Sum
$size

To print it more readably:

powershell
"{0:N2} MB" -f ($size / 1MB)

This gives you the logical file total, similar to the Python example.

Why Recursive Traversal Matters

A folder can contain subfolders nested many levels deep. If your method only lists the top-level directory, the total will be wrong.

That is why:

  • Python examples use rglob("*")
  • PowerShell uses -Recurse
  • 'du traverses recursively by default'

Any correct folder-size algorithm has to account for the full tree unless you intentionally want just the immediate children.

Real file systems introduce a few complications.

Symbolic links may point outside the folder tree or back into it. If a tool follows symlinks blindly, it can double-count files or loop unexpectedly. The Python example above checks real files under the walked tree and does not manually follow linked targets.

Permission errors are another common issue. A directory walker may fail on subfolders the current user cannot read. A defensive version of the Python function can skip inaccessible paths:

python
1from pathlib import Path
2
3def safe_folder_size_bytes(path: str) -> int:
4    total = 0
5
6    for file_path in Path(path).rglob("*"):
7        try:
8            if file_path.is_file():
9                total += file_path.stat().st_size
10        except OSError:
11            continue
12
13    return total

This keeps the traversal going instead of crashing on one unreadable entry.

Which Method Should You Use

Use du when:

  • you are on Linux or macOS
  • you want actual disk usage quickly
  • you are diagnosing storage consumption

Use Python or PowerShell when:

  • you need the result inside a script
  • you want logical file size
  • you want custom filtering or reporting

The method is less important than being clear about what “size” means for your task.

Common Pitfalls

The biggest pitfall is mixing up logical file size with actual disk usage. A summed byte count from code is not always the same as du.

Another common mistake is forgetting recursive traversal, which undercounts nested folders badly.

Permission issues can also make totals misleading if your script silently skips directories or crashes halfway through the tree. Handle errors deliberately.

Finally, be careful with symbolic links. Depending on the tool and options, they may be counted as links, ignored, or followed to other locations.

Summary

  • Folder size usually means summing all files in the directory tree.
  • Decide whether you need logical size or actual disk usage.
  • Use Python or PowerShell for scripted logical-size calculations.
  • Use du on Unix-like systems for quick disk-usage reporting.
  • Watch for recursion, permissions, and symlink behavior when totals look wrong.

Course illustration
Course illustration

All Rights Reserved.