How can I calculate the size of a folder?
Introduction
Calculating the size of a folder usually means summing the sizes of all files inside it, including files in subdirectories. The tricky part is deciding whether you want the logical file size or the actual disk usage, because those are not always the same thing. The answer also depends on whether you are doing it interactively in the shell or programmatically in code.
Logical Size Versus Disk Usage
Before writing code or running a command, it helps to separate two concepts:
- logical size: the sum of file byte lengths
- disk usage: how much space the file system actually allocates
These can differ because file systems store data in blocks. A file containing 1 byte might still consume 4096 bytes on disk if the allocation unit is 4 KB.
That is why a folder’s “Size” and “Size on disk” can disagree in a GUI. Both are valid, but they answer slightly different questions.
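The difference can be seen for a single file with a short Python sketch. It assumes a Unix-like system, where os.stat exposes st_blocks counted in 512-byte units (the field is not available on Windows); the function name logical_and_allocated is illustrative only.

```python
import os


def logical_and_allocated(path: str) -> tuple[int, int]:
    """Return (logical_bytes, allocated_bytes) for one file."""
    st = os.stat(path)
    logical = st.st_size
    # Assumption: st_blocks is reported in 512-byte units (Linux, macOS).
    allocated = st.st_blocks * 512
    return logical, allocated
```

For a very small file the two numbers often differ, though the exact allocated value depends on the file system.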
Calculate Folder Size in Python
If you want the logical size in Python, walk the directory tree and sum file sizes:
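A minimal sketch of that walk, assuming pathlib's rglob("*") for the traversal. The function name folder_size is illustrative, and skipping symlinks is a choice made in this sketch rather than a requirement:

```python
from pathlib import Path


def folder_size(path: str) -> int:
    """Return the summed logical size, in bytes, of regular files under path."""
    total = 0
    for entry in Path(path).rglob("*"):
        # Skip symlinks so linked targets are not counted,
        # then count regular files only (directories are ignored).
        if not entry.is_symlink() and entry.is_file():
            total += entry.stat().st_size
    return total
```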
This approach is straightforward and works well for scripts and automation. st_size is the logical file size in bytes.
If you want a human-readable result:
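One way to do that is a small formatter. The sketch below assumes binary units (1 KiB = 1024 bytes), and human_readable is a hypothetical helper name:

```python
def human_readable(num_bytes: int) -> str:
    """Format a byte count with binary units, one decimal place."""
    size = float(num_bytes)
    for unit in ("B", "KiB", "MiB", "GiB", "TiB"):
        # Stop at the first unit that keeps the number below 1024,
        # or at the largest unit in the list.
        if size < 1024 or unit == "TiB":
            return f"{size:.1f} {unit}"
        size /= 1024
```

Passing the byte total from a folder walk to this function gives output such as "1.5 KiB" for 1536 bytes.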
Shell Command on Linux and macOS
On Unix-like systems, du is usually the best tool, for example du -sh followed by the folder path.
Useful flags:
- -s gives a summary instead of listing every subdirectory
- -h prints human-readable units
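If you want the value in bytes on many Linux systems, GNU du accepts -b (for example du -sb), which reports apparent size in bytes rather than allocated blocks. The du that ships with macOS does not support -b.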
By default, du reports disk usage, not just the sum of logical file sizes. That is often exactly what you want when checking real storage consumption.
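If you need a comparable figure inside a script, allocated blocks can be summed in Python. This is only a rough sketch under stated assumptions: a Unix-like platform where st_blocks is in 512-byte units, symlinks not followed, and hard links counted once. The name allocated_size is hypothetical, and the result is not guaranteed to match du exactly.

```python
import os


def allocated_size(root: str) -> int:
    """Sum allocated bytes for root and everything below it (Unix-like only)."""
    seen = set()  # (device, inode) pairs, so hard-linked files count once
    total = os.lstat(root).st_blocks * 512
    # os.walk does not descend into symlinked directories by default.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            st = os.lstat(os.path.join(dirpath, name))  # lstat: do not follow links
            key = (st.st_dev, st.st_ino)
            if key in seen:
                continue
            seen.add(key)
            total += st.st_blocks * 512
    return total
```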
PowerShell on Windows
In PowerShell, you can sum file lengths by piping Get-ChildItem -Recurse -File into Measure-Object -Property Length -Sum.
To print it more readably, divide the resulting Sum by a unit constant such as 1MB or 1GB and round the result.
This gives you the logical file total, similar to the Python example.
Why Recursive Traversal Matters
A folder can contain subfolders nested many levels deep. If your method only lists the top-level directory, the total will be wrong.
That is why:
- Python examples use rglob("*")
- PowerShell uses -Recurse
- du traverses recursively by default
Any correct folder-size algorithm has to account for the full tree unless you intentionally want just the immediate children.
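The effect of skipping recursion is easy to demonstrate. The sketch below contrasts a top-level-only sum with a recursive one; both function names are illustrative.

```python
from pathlib import Path


def top_level_size(path: str) -> int:
    """Sum sizes of files directly inside path (no recursion)."""
    return sum(p.stat().st_size for p in Path(path).iterdir() if p.is_file())


def recursive_size(path: str) -> int:
    """Sum sizes of files anywhere under path."""
    return sum(p.stat().st_size for p in Path(path).rglob("*") if p.is_file())
```

For a folder holding a 10-byte file plus a subfolder with a 5-byte file, the first function returns 10 and the second returns 15.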
Symlinks, Permissions, and Other Edge Cases
Real file systems introduce a few complications.
Symbolic links may point outside the folder tree or back into it. If a tool follows symlinks blindly, it can double-count files or loop unexpectedly. The Python example above checks real files under the walked tree and does not manually follow linked targets.
Permission errors are another common issue. A directory walker may fail on subfolders the current user cannot read. A defensive version of the Python function can skip inaccessible paths:
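One possible shape for such a function is sketched below. It uses os.walk rather than rglob so that directory-listing failures can be recorded through the onerror callback. The name safe_folder_size and the choice to return the skipped paths alongside the total are assumptions of this sketch.

```python
import os
import stat


def safe_folder_size(root: str) -> tuple[int, list[str]]:
    """Return (total_bytes, skipped_paths) for regular files under root."""
    skipped: list[str] = []

    def note_error(err: OSError) -> None:
        # os.walk calls this when it cannot list a directory.
        skipped.append(err.filename or str(err))

    total = 0
    for dirpath, _dirnames, filenames in os.walk(root, onerror=note_error):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                st = os.lstat(full)  # lstat: symlinks are not followed
            except OSError:
                skipped.append(full)
                continue
            if stat.S_ISREG(st.st_mode):
                total += st.st_size
    return total, skipped
```

Returning the skipped paths lets the caller decide whether an incomplete total is acceptable.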
This keeps the traversal going instead of crashing on one unreadable entry.
Which Method Should You Use
Use du when:
- you are on Linux or macOS
- you want actual disk usage quickly
- you are diagnosing storage consumption
Use Python or PowerShell when:
- you need the result inside a script
- you want logical file size
- you want custom filtering or reporting
The method is less important than being clear about what “size” means for your task.
Common Pitfalls
The biggest pitfall is mixing up logical file size with actual disk usage. A summed byte count from code is not always the same as du.
Another common mistake is forgetting recursive traversal, which undercounts nested folders badly.
Permission issues can also make totals misleading if your script silently skips directories or crashes halfway through the tree. Handle errors deliberately.
Finally, be careful with symbolic links. Depending on the tool and options, they may be counted as links, ignored, or followed to other locations.
Summary
- Folder size usually means summing all files in the directory tree.
- Decide whether you need logical size or actual disk usage.
- Use Python or PowerShell for scripted logical-size calculations.
- Use du on Unix-like systems for quick disk-usage reporting.
- Watch for recursion, permissions, and symlink behavior when totals look wrong.

