Directory-tree listing in Python
Introduction
Listing a directory tree sounds simple until you need readable output, recursive traversal, and safe handling of missing permissions or symbolic links. Python gives you solid tools for this in both os and pathlib, and the right choice depends on whether you want full control or a cleaner object-oriented API.
Recursive Listing with os.walk
The most common solution is os.walk(). For each directory it visits, it yields a 3-tuple: the path of the current directory, a list of the subdirectory names in it, and a list of the file names in it. Because it walks recursively for you, it is usually the best starting point for command-line utilities and maintenance scripts.
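A minimal sketch of that traversal is shown below. The function name format_tree and the four-space indent are illustrative choices, not part of any library; the depth of each directory is derived from its path relative to the root.

```python
import os


def format_tree(root):
    """Return the tree under root as a list of indented lines."""
    root = os.path.abspath(root)
    lines = []
    for current, dirs, files in os.walk(root):
        # Sort in place so both the traversal order and the output are stable.
        dirs.sort()
        rel = os.path.relpath(current, root)
        # The root itself is depth 0; each path separator adds one level.
        depth = 0 if rel == os.curdir else rel.count(os.sep) + 1
        indent = "    " * depth
        lines.append(f"{indent}{os.path.basename(current)}/")
        for name in sorted(files):
            lines.append(f"{indent}    {name}")
    return lines


if __name__ == "__main__":
    print("\n".join(format_tree(".")))
```

With this layout, the files of a directory are printed before its subdirectories, because os.walk() only reaches the subdirectories on later iterations.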
os.walk() visits the top directory first and then every directory beneath it, but it only hands you names, not a drawn tree. If you want a prettier tree, you need to compute the indentation yourself, typically from each directory's depth relative to the root.
Another useful feature is that you can modify dirs in place to control traversal. That lets you skip heavy or irrelevant folders such as .git, node_modules, or build directories.
Filtering dirs[:] instead of reassigning dirs is the key detail. os.walk() reads that list to decide where to recurse next.
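The pruning technique might look like the sketch below. SKIP_DIRS and walk_pruned are hypothetical names chosen for illustration. Note that pruning only takes effect when os.walk() runs top-down, which is its default.

```python
import os

# Hypothetical skip list; adjust it to the project.
SKIP_DIRS = {".git", "node_modules", "build"}


def walk_pruned(root, skip=SKIP_DIRS):
    """Yield (directory, sorted file names), never descending into skipped folders."""
    for current, dirs, files in os.walk(root):  # topdown=True is the default
        # Slice assignment mutates the very list os.walk() consults next.
        # A plain "dirs = [...]" would only rebind the local name.
        dirs[:] = sorted(d for d in dirs if d not in skip)
        yield current, sorted(files)


if __name__ == "__main__":
    for directory, names in walk_pruned("."):
        print(directory, names)
```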
Using pathlib for Cleaner Code
If you prefer modern path handling, pathlib is easier to read. It does not replace every os.walk() use case, but it makes many tasks simpler, especially when you want Path objects instead of raw strings.
rglob() is excellent when your real goal is "find every file matching a pattern" rather than "manually process directory state at each level." The output is often simpler, and path manipulation becomes much clearer because Path methods like relative_to(), name, and suffix are built in.
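As a small illustration of the pattern-matching case, the sketch below collects matching files relative to the starting directory. The helper name find_files and the default "*.py" pattern are assumptions made for the example.

```python
from pathlib import Path


def find_files(root, pattern="*.py"):
    """Return files under root that match pattern, relative to root, sorted."""
    root = Path(root)
    # rglob() searches root and every directory below it for the pattern.
    return sorted(p.relative_to(root) for p in root.rglob(pattern) if p.is_file())


if __name__ == "__main__":
    for path in find_files("."):
        # name and suffix are plain attributes on every Path object.
        print(path, path.name, path.suffix)
```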
That said, rglob() is not a direct drop-in replacement for a visual tree printer. When you need separate access to directories and files at each level, os.walk() still gives better structure.
Building a Useful Tree Printer
A production-friendly tree listing usually needs a few extras:
- deterministic ordering with sorted()
- exclusion of hidden or generated folders
- graceful handling of permission errors
- optional depth limits for very large trees
Here is a version with a depth limit:
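One way to sketch it, assuming a hypothetical helper named tree_lines with a max_depth parameter and an illustrative skip list. The root counts as depth 0, and directories deeper than max_depth are neither visited nor printed.

```python
import os


def tree_lines(root, max_depth=2, skip=(".git", "node_modules")):
    """Return indented tree lines, descending at most max_depth levels below root."""
    root = os.path.abspath(root)
    lines = []
    for current, dirs, files in os.walk(root):
        rel = os.path.relpath(current, root)
        depth = 0 if rel == os.curdir else rel.count(os.sep) + 1
        # Drop hidden and excluded folders, and keep the order deterministic.
        dirs[:] = sorted(d for d in dirs if d not in skip and not d.startswith("."))
        indent = "    " * depth
        lines.append(f"{indent}{os.path.basename(current)}/")
        for name in sorted(files):
            lines.append(f"{indent}    {name}")
        if depth >= max_depth:
            # Emptying the list stops os.walk() from recursing below this point.
            dirs[:] = []
    return lines


if __name__ == "__main__":
    print("\n".join(tree_lines(".", max_depth=1)))
```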
Limiting depth is important in monorepos or large data directories where a full recursive printout becomes noisy and slow.
Common Pitfalls
One common mistake is assuming os.listdir() is recursive. It is not. It only lists one directory level, so you need additional logic if you expect nested output.
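The difference is easy to see side by side. In this sketch, one_level and all_files are hypothetical helper names: the first returns only what os.listdir() sees, the second recurses with os.walk().

```python
import os


def one_level(root):
    """Names directly inside root; os.listdir() does not descend."""
    return sorted(os.listdir(root))


def all_files(root):
    """Every file under root, relative to root, found by recursing with os.walk()."""
    found = []
    for current, _dirs, files in os.walk(root):
        for name in files:
            found.append(os.path.relpath(os.path.join(current, name), root))
    return sorted(found)


if __name__ == "__main__":
    print(one_level("."))
    print(all_files("."))
```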
Another problem is following symbolic links without realizing it. Recursive traversal can loop or revisit content unexpectedly if your code follows links carelessly. If symbolic links matter in your project, inspect them explicitly with Path.is_symlink() and decide how to handle them.
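A cautious starting point is to report links instead of following them. The sketch below uses a hypothetical helper, list_symlinks. It relies on os.walk() not descending into symlinked directories by default (followlinks=False), although those links still appear in the directory names it yields.

```python
import os
from pathlib import Path


def list_symlinks(root):
    """Return symlinks found under root, without descending into linked directories."""
    links = []
    # followlinks=False is already the default; it is spelled out for clarity.
    for current, dirs, files in os.walk(root, followlinks=False):
        for name in sorted(dirs + files):
            path = Path(current, name)
            if path.is_symlink():
                links.append(path)
    return links


if __name__ == "__main__":
    for link in list_symlinks("."):
        print(link, "->", os.readlink(link))
```

Whether to skip, print, or resolve each link is a per-project decision; the point is to make that decision explicit rather than leave it to the traversal.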
Permission errors are also easy to ignore during local testing. On a development machine, everything may work, but production or CI environments can contain restricted directories. Wrap sensitive operations in try and except PermissionError when you need resilience.
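Two hedged sketches of that resilience follow; both function names are hypothetical. By default os.walk() silently skips directories it cannot list, so the first helper passes an onerror callback to record those failures. The second wraps a single os.listdir() call in try and except PermissionError.

```python
import os


def walk_collecting_errors(root):
    """Walk root, returning (visited directories, errors os.walk() ran into)."""
    errors = []
    visited = []
    # onerror receives the OSError raised while listing a directory.
    for current, dirs, _files in os.walk(root, onerror=errors.append):
        dirs.sort()
        visited.append(current)
    return visited, errors


def safe_listdir(path):
    """List one directory, returning an empty list if access is denied."""
    try:
        return sorted(os.listdir(path))
    except PermissionError:
        return []


if __name__ == "__main__":
    visited, errors = walk_collecting_errors(".")
    print(f"{len(visited)} directories visited, {len(errors)} could not be listed")
```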
Finally, be careful with output formatting. Developers often print directories as they are visited and then print files later, which can create confusing order. Sorting and computing indentation from the relative path keeps the tree predictable.
Summary
- Use os.walk() when you need structured recursive traversal with control over directories and files.
- Use pathlib.Path.rglob() when you mainly want to find matching files with cleaner path handling.
- Modify dirs[:] in os.walk() to skip folders or limit recursion.
- Sort directory and file names so output stays stable across runs.
- Plan for symbolic links, permission errors, and very deep trees before treating a simple script as production-ready.

