Design a File System
Every operating system needs a way to organize data into a hierarchy of files and directories. Before writing any classes, you need to understand what a file system actually does and which responsibilities belong in an object-oriented design versus which belong in the operating system kernel.

What the System Must Do
A file system supports a small set of core operations:
- Create files and directories at a given path
- Delete files and directories (with recursive deletion for non-empty directories)
- Move a file or directory from one location to another
- Get size of a file (its own size) or a directory (the total size of everything inside it)
- List contents of a directory
- Resolve paths like /home/user/docs/report.txt into actual nodes in the tree
These operations reveal the central insight: files and directories are both "things in the file system" but behave differently. A file has content and a fixed size. A directory has no content of its own: it holds other files and directories. Yet both have a name, a parent, and a path. This is the classic setup for the Composite pattern.
Core Entities
From the use cases, you can extract the key classes:
| Entity | Responsibility |
| --- | --- |
| FileSystemNode | Abstract base for anything in the tree: name, parent, path |
| File | A leaf node with size and content |
| Directory | A composite node that holds children |
| FileSystem | The root-level manager that provides the public API |
| Permission | Access control metadata for each node |
| User | Represents who is performing operations |
Why Composite Is the Right Pattern
The Composite pattern lets you treat individual objects and groups of objects through the same interface. When client code calls get_size() on any node, it does not need to know whether that node is a file or a directory. A file returns its own size. A directory sums its children's sizes, and those children might be files or more directories. The recursion handles itself.
This uniform treatment eliminates type-checking conditionals throughout the codebase. Every operation that walks the tree (size calculation, path display, deletion) works the same way regardless of depth or structure.
File systems are one of the most natural examples of the Composite pattern. The part-whole hierarchy is inherent in the domain, not imposed by the design. When the domain itself is recursive, Composite is almost always the right choice.
Thinking Through Delete
Deletion is more nuanced than creation. Deleting a file is straightforward: remove it from its parent's children collection. But deleting a directory raises a question: what if it is not empty? You have two choices:
- Refuse to delete non-empty directories (safe, requires the user to empty it first)
- Recursive delete: remove all children first, then the directory itself
Most real file systems support both (e.g., rmdir fails on non-empty directories, while rm -rf recurses). In your design, recursive delete is another example of the Composite pattern at work: calling delete() on a directory triggers delete() on every child, which triggers delete() on their children, all the way down to the leaves. The recursion terminates when every leaf is removed.
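Both behaviors can be sketched in a few lines. The classes and the delete helper below are illustrative, not from any specific library; the helper refuses non-empty directories unless asked to recurse, mirroring rmdir versus rm -rf:

```python
class Node:
    """Minimal node: just a name and a parent link."""
    def __init__(self, name):
        self.name, self.parent = name, None


class Directory(Node):
    def __init__(self, name):
        super().__init__(name)
        self.children = {}

    def add_child(self, node):
        node.parent = self
        self.children[node.name] = node


def delete(node, recursive=False):
    if isinstance(node, Directory) and node.children:
        if not recursive:
            raise OSError(f"directory '{node.name}' is not empty")  # rmdir behavior
        for child in list(node.children.values()):
            delete(child, recursive=True)  # rm -rf behavior: leaves go first
    if node.parent is not None:
        del node.parent.children[node.name]
```

The recursion bottoms out at leaves (files or empty directories), which are simply unlinked from their parent.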
What We Defer
An OOD interview does not expect you to model disk blocks, journaling, or memory-mapped I/O. Those are operating system internals. Focus on the object relationships and behaviors. You should also defer symbolic links initially: they introduce cycles into the tree, which complicates traversal and deletion. Mention them as an extension point, but do not let them derail the core design.
You should also set aside concurrent access for the initial design. In a real operating system, multiple processes access the file system simultaneously, requiring locks and transaction semantics. For OOD purposes, assume single-threaded access and mention concurrency as an extension point if time permits.
With the requirements clear, you can now define the class hierarchy. The goal is a clean separation between the abstract concept of "something in the file system" and the two concrete forms it takes: files that store data and directories that organize other nodes.

The FileSystemNode Base Class
Every node in the file system shares three properties: a name, a reference to its parent directory, and the ability to compute its absolute path. The base class captures this:
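A minimal Python sketch of the base class (the exact names and signatures are illustrative):

```python
from __future__ import annotations
from abc import ABC, abstractmethod


class FileSystemNode(ABC):
    """Anything that lives in the tree: a file or a directory."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.parent: FileSystemNode | None = None  # set when added to a directory

    def get_path(self) -> str:
        # Walk the parent chain: each node contributes only its own name.
        if self.parent is None:
            return "/"  # the root has no parent and an empty name
        parent_path = self.parent.get_path()
        return parent_path.rstrip("/") + "/" + self.name

    @abstractmethod
    def get_size(self) -> int:
        """Files return their own size; directories sum their children."""
```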
The get_path() method walks up the parent chain: each node only knows its own name and its parent, but by chaining those links, you reconstruct the full path. This is the same principle behind linked list traversal.
The File Class
A file is a leaf node. It has no children. For an OOD design, you model its size as a stored value rather than simulating actual content storage:
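A sketch of the leaf class, assuming the FileSystemNode base from above (a minimal stand-in is inlined so the snippet runs on its own):

```python
class FileSystemNode:
    """Minimal stand-in for the base class sketched earlier."""
    def __init__(self, name: str) -> None:
        self.name = name
        self.parent = None


class File(FileSystemNode):
    """Leaf node: no children, a stored size, and (optionally) content."""

    def __init__(self, name: str, size: int = 0) -> None:
        super().__init__(name)
        self.size = size
        self.content = ""

    def get_size(self) -> int:
        return self.size  # a file's size is just its own size

    def write(self, data: str) -> None:
        # Writing replaces the content and keeps size consistent with it.
        self.content = data
        self.size = len(data)
```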
The Directory Class
A directory is a composite node. It holds children and implements get_size() by summing them recursively:
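A sketch of the composite class (stand-ins for the base class and File are inlined so the snippet is self-contained):

```python
class FileSystemNode:
    """Minimal stand-in for the base class sketched earlier."""
    def __init__(self, name):
        self.name = name
        self.parent = None


class File(FileSystemNode):
    def __init__(self, name, size=0):
        super().__init__(name)
        self.size = size

    def get_size(self):
        return self.size


class Directory(FileSystemNode):
    """Composite node: holds children and aggregates their sizes."""

    def __init__(self, name):
        super().__init__(name)
        self.children = {}  # name -> FileSystemNode, for O(1) lookup by name

    def get_size(self):
        # Recursion: each child may itself be a Directory.
        return sum(child.get_size() for child in self.children.values())

    def add_child(self, node):
        if node.name in self.children:
            raise ValueError(f"'{node.name}' already exists in '{self.name}'")
        node.parent = self
        self.children[node.name] = node
```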
Notice that children is a dictionary keyed by name, not a list. This gives O(1) lookup by name: essential because path resolution navigates one segment at a time, and each segment is a name lookup in the current directory.
A directory must never contain itself as a child, directly or indirectly. The add_child method should validate that the new child is not an ancestor of the current directory. Without this check, get_size() and get_path() would recurse infinitely.
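One way to implement that guard is a walk up the parent chain before linking the new child (the helper name here is ours, not a standard API):

```python
class Node:
    """Minimal node with a parent link, enough to demonstrate the check."""
    def __init__(self, name):
        self.name = name
        self.parent = None


def is_ancestor(candidate, node):
    # True if `candidate` appears anywhere on `node`'s parent chain,
    # including `node` itself (a directory cannot contain itself).
    while node is not None:
        if node is candidate:
            return True
        node = node.parent
    return False
```

add_child would then raise if is_ancestor(new_child, self) returns true, rejecting the cycle before any link is created.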
Path as a Value Object
Paths like /home/user/docs/report.txt appear everywhere in a file system. Rather than passing raw strings around, a Path value object parses the string into segments and provides navigation utilities:
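A sketch of the value object (the property names are illustrative):

```python
class Path:
    """Value object: parses a raw path string into navigable segments."""

    def __init__(self, raw: str) -> None:
        self.raw = raw
        self.is_absolute = raw.startswith("/")
        # Split on '/' and drop empty segments produced by leading,
        # trailing, or doubled slashes.
        self.segments = [s for s in raw.split("/") if s]

    @property
    def name(self) -> str:
        # The last segment is the target's own name.
        return self.segments[-1] if self.segments else "/"

    @property
    def parent_segments(self) -> list:
        # Everything except the last segment locates the parent directory.
        return self.segments[:-1]
```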
This separates parsing logic from tree traversal logic. The resolve method walks the tree one segment at a time, which makes it easy to add . and .. handling later.
Putting It Together
The FileSystem class ties everything into a usable API:
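A sketch of the coordinator, with minimal File and Directory stand-ins inlined (method names like mkdir and create_file are our choices for this sketch):

```python
class Node:
    def __init__(self, name):
        self.name, self.parent = name, None


class File(Node):
    def __init__(self, name, size=0):
        super().__init__(name)
        self.size = size


class Directory(Node):
    def __init__(self, name):
        super().__init__(name)
        self.children = {}

    def add_child(self, node):
        node.parent = self
        self.children[node.name] = node


class FileSystem:
    """Public API: all path-based operations go through here."""

    def __init__(self):
        self.root = Directory("")

    def resolve(self, path):
        # Walk the tree one segment at a time; KeyError means "not found".
        node = self.root
        for segment in (s for s in path.split("/") if s):
            node = node.children[segment]
        return node

    def mkdir(self, parent_path, name):
        self.resolve(parent_path).add_child(Directory(name))

    def create_file(self, parent_path, name, size=0):
        self.resolve(parent_path).add_child(File(name, size))

    def move(self, src_path, dst_dir_path):
        # The cross-cutting action: detach from one directory, attach to another.
        node = self.resolve(src_path)
        del node.parent.children[node.name]
        self.resolve(dst_dir_path).add_child(node)
```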
The move operation shows why FileSystem exists as a coordinator: it removes from one directory and adds to another, a cross-cutting action that neither directory owns alone.
Design Decisions Worth Discussing
Several design choices in this model deserve explicit justification:
Parent references are bidirectional. Each node knows its parent, and each directory knows its children. This costs a bit of memory but makes get_path() possible without searching the entire tree. It also makes move straightforward: update the parent reference and the children dictionary in one step.
Names are unique within a directory. The dictionary enforces this implicitly, but add_child raises an explicit error. This matches real file systems: you cannot have two files with the same name in the same folder.
FileSystem owns path resolution. Rather than letting client code traverse the tree manually, all path-based operations go through FileSystem. This creates a single point for validation, logging, and future permission checks. If you later add caching of recently accessed paths, it happens in one place.
A file system without permissions is a shared notebook with no locks on any drawer. Anyone can read, modify, or delete anything. Real file systems enforce access control so that users can only perform operations they are authorized for. The design challenge is adding this enforcement cleanly, without scattering permission checks throughout every method.

UNIX-Style Permission Model
The most widely used permission model assigns three categories of access to every node:
- Owner, the user who created the file or directory
- Group, a set of users who share access
- Others, everyone else
Each category gets three permission bits: read, write, and execute. For files, read means viewing content, write means modifying it, and execute means running it as a program. For directories, read means listing contents, write means adding or removing children, and execute means traversing through the directory (entering it as part of a path).
The root user bypasses all permission checks. This is a deliberate design decision: system administration requires unrestricted access, and encoding this at the permission level keeps the check logic centralized.
Cascade Checks Along the Path
Accessing /home/user/docs/report.txt requires more than just checking permissions on report.txt. You must verify that the current user has execute permission on every directory in the path: /, home, user, and docs. If any directory along the way denies execute access, the operation fails, even if the target file itself grants full access.
This cascade check is why directories have an execute bit at all. It controls traversal, not execution. Without it, any user who knows a file's full path could access it regardless of the directory structure's intended protection.
The Proxy pattern is a clean way to add permission checks. Wrap each FileSystemNode with a PermissionProxy that checks access before delegating to the real node. This keeps permission logic out of File and Directory entirely: they remain focused on their structural responsibilities.
Protection Proxy Pattern
Rather than adding permission checks inside File.get_size() or Directory.add_child(), you wrap nodes with a proxy that enforces access control:
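A sketch of the proxy, with a simplified UNIX-style Permission class (group handling is omitted here to keep the example short; the class and field names are our choices):

```python
class File:
    """Real subject: knows nothing about permissions."""
    def __init__(self, name, size):
        self.name, self.size = name, size

    def get_size(self):
        return self.size


class Permission:
    """UNIX-style bits, simplified to owner/others (group omitted)."""
    def __init__(self, owner, mode):  # mode like 0o640
        self.owner, self.mode = owner, mode

    def can_read(self, user):
        if user == "root":
            return True  # root bypasses all permission checks
        bits = (self.mode >> 6) if user == self.owner else (self.mode & 0o7)
        return bool(bits & 0o4)  # read is the 4 bit


class PermissionProxy:
    """Same interface as the real node; checks access before delegating."""
    def __init__(self, node, permission, user):
        self._node, self._permission, self._user = node, permission, user

    def get_size(self):
        if not self._permission.can_read(self._user):
            raise PermissionError(f"{self._user} cannot read {self._node.name}")
        return self._node.get_size()
```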
The proxy implements the same interface as the real node. Client code cannot tell the difference: it calls get_size() and either gets the answer or gets a PermissionError. This is the essence of the Proxy pattern: same interface, added control.
The advantage is separation of concerns. File and Directory know nothing about permissions. Permission knows nothing about files. The proxy bridges them. If you later switch from UNIX permissions to access control lists (ACLs), only the Permission class and the proxy change.
Default Permissions and Inheritance
When a new file or directory is created, it needs initial permissions. Two common strategies exist:
Fixed defaults: every new node gets the same permission bits (e.g., rwxr-xr-x for directories, rw-r--r-- for files). This is simple but inflexible.
Inherited from parent: a new node copies its parent directory's permissions. This means files created inside a restricted directory are automatically restricted too. The parent acts as a template.
In practice, most systems use a combination: a system-wide default (called a umask in UNIX) that masks out certain bits, combined with the creating user's identity as the owner. For an OOD interview, mentioning that permissions should be initialized at creation time and that inheritance from the parent is a reasonable default demonstrates depth of thought without overcomplicating the model.
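The umask combination is plain bit masking; a quick illustration with typical UNIX octal values (the variable names are ours):

```python
# System-wide creation defaults before masking (typical UNIX values).
FILE_DEFAULT = 0o666  # rw-rw-rw-
DIR_DEFAULT = 0o777   # rwxrwxrwx

umask = 0o022  # mask out write permission for group and others

# The umask clears bits: any bit set in the mask is removed from the default.
file_mode = FILE_DEFAULT & ~umask  # rw-r--r--
dir_mode = DIR_DEFAULT & ~umask    # rwxr-xr-x

print(oct(file_mode), oct(dir_mode))
```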
A file system is only useful if you can find things in it. Search and traversal are operations that walk the tree, but they differ in purpose: traversal visits every node in some order, while search walks the tree looking for nodes that match specific criteria. Both benefit from patterns that decouple the walking logic from the actions performed at each node.

The Visitor Pattern for Search
When you search a file system, you might look for files by name, by extension, by size threshold, or by modification date. Each criterion is a different operation, but all share the same traversal structure. The Visitor pattern separates the operation from the tree structure.
Each node accepts the visitor:
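A sketch of the accept methods and one concrete visitor (the visitor class name and traversal placement are our choices; some designs put traversal in the visitor instead of the composite):

```python
class File:
    def __init__(self, name, size=0):
        self.name, self.size = name, size

    def accept(self, visitor):
        visitor.visit_file(self)


class Directory:
    def __init__(self, name):
        self.name, self.children = name, {}

    def add_child(self, node):
        self.children[node.name] = node

    def accept(self, visitor):
        visitor.visit_directory(self)
        for child in self.children.values():
            child.accept(visitor)  # recursion drives the traversal


class ExtensionSearchVisitor:
    """Collects files whose names end with the given extension."""
    def __init__(self, extension):
        self.extension = extension
        self.matches = []

    def visit_file(self, file):
        if file.name.endswith(self.extension):
            self.matches.append(file)

    def visit_directory(self, directory):
        pass  # nothing to collect; traversal happens in Directory.accept
```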
Adding a new search criterion means writing a new visitor class. The tree classes never change. This is the Open-Closed Principle in action: you extend behavior by adding new classes, not by modifying existing ones.
DFS vs BFS Traversal
Not all searches benefit from the same traversal order. Two fundamental strategies exist:
Depth-First Search (DFS) goes as deep as possible before backtracking. It follows one branch all the way to the leaves, then moves to the next branch. DFS uses less memory because it only needs to track the current path, not all nodes at the current level.
Breadth-First Search (BFS) visits all nodes at the current depth before going deeper. It explores level by level. BFS is useful when you expect the target to be near the root or when you want to find files at a specific depth.
The Strategy pattern lets you swap between these traversal algorithms at runtime. The search method accepts a traversal strategy, making it configurable without conditionals:
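A sketch of both traversals as generator functions, with the strategy injected as a plain parameter (a lightweight take on the pattern; a class-based Strategy works equally well):

```python
from collections import deque


class Node:
    def __init__(self, name, children=()):
        self.name = name
        self.children = {c.name: c for c in children}


def dfs(root):
    """Depth-first: follow one branch to the leaves before backtracking."""
    stack = [root]
    while stack:
        node = stack.pop()
        yield node
        stack.extend(node.children.values())


def bfs(root):
    """Breadth-first: visit the whole level before going deeper."""
    queue = deque([root])
    while queue:
        node = queue.popleft()
        yield node
        queue.extend(node.children.values())


def search(root, predicate, traversal=dfs):
    # The traversal function is the swappable strategy.
    return [node for node in traversal(root) if predicate(node)]
```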
Path Resolution with Dot Navigation
Real file systems support . (current directory) and .. (parent directory) in paths. An enhanced path resolver handles these:
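A sketch of such a resolver (a standalone function here; in the full design it would live on FileSystem):

```python
class Directory:
    def __init__(self, name):
        self.name, self.parent, self.children = name, None, {}

    def add_child(self, node):
        node.parent = self
        self.children[node.name] = node


def resolve(path, current, root):
    """Resolve an absolute or relative path, handling '.' and '..'."""
    node = root if path.startswith("/") else current
    for segment in path.split("/"):
        if segment in ("", "."):
            continue  # empty segments and '.' stay in place
        if segment == "..":
            # Follow the parent link; the root's '..' is the root itself.
            node = node.parent if node.parent is not None else node
        else:
            node = node.children[segment]  # KeyError means "not found"
    return node
```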
This resolver supports both absolute paths (starting with /) and relative paths (starting from the current directory). The .. segment navigates up by following the parent reference: the same linked-list-style traversal used in get_path(), but in the opposite direction.
Combining Patterns
The real power of this design comes from combining patterns. A single search operation might use:
- Visitor to define what you are looking for (search by extension, size threshold, etc.)
- Iterator (DFS or BFS) to control the order you visit nodes
- Strategy to select the traversal algorithm at runtime
For example, a "find all files larger than 1 MB" operation creates a SizeThresholdVisitor and pairs it with a DFSIterator. A "find the shallowest directory named 'config'" operation creates a NameMatchVisitor and pairs it with a BFSIterator because BFS finds the shallowest match first.
This composability means you do not need a dedicated method for every combination of criteria and traversal order. You build the operation from small, reusable pieces. Each piece is independently testable, and new combinations require no code changes to existing classes.
When to Use Which Traversal
Here is a quick decision guide:
| Scenario | Best Traversal | Reason |
| --- | --- | --- |
| Find a file you know is deeply nested | DFS | Reaches deep nodes quickly |
| Find the shallowest match | BFS | Visits level by level |
| Calculate total tree size | Either | Both visit every node |
| List directory contents at depth N | BFS | Naturally tracks depth |
| Memory-constrained environment | DFS | O(depth) vs O(width) space |
Both DFS and BFS have the same time complexity: both visit every node exactly once, giving O(n). The difference is in space: DFS uses O(d) where d is tree depth, while BFS uses O(w) where w is the maximum width. In a file system with many files per directory but moderate depth, DFS is typically more memory-efficient.
Practice
Build a complete file system and a search engine that traverses it.