Design an In-Memory File System
An in-memory file system is a utility: it provides a service — organizing named data into a navigable tree — without owning domain logic like game rules or business processes. The design question reduces to one structural decision: how do you make a directory and a file interchangeable enough that path traversal, recursive size calculation, and listing all work through a single interface?
That decision is answered by the Composite pattern. A Node base defines the shared
interface — name, size(), is_dir(). A File leaf stores a byte count and returns it
directly. A Directory holds a map of child Node objects and delegates size() recursively
to each child. Every path operation walks the tree one segment at a time, asking each node
whether it is a directory before descending.
See It in Action
The demo below is the finished tree. Click a folder to select it, then add a subfolder or a file
inside it. Each folder reports its recursive size — add a file deep in the tree and watch every
ancestor's total update. That single recursive size() is what the Composite design buys.
Clarifying Requirements
A focused set of questions turns "design a file system" into a concrete spec. The functional
requirements we will commit to: the system maintains a single-rooted tree at
/; directories and files are created individually with explicit paths; listing a directory
returns its immediate children, sorted alphabetically; querying size returns a file's stored
byte count or a directory's total recursive byte count; and duplicate creation is rejected with
a message rather than silently overwriting. The path separator is / and paths are absolute,
starting at root.
The command flow for the four operations:
Core Entities
Scanning the requirements for the nouns that carry state gives two obvious things: a file and a directory. A file holds bytes; a directory holds other entries. The design hinges on how those two relate. One option treats them as unrelated types and branches on which one it is wherever a path is walked or a size is summed. The other gives them a shared base type so a directory can hold files and directories without caring which is which. Decide that before reading on.
Decision checkpoint
Should File and Directory share a common Node base type, or should each operation branch on whether an entry is a file or a directory?
With that settled, the entity list is three things.
A Node is the common ancestor. Every entry in the file system — whether a directory or a file — has a name and can answer two questions: what is its total byte count, and is it a directory? These two behaviors are what make path traversal and recursive size calculation polymorphic — the caller does not need to branch on type.
A File is a leaf node. It stores a fixed byte count set at creation and never changes.
Its size() returns that count directly; its is_dir() returns false. There are no children
and no state to maintain beyond the name and the count.
A Directory is a composite node. It holds a map from child names to Node objects — the
map key is the child's name so lookup is O(1) and sorted listing is a single sort call. Its
size() iterates over all children and sums their size() calls recursively; because each
child answers size() through the same Node interface, the directory does not need to know
whether any given child is a file or another directory. This is the Composite pattern in
action: uniform treatment of leaves and composites through a shared interface.
A quick ownership picture before the class design:
Directory o-- Node (aggregation) captures the key structural fact: a directory owns
zero or more child nodes, and those children are themselves Nodes — files or directories
interchangeably.
Designing the Classes
Each field and method must be traceable to a requirement. Nothing goes in because it "seems useful."
Node
The requirement "every entry has a name" gives name: str. The requirement "size works for
both files and directories" gives size() -> int as an abstract method. The requirement "ls
behaves differently for files and directories" gives is_dir() -> bool as a distinguishing
predicate. Both methods are abstract because Node itself has no concrete implementation —
it only defines the contract.
File
The requirement "touch creates a file with a given size" gives _size: int set at
construction. size() returns _size directly. is_dir() returns False. There is no
other state.
Directory
The requirement "ls returns immediate children sorted" motivates _children: dict[str, Node]
keyed by name — a dict gives O(1) lookup by name and straightforward sorted iteration.
children_sorted() returns sorted(self._children.keys()).
The recursive total for a directory has to combine the sizes of files and nested directories at arbitrary depth. Before reading the implementation, decide who owns that recursion.
Decision checkpoint
A directory's size is the recursive total of everything beneath it. Should the dispatcher walk the tree and add up file sizes, or should Directory.size() sum its children?
size() is the method that demonstrates the Composite pattern most clearly:
def size(self) -> int:
return sum(child.size() for child in self._children.values())
Every child is a Node. If the child is a File, its size() returns a leaf count. If the
child is a Directory, its size() recurses into its own children. The directory never
inspects which type each child is — it just calls size() and adds the result. This is
exactly why the Composite pattern exists: uniform treatment allows recursive algorithms to
work on arbitrarily deep trees with a single line of code.
Path helpers
Two helper functions handle the structural plumbing that the classes themselves should not own.
_traverse(root, path) walks a path from the root, returning the Node at that path or
None if any segment is missing. _parent_dir(root, path) returns the Directory that
contains the final segment — used when creating a new entry to verify the parent exists.
These helpers are thin procedures: they have no state and no place in any class.
Try It Yourself
Before reading the implementation, build it. Implement a function run_fs(instructions) that
processes a sequence of file system commands and returns one result line per command.
Each instruction is a list of strings. The commands and their expected output:
| Command | Output |
|---|---|
["mkdir", path] | "Created dir <path>" — or "<path> exists" if it already exists |
["touch", path, size] | "Created file <path>" — or "<path> exists" / "No such directory" |
["ls", path] | sorted child names space-joined, "(empty)", the filename for a file, or "No such path" |
["size", path] | the byte count as a string, or "No such path" |
Paths are absolute (/a/b/c). Parent directories in a path are assumed to exist when the
tests call touch. Build the Composite tree: Node base, File leaf, Directory composite
with recursive size().