588. Design In-Memory File System

HardDesignTrieHash TableString
Leetcode Link

Problem Description

The given problem asks for the design of a data structure that acts like an in-memory file system. This file system should be able to handle various commands similar to those found in a Unix-style file system. Specifically, the file system should:

  • Initialize with no files or directories upon instantiation.
  • List all the files and directories at a given directory path, returned in lexicographic order.
  • Create new directories given a path, and if necessary, create any intermediate directories along that path.
  • Add content to a file at a given path, creating the file if it does not exist, or appending to it if it does.
  • Read and return the content from a file at a given path.

The system must be able to distinguish between file and directory paths and perform actions accordingly.

Intuition

To effectively simulate a file system, a tree-like data structure works well since a real file system often has a hierarchical structure. The solution uses a Trie data structure, which is a variant of a tree with nodes that represent part of a file or a directory path. Each node may represent a file or a directory and contains links to its child nodes, which represent files or directories within it.

Here’s how the Trie structure helps to address the problem:

  • Directories as Nodes: In the Trie, every directory is represented as a node and each node can have multiple child nodes, representing a directory or a file within it.
  • Files as Terminal Nodes: Files are represented as terminal nodes (end nodes) in the Trie with an isFile flag and contain content.
  • Inserting Directories and Files: When creating a directory or adding content to a file, the file system traverses the Trie based on the path segments. If a path doesn't exist, the necessary nodes are created.
  • Listing Entries: To list the files and directories in a path, a search function locates the node representing the path, and if it's a directory node, it returns the keys of the children map in sorted order.
  • File Content Management: The Trie also facilitates adding to and reading content from a file. When adding content, content nodes are appended. For reading content, all content nodes are concatenated together to form the file's content.

To efficiently read and return file and directory names, the system uses standard Trie operations with certain customizations for handling file content and distinguishing files from directories.

Learn more about Trie patterns.

Not Sure What to Study? Take the 2-min Quiz to Find Your Missing Piece:

What is the best way of checking if an element exists in a sorted array once in terms of time complexity? Select the best that applies.

Solution Approach

The FileSystem is implemented using a nested class [Trie](/problems/trie_intro) that represents the file system tree. Here's a breakdown of how the different parts of the system work:

  • The [Trie](/problems/trie_intro) class is the building block of the file system and represents a node in the filesystem tree. Each node includes:

    • name: the name of the file or directory.
    • isFile: a boolean flag to check if the node is a file.
    • content: a list to store the content of the file. This is used only if the node is a file.
    • children: a dictionary to map the children's names to their corresponding Trie nodes.
  • The insert method is used both to insert directories and files. It takes a file or directory path and a boolean isFile to specify if the path is that of a file. It splits the path into parts and traverses the node structure (or creates nodes if parts of the path do not exist) until it finds the right node for the path. If it's a file insertion, it sets the node's isFile flag to True and saves the name.

  • The search method is employed by other functions to locate a node at the given path. This method again splits the path and follows the nodes until it reaches the target node.

  • FileSystem uses a [Trie](/problems/trie_intro) node, root, to initialize an empty file system.

  • The ls method uses the search function to find the node corresponding to the given path and then:

    • If the node is a file, it returns a list containing only the file's name.
    • If the node is a directory, it returns a list of names of its children nodes, sorted in lexicographic order.
  • mkdir leverages the insert method to create directories. It simply calls the insert method with the path and False for the isFile parameter, which instructs it to only create directories along the path if they don't exist.

  • addContentToFile uses the insert method to either find the existing file node or create a new one with the path, and appends the content to the file's content list.

  • The readContentFromFile method fetches the file node using the search method and then returns the concatenated content from the file's content list as a single string.

This solution employs the Trie data structure to effectively provide the desired file operations with efficient insertion and search operations that are necessary for a file system.

Discover Your Strengths and Weaknesses: Take Our 2-Minute Quiz to Tailor Your Study Plan:

Which of the following is a min heap?

Example Walkthrough

Let's walk through a small example to illustrate how the solution approach works using the following sequence of operations:

  1. Initialize the FileSystem: Start with an empty file system.
  2. Create Directory: Use mkdir to create a directory path /a/b/c.
  3. Add File Content: Use addContentToFile to create a file named x with content hello in directory /a/b/c.
  4. List Contents: Use ls to list contents of the directory /a/b/c.
  5. Read File Content: Use readContentFromFile to read content from the file /a/b/c/x.

Step by step:

  1. When we initialize FileSystem, we create a root node in the Trie which represents our empty directory tree.

  2. We call mkdir("/a/b/c"). The insert method:

    • Splits the path into parts: ["a", "b", "c"].
    • Checks if the root node has a child node a. It doesn't, so it creates a new Trie node a.
    • It repeats the same check for b under a, and c under b, creating any that don't exist.
  3. We then call addContentToFile("/a/b/c/x", "hello"). The insert method:

    • Split the path into parts: ["a", "b", "c", "x"].
    • Traverses down the structure - it finds nodes a, b, and c already exist.
    • It checks if c has a child node x which represents the file. Since x doesn’t exist yet, it creates a new node with isFile set to True.
    • It sets the new node's content to ["hello"].
  4. We call ls("/a/b/c"). The ls method:

    • Uses search to find the node for path ["a", "b", "c"].
    • This node is a directory, so it retrieves its children nodes.
    • It returns the children names in sorted lexicographic order. In this case, it's ["x"] since x is the only entry.
  5. Finally, we call readContentFromFile("/a/b/c/x"). The readContentFromFile method:

    • Calls search to find the node for file path ["a", "b", "c", "x"].
    • Since this is a file node, it retrieves its content list – ["hello"].
    • Returns the content by concatenating the list items into a string hello.

From these operations, we can see how each piece of the Trie helps simulate the necessary file system commands and the manner in which the file and directory nodes are created, searched for, and manipulated.

Solution Implementation

1from typing import List
2
3class TrieNode:
4    def __init__(self):
5        # Initialize a Trie node with the appropriate attributes
6        self.name = None
7        self.is_file = False
8        self.content = []
9        self.children = {}
10  
11    def insert(self, path: str, is_file: bool) -> 'TrieNode':
12        # Insert a path into the Trie and return the final node
13        node = self
14        parts = path.split('/')
15        for part in parts[1:]:  # Skip empty root part
16            if part not in node.children:
17                node.children[part] = TrieNode()
18            node = node.children[part]
19        node.is_file = is_file
20        if is_file:
21            node.name = parts[-1]
22        return node
23  
24    def search(self, path: str) -> 'TrieNode':
25        # Search for a node given a path in the Trie
26        node = self
27        if path == '/':
28            return node
29        parts = path.split('/')
30        for part in parts[1:]: # Skip empty root part
31            if part not in node.children:
32                return None
33            node = node.children[part]
34        return node
35
36
37class FileSystem:
38    def __init__(self):
39        self.root = TrieNode()
40
41    def ls(self, path: str) -> List[str]:
42        # List directory or file
43        node = self.root.search(path)
44        if node is None:
45            return []
46        if node.is_file:
47            # If it's a file, return a list with its name
48            return [node.name]
49        # If it's a directory, return the sorted list of children's names
50        return sorted(node.children.keys())
51
52    def mkdir(self, path: str) -> None:
53        # Create a directory given a path
54        self.root.insert(path, False)
55
56    def addContentToFile(self, filePath: str, content: str) -> None:
57        # Add content to a file, creating the file if it doesn't exist
58        node = self.root.insert(filePath, True)
59        node.content.append(content)
60
61    def readContentFromFile(self, filePath: str) -> str:
62        # Read content from a file
63        node = self.root.search(filePath)
64        if node is None or not node.is_file:
65            raise FileNotFoundError(f"File not found: {filePath}")
66        return ''.join(node.content)
67
1import java.util.ArrayList;
2import java.util.Collections;
3import java.util.HashMap;
4import java.util.List;
5import java.util.Map;
6
7// Definition for a Trie Node which is used internally in FileSystem.
8class TrieNode {
9    String name; // Name of the file or directory
10    boolean isFile; // Flag indicating whether it is a file or not
11    StringBuilder content = new StringBuilder(); // Content of the file if it is a file
12    Map<String, TrieNode> children = new HashMap<>(); // Child nodes, representing files and directories
13
14    // Method to insert a node and return the last node in the path.
15    TrieNode insert(String path, boolean isFile) {
16        TrieNode node = this;
17        String[] parts = path.split("/");
18
19        for (int i = 1; i < parts.length; ++i) {
20            String part = parts[i];
21            if (!node.children.containsKey(part)) {
22                node.children.put(part, new TrieNode());
23            }
24            node = node.children.get(part);
25        }
26
27        node.isFile = isFile;
28        if (isFile) {
29            node.name = parts[parts.length - 1]; // Set the name of the file
30        }
31
32        return node;
33    }
34
35    // Method to search for a node given a path.
36    TrieNode search(String path) {
37        TrieNode node = this;
38        String[] parts = path.split("/");
39
40        for (int i = 1; i < parts.length; ++i) {
41            String part = parts[i];
42            if (!node.children.containsKey(part)) {
43                return null; // Node not found
44            }
45            node = node.children.get(part);
46        }
47
48        return node; // Node found
49    }
50}
51
52// Definition of a FileSystem that uses a TrieNode structure to manage files and directories.
53class FileSystem {
54    private TrieNode root = new TrieNode(); // Root directory of the file system
55
56    // Constructor for the FileSystem
57    public FileSystem() {
58        // The root does not need any initial setup.
59    }
60
61    // List the contents in a directory or return the file if it's just a single file.
62    public List<String> ls(String path) {
63        List<String> result = new ArrayList<>();
64        TrieNode node = root.search(path);
65        if (node == null) {
66            return result; // Empty list if path not found
67        }
68      
69        if (node.isFile) {
70            // If it's a file, add the file name to the result list
71            result.add(node.name);
72        } else {
73            // If it's a directory, add all the child names to the list
74            result.addAll(node.children.keySet());
75        }
76        Collections.sort(result); // Sort the list lexicographically
77        return result;
78    }
79
80    // Make a directory given a path.
81    public void mkdir(String path) {
82        root.insert(path, false);
83    }
84
85    // Add content to the file, create the file if it does not exist.
86    public void addContentToFile(String filePath, String content) {
87        TrieNode node = root.insert(filePath, true);
88        node.content.append(content); // Append the content to the file's content
89    }
90
91    // Read content from a file.
92    public String readContentFromFile(String filePath) {
93        TrieNode node = root.search(filePath);
94        if (node != null && node.isFile) {
95            return node.content.toString(); // Return file content as a String
96        }
97        return ""; // Return empty String if file does not exist or if it's a directory
98    }
99}
100
101/*
102 * Your FileSystem object will be instantiated and called as such:
103 * FileSystem obj = new FileSystem();
104 * List<String> param_1 = obj.ls(path);
105 * obj.mkdir(path);
106 * obj.addContentToFile(filePath, content);
107 * String param_4 = obj.readContentFromFile(filePath);
108 */
109
1#include <vector>
2#include <string>
3#include <unordered_map>
4#include <algorithm>
5
6// Definition for a Trie Node which is used internally in FileSystem.
7class TrieNode {
8public:
9    std::string name; // Name of the file or directory
10    bool isFile; // Flag indicating whether it is a file or not
11    std::string content; // Content of the file if it is a file
12    std::unordered_map<std::string, TrieNode*> children; // Child nodes, representing files and directories
13
14    // Constructor
15    TrieNode() : isFile(false), name("") {}
16
17    // Destructor to clean up dynamic allocations.
18    ~TrieNode() {
19        for (auto child : children) {
20            delete child.second;
21        }
22    }
23
24    // Method to insert a node and return the last node in the path. Creates intermediate directories as needed.
25    TrieNode* insert(const std::string& path, bool fileStatus) {
26        TrieNode* node = this;
27        size_t prevPos = 1; // Start iterating the path from the character following the initial '/'
28        size_t currentPos = path.find('/', prevPos);
29
30        while (currentPos != std::string::npos) {
31            std::string part = path.substr(prevPos, currentPos - prevPos);
32            if (!node->children.count(part)) {
33                node->children[part] = new TrieNode(); // If child does not exist, create it
34            }
35            node = node->children[part];
36            prevPos = currentPos + 1;
37            currentPos = path.find('/', prevPos);
38        }
39
40        // handle the last part of the path
41        std::string part = path.substr(prevPos);
42        if (!node->children.count(part)) {
43            node->children[part] = new TrieNode(); // If child does not exist, create it
44        }
45        node = node->children[part];
46
47        node->isFile = fileStatus;
48        if (fileStatus) {
49            node->name = part; // Set the name of the file
50        }
51
52        return node;
53    }
54
55    // Method to search for a node given a path.
56    TrieNode* search(const std::string& path) {
57        TrieNode* node = this;
58        size_t prevPos = 1; // Start iterating the path from the character following the initial '/'
59        size_t currentPos = path.find('/', prevPos);
60
61        while (currentPos != std::string::npos) {
62            std::string part = path.substr(prevPos, currentPos - prevPos);
63            if (!node->children.count(part)) {
64                return nullptr; // Node not found
65            }
66            node = node->children[part];
67            prevPos = currentPos + 1;
68            currentPos = path.find('/', prevPos);
69        }
70
71        // Handle the last part of the path
72        std::string part = path.substr(prevPos);
73        if (!node->children.count(part)) {
74            return nullptr; // Node not found
75        }
76        return node->children[part]; // Node found
77    }
78};
79
80// Definition of a FileSystem that uses a TrieNode structure to manage files and directories.
81class FileSystem {
82private:
83    TrieNode* root; // Root directory of the file system
84
85public:
86    // Constructor for the FileSystem
87    FileSystem() {
88        root = new TrieNode(); // Initialize root
89    }
90
91    // Destructor to clean up dynamic allocations.
92    ~FileSystem() {
93        delete root;
94    }
95
96    // List the contents in a directory or return the file if it's just a single file.
97    std::vector<std::string> ls(const std::string& path) {
98        std::vector<std::string> result;
99        TrieNode* node = root->search(path);
100        if (!node) {
101            return result; // Empty vector if path not found
102        }
103
104        if (node->isFile) {
105            // If it's a file, add the file name to the result list
106            result.push_back(node->name);
107        } else {
108            // If it's a directory, add all the child names to the list
109            for (auto& child : node->children) {
110                result.push_back(child.first);
111            }
112        }
113        std::sort(result.begin(), result.end()); // Sort the list lexicographically
114        return result;
115    }
116
117    // Make a directory given a path.
118    void mkdir(const std::string& path) {
119        root->insert(path, false);
120    }
121
122    // Add content to the file, create the file if it does not exist.
123    void addContentToFile(const std::string& filePath, const std::string& content) {
124        TrieNode* node = root->insert(filePath, true);
125        node->content += content; // Append the content to the file's content
126    }
127
128    // Read content from a file.
129    std::string readContentFromFile(const std::string& filePath) {
130        TrieNode* node = root->search(filePath);
131        if (node && node->isFile) {
132            return node->content; // Return file content as a String
133        }
134        return ""; // Return empty String if file does not exist or if it's a directory
135    }
136};
137
1// A TypeScript representation of a file system using a Trie data structure
2interface TreeNode {
3    name?: string;
4    isFile: boolean;
5    content: string[];
6    children: Map<string, TreeNode>;
7}
8
9// Initializes the root of file system data structure
10const root: TreeNode = {
11    name: '',
12    isFile: false,
13    content: [],
14    children: new Map<string, TreeNode>()
15};
16
17// Function to insert a node and return the last node in the path.
18function insertNode(path: string, isFile: boolean): TreeNode {
19    let node = root;
20    const parts = path.split('/');
21
22    for (let i = 1; i < parts.length; i++) {
23        const part = parts[i];
24        if (!node.children.has(part)) {
25            node.children.set(part, { isFile: false, content: [], children: new Map() });
26        }
27        node = node.children.get(part) as TreeNode;
28    }
29
30    node.isFile = isFile;
31    if (isFile) {
32        node.name = parts[parts.length - 1];
33    }
34
35    return node;
36}
37
38// Function to search for a node given a path.
39function searchNode(path: string): TreeNode | null {
40    let node = root;
41    const parts = path.split('/');
42
43    for (let i = 1; i < parts.length; i++) {
44        const part = parts[i];
45        if (!node.children.has(part)) {
46            return null;
47        }
48        node = node.children.get(part) as TreeNode;
49    }
50
51    return node;
52}
53
54// Lists the contents in a directory or returns the file name if it's a single file.
55function ls(path: string): string[] {
56    const result: string[] = [];
57    const node = searchNode(path);
58
59    if (node === null) {
60        return result;
61    }
62
63    if (node.isFile) {
64        result.push(node.name as string);
65    } else {
66        for (const childName of node.children.keys()) {
67            result.push(childName);
68        }
69    }
70
71    result.sort();
72    return result;
73}
74
75// Creates a directory given a path.
76function makeDirectory(path: string): void {
77    insertNode(path, false);
78}
79
80// Adds content to the file, or creates the file if it does not exist.
81function addContentToFile(filePath: string, content: string): void {
82    const node = insertNode(filePath, true);
83    node.content.push(content);
84}
85
86// Reads content from a file.
87function readContentFromFile(filePath: string): string {
88    const node = searchNode(filePath);
89
90    if (node !== null && node.isFile) {
91        return node.content.join('');
92    }
93
94    return '';
95}
96
97// Sample usage
98// const fileSystem = new FileSystem();
99// const lsResult: string[] = fileSystem.ls(path);
100// fileSystem.mkdir(path);
101// fileSystem.addContentToFile(filePath, content);
102// const fileContent: string = fileSystem.readContentFromFile(filePath);
103
Not Sure What to Study? Take the 2-min Quiz:

What is an advantages of top-down dynamic programming vs bottom-up dynamic programming?

Time and Space Complexity

Time Complexity

  • For the insert method:

    • The time complexity is O(m) where m is the length of the path (number of directories in the path). Since the path is split into parts and there is a check and potentially an insertion for each part, the time taken is linear w.r.t the number of parts.
  • For the search method:

    • Similarly, the time complexity is O(m) where m is the length of the path. This is because the operation has to potentially traverse that many nodes in the worst-case scenario to find if the path exists in the trie.
  • For the ls method:

    • The time complexity is O(m + nlogn) where m is the length of the path and n is the number of entries (files and directories) in the final directory. This is because it takes O(m) time to search the node and O(nlogn) to sort the keys if it is a directory. If it's a file, it returns immediately, and thus the complexity would be O(m).
  • For the mkdir method:

    • This calls the insert method with isFile set to False, and thus it has the same complexity as insert, which is O(m).
  • For the addContentToFile method:

    • This method calls insert which takes O(m) time, and appends content to the file. The append operation can be considered as O(1) given that Python lists are dynamic arrays. So, the total time complexity is O(m).
  • For the readContentFromFile method:

    • This involves searching the file node which takes O(m) and then concatenating the content which takes O(k) where k is the total length of the content. Thus the time complexity is O(m + k).

Space Complexity

  • The space complexity of both the Trie and FileSystem is related to the number of unique paths stored and the total content.

  • For the Trie class:

    • The space complexity is based on the number of nodes and content of the files. In the worst case, if all paths are unique with no common prefixes, and the paths are of length m, the space complexity would be O(nm) where n is the number of unique paths.
  • For content storage:

    • The total space taken is the sum of the space for storing the content for all files, which can be considered as O(t) where t is the total length of the content across all files.
  • Overall, the combined space complexity for the Trie structure and file content storage is O(nm + t).

Learn more about how to find time and space complexity quickly using problem constraints.

Fast Track Your Learning with Our Quick Skills Quiz:

Which data structure is used to implement priority queue?


Recommended Readings


Got a question? Ask the Teaching Assistant anything you don't understand.

Still not clear? Ask in the Forum,  Discord or Submit the part you don't understand to our editors.


TA 👨‍🏫