133. Clone Graph

Problem Description

You are given a reference to a node in a connected undirected graph. Your task is to create a deep copy (clone) of the entire graph and return the reference to the copied version of the given node.

Each node in the graph has two properties:

  • val: an integer value
  • neighbors: a list of references to its neighbor nodes

The graph structure is defined as:

class Node {
    public int val;
    public List<Node> neighbors;
}

Important details:

  • The graph is connected, meaning you can reach any node from any other node
  • The graph is undirected, so if node A has node B as a neighbor, then node B also has node A as a neighbor
  • For simplicity in test cases, each node's value equals its index (1-indexed). The first node has val = 1, the second has val = 2, and so on
  • The input node will always be the first node with val = 1
  • If the input is null, return null

What is a deep copy?

A deep copy means creating entirely new node objects with the same values and the same neighbor relationships as the original graph. The cloned graph should be completely independent of the original - no nodes in the cloned graph should reference any nodes in the original graph.
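
To make the distinction concrete, here is a minimal sketch in Python (illustrative only, not part of the problem's API) that contrasts a shallow copy, which still points into the original graph, with a hand-built deep copy of a two-node graph:

class Node:
    def __init__(self, val=0, neighbors=None):
        self.val = val
        self.neighbors = neighbors if neighbors is not None else []

# Original two-node graph: 1 -- 2
a, b = Node(1), Node(2)
a.neighbors.append(b)
b.neighbors.append(a)

# Shallow copy: a new entry node that still references the ORIGINAL node b
shallow = Node(a.val, list(a.neighbors))
assert shallow.neighbors[0] is b  # entangled with the original graph

# Deep copy (built by hand here): every node is a brand-new object
a2, b2 = Node(1), Node(2)
a2.neighbors.append(b2)
b2.neighbors.append(a2)
assert a2.neighbors[0] is not b  # fully independent of the original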

Example:

If you have a graph with 4 nodes where:

  • Node 1 connects to nodes 2 and 4
  • Node 2 connects to nodes 1 and 3
  • Node 3 connects to nodes 2 and 4
  • Node 4 connects to nodes 1 and 3

Your function should create 4 new node objects with the same connections, and return the reference to the new Node 1.
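
If you want to experiment with this example locally, here is a hedged sketch of a small helper (build_graph is a hypothetical name, not part of the problem's API) that constructs the 4-node graph above from its 1-indexed adjacency list; any of the cloneGraph implementations later in this article can then be called on the node it returns:

# Reuses the Node class from the sketch above.
def build_graph(adj_list):
    """Build an undirected graph from a 1-indexed adjacency list and
    return the node with val = 1 (or None for an empty list)."""
    if not adj_list:
        return None
    nodes = [Node(i + 1) for i in range(len(adj_list))]
    for i, neighbor_vals in enumerate(adj_list):
        nodes[i].neighbors = [nodes[v - 1] for v in neighbor_vals]
    return nodes[0]

# The example above: edges 1-2, 1-4, 2-3, 3-4
start = build_graph([[2, 4], [1, 3], [2, 4], [1, 3]])
assert start.val == 1 and [n.val for n in start.neighbors] == [2, 4]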

Flowchart Walkthrough

First, let's pin down the algorithm using the Flowchart. Here's a step-by-step walkthrough:

Is it a graph?

  • Yes: The problem explicitly states we're working with an undirected graph where nodes have neighbors, which represents the classic graph structure with nodes and edges.

Is it a tree?

  • No: The problem describes an undirected graph that can have cycles. Trees are acyclic connected graphs, but this problem allows for cycles (nodes can have multiple paths between them).

Is the problem related to directed acyclic graphs?

  • No: The graph is undirected, not directed, and it may contain cycles.

Is the problem related to shortest paths?

  • No: We're not finding shortest paths between nodes. We need to create a complete copy of the graph structure.

Does the problem involve connectivity?

  • No: While the graph is connected, the problem isn't about finding connected components or checking connectivity. It's about traversing and copying the entire graph structure.

Does the problem have small constraints?

  • Yes: The graphs in this problem are small, so a straightforward recursive traversal is fast enough and comfortably fits within typical recursion limits; no special optimization is required.

DFS/backtracking?

  • Yes: We need to traverse the entire graph to create a deep copy, visiting each node and its neighbors. DFS is perfect for this as we need to explore each path completely to clone all connections.

Conclusion: The flowchart suggests using DFS (Depth-First Search) for cloning the graph. DFS allows us to traverse the graph recursively, creating new nodes as we go and maintaining a mapping between original and cloned nodes to handle cycles and already-visited nodes.

Intuition

When we need to create a deep copy of a graph, the main challenge is handling the graph's interconnected nature. Unlike copying a simple array or linked list where we move linearly, a graph has nodes that can reference each other in complex ways, including cycles.

Think about what happens when we try to copy a node: we need to create a new node with the same value, but we also need to copy all its neighbors. However, when we try to copy the neighbors, we might encounter nodes we've already seen or even the original node itself (in case of cycles). This creates two key problems:

  1. Avoiding infinite loops: If node A has neighbor B, and B has neighbor A, we could get stuck copying A → B → A → B forever
  2. Maintaining correct references: We need to ensure that the cloned nodes point to other cloned nodes, not to the original nodes

The solution is to use a hash table to track which nodes we've already cloned. The hash table maps each original node to its corresponding cloned node. This serves two purposes:

  • It tells us if we've already visited and cloned a node (preventing infinite loops)
  • It gives us quick access to the cloned version of any node we've already processed

With this tracking mechanism in place, we can use DFS to traverse the graph:

  1. Start with the given node
  2. If we've already cloned this node (it exists in our hash table), return the clone
  3. If not, create a new node with the same value
  4. Store the mapping (original → clone) in our hash table immediately to handle cycles
  5. Recursively clone all neighbors and add them to the new node's neighbor list
  6. Return the cloned node

The key insight is that we must add the node to our hash table before we process its neighbors. This way, if we encounter this node again while processing its neighbors (due to cycles), we can immediately return the already-created clone instead of creating duplicates or falling into infinite recursion.

This approach ensures that each node in the original graph is cloned exactly once, and all the relationships between nodes are preserved in the cloned graph.
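
Condensed into code, the six steps above look roughly like the following compact sketch (it assumes the Node class from the problem definition; the fully commented implementations appear in the Solution Implementation section):

def clone_graph(node, cloned=None):
    cloned = {} if cloned is None else cloned    # original -> clone mapping
    if node is None:                             # base case
        return None
    if node in cloned:                           # step 2: already cloned
        return cloned[node]
    copy = Node(node.val)                        # step 3: new node, same value
    cloned[node] = copy                          # step 4: record BEFORE neighbors
    for neighbor in node.neighbors:              # step 5: clone neighbors
        copy.neighbors.append(clone_graph(neighbor, cloned))
    return copy                                  # step 6: return the clone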

Solution Approach

The implementation uses a hash table combined with DFS to create a deep copy of the graph. Let's break down the solution step by step:

Data Structure:

  • We use a hash table g (a plain dictionary, named visited_to_cloned in the full implementation below) to store the mapping between original nodes and their cloned counterparts
  • Key: original node reference
  • Value: cloned node reference

DFS Function Implementation:

The core of the solution is the recursive dfs(node) function that returns the cloned version of the input node:

  1. Base Case - Null Check:

    if node is None:
        return None

    If the input node is null, we return null immediately.

  2. Check if Already Cloned:

    if node in g:
        return g[node]

    Before creating a new node, we check if we've already cloned this node. If yes, we return the existing clone from the hash table. This prevents infinite recursion when encountering cycles.

  3. Create Clone and Store Mapping:

    cloned = Node(node.val)
    g[node] = cloned

    We create a new node with the same value as the original and immediately store the mapping in our hash table. This is crucial - we must store the mapping before processing neighbors to handle cycles correctly.

  4. Recursively Clone Neighbors:

    for nxt in node.neighbors:
        cloned.neighbors.append(dfs(nxt))

    For each neighbor of the original node, we recursively call dfs to get its clone and add it to the cloned node's neighbor list. The recursive call will either:

    • Return an already cloned node if we've seen it before
    • Create a new clone if it's a new node
  5. Return the Cloned Node:

    return cloned

Main Function: The main function simply initializes an empty hash table and calls dfs with the input node:

g = {}
return dfs(node)

Time Complexity: O(N + E), where N is the number of nodes and E is the number of edges. We visit each node exactly once and examine each adjacency entry exactly once (each undirected edge appears in two adjacency lists).

Space Complexity: O(N) for the hash table storing the mapping and the recursion stack in the worst case (when the graph is a long chain).

The beauty of this approach is that it handles all the complexities of graph cloning - cycles, self-loops, and multiple references to the same node - elegantly through the combination of memoization (hash table) and recursive traversal (DFS).
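
The same original-to-clone mapping also works iteratively. Below is a hedged sketch of a breadth-first variant (not this article's reference solution; clone_graph_bfs is an illustrative name and the Node class from the problem definition is assumed). It can be handy when the graph is a long chain and recursion depth becomes a concern:

from collections import deque

def clone_graph_bfs(node):
    """Iterative BFS clone: the same original -> clone mapping as the DFS
    solution, but with an explicit queue instead of the call stack."""
    if node is None:
        return None
    clones = {node: Node(node.val)}  # Node: the class from the problem definition
    queue = deque([node])
    while queue:
        current = queue.popleft()
        for neighbor in current.neighbors:
            if neighbor not in clones:
                clones[neighbor] = Node(neighbor.val)
                queue.append(neighbor)
            # Wire the cloned edge; both endpoints exist at this point
            clones[current].neighbors.append(clones[neighbor])
    return clones[node]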

Example Walkthrough

Let's walk through cloning a simple graph with 3 nodes to illustrate the solution approach:

Original Graph:

  • Node 1: neighbors = [2, 3]
  • Node 2: neighbors = [1, 3]
  • Node 3: neighbors = [1, 2]

This forms a triangle where all nodes are connected to each other.

Step-by-step execution:

  1. Start with Node 1

    • Check hash table g: Node 1 not found
    • Create clone1 with val=1
    • Store in hash table: g[node1] = clone1
    • Process neighbors: [2, 3]
  2. Recursively process Node 2 (first neighbor of Node 1)

    • Check hash table: Node 2 not found
    • Create clone2 with val=2
    • Store in hash table: g[node2] = clone2
    • Process Node 2's neighbors: [1, 3]
  3. Process Node 1 (first neighbor of Node 2)

    • Check hash table: Node 1 found! (we stored it in step 1)
    • Return clone1 from hash table
    • Add clone1 to clone2.neighbors
  4. Process Node 3 (second neighbor of Node 2)

    • Check hash table: Node 3 not found
    • Create clone3 with val=3
    • Store in hash table: g[node3] = clone3
    • Process Node 3's neighbors: [1, 2]
  5. Process Node 1 (first neighbor of Node 3)

    • Check hash table: Node 1 found!
    • Return clone1 from hash table
    • Add clone1 to clone3.neighbors
  6. Process Node 2 (second neighbor of Node 3)

    • Check hash table: Node 2 found!
    • Return clone2 from hash table
    • Add clone2 to clone3.neighbors
    • Return clone3 back to step 4
  7. Back to step 4

    • Add clone3 to clone2.neighbors
    • Return clone2 back to step 2
  8. Back to step 2

    • Add clone2 to clone1.neighbors
  9. Process Node 3 (second neighbor of Node 1)

    • Check hash table: Node 3 found! (created in step 4)
    • Return clone3 from hash table
    • Add clone3 to clone1.neighbors
  10. Complete

    • Return clone1

Final Result:

  • clone1: val=1, neighbors=[clone2, clone3]
  • clone2: val=2, neighbors=[clone1, clone3]
  • clone3: val=3, neighbors=[clone1, clone2]

The key insight: When we encounter a node we've already cloned (like Node 1 in step 3), we immediately return the existing clone from the hash table instead of creating a duplicate or falling into infinite recursion. This is why storing the mapping in the hash table before processing neighbors is crucial.
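
As a sanity check (illustrative, not required by the problem), you can walk the original and the clone in lockstep and assert that no node object is shared while values and adjacency match. The verify_deep_copy helper below is hypothetical and assumes the clone preserves neighbor order, which the DFS solution in the next section does:

def verify_deep_copy(original, clone):
    """Traverse both graphs together and check that the clone mirrors the
    original's values and neighbor lists without reusing any node object."""
    seen = {}  # original node -> its corresponding clone
    stack = [(original, clone)]
    while stack:
        orig, cop = stack.pop()
        if orig in seen:
            assert seen[orig] is cop  # one original maps to exactly one clone
            continue
        seen[orig] = cop
        assert cop is not orig and cop.val == orig.val
        assert len(cop.neighbors) == len(orig.neighbors)
        stack.extend(zip(orig.neighbors, cop.neighbors))
    # No cloned node may be an object from the original graph
    assert not set(map(id, seen.values())) & set(map(id, seen.keys()))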

Solution Implementation

1"""
2# Definition for a Node.
3class Node:
4    def __init__(self, val = 0, neighbors = None):
5        self.val = val
6        self.neighbors = neighbors if neighbors is not None else []
7"""
8
9from typing import Optional
10
11
12class Solution:
13    def cloneGraph(self, node: Optional["Node"]) -> Optional["Node"]:
14        """
15        Clone an undirected graph using depth-first search.
16      
17        Args:
18            node: The starting node of the graph to clone
19          
20        Returns:
21            The starting node of the cloned graph
22        """
23      
24        def dfs(current_node: Optional["Node"]) -> Optional["Node"]:
25            """
26            Recursively clone nodes using DFS traversal.
27          
28            Args:
29                current_node: The node being processed
30              
31            Returns:
32                The cloned version of current_node
33            """
34            # Base case: if node is None, return None
35            if current_node is None:
36                return None
37          
38            # If node has already been cloned, return the cloned version
39            if current_node in visited_to_cloned:
40                return visited_to_cloned[current_node]
41          
42            # Create a new node with the same value
43            cloned_node = Node(current_node.val)
44          
45            # Mark this node as visited by storing the mapping
46            visited_to_cloned[current_node] = cloned_node
47          
48            # Recursively clone all neighbors
49            for neighbor in current_node.neighbors:
50                cloned_node.neighbors.append(dfs(neighbor))
51          
52            return cloned_node
53      
54        # Dictionary to map original nodes to their cloned counterparts
55        # This prevents infinite recursion and ensures each node is cloned only once
56        visited_to_cloned = {}
57      
58        # Start the DFS traversal from the given node
59        return dfs(node)
60
/*
// Definition for a Node.
class Node {
    public int val;
    public List<Node> neighbors;
    public Node() {
        val = 0;
        neighbors = new ArrayList<Node>();
    }
    public Node(int _val) {
        val = _val;
        neighbors = new ArrayList<Node>();
    }
    public Node(int _val, ArrayList<Node> _neighbors) {
        val = _val;
        neighbors = _neighbors;
    }
}
*/

import java.util.HashMap;
import java.util.Map;

class Solution {
    // HashMap to store the mapping from original nodes to cloned nodes
    // This prevents infinite loops and ensures each node is cloned only once
    private Map<Node, Node> visitedNodes = new HashMap<>();

    /**
     * Clones an undirected graph given a reference node
     * @param node The starting node of the graph to be cloned
     * @return The starting node of the cloned graph
     */
    public Node cloneGraph(Node node) {
        return depthFirstSearch(node);
    }

    /**
     * Performs depth-first search to clone the graph recursively
     * @param node The current node being processed
     * @return The cloned version of the current node
     */
    private Node depthFirstSearch(Node node) {
        // Base case: if node is null, return null
        if (node == null) {
            return null;
        }

        // Check if this node has already been cloned
        Node clonedNode = visitedNodes.get(node);

        // If the node hasn't been cloned yet, create a new clone
        if (clonedNode == null) {
            // Create a new node with the same value as the original
            clonedNode = new Node(node.val);

            // Mark this node as visited by storing the mapping
            visitedNodes.put(node, clonedNode);

            // Recursively clone all neighbors and add them to the cloned node
            for (Node neighbor : node.neighbors) {
                clonedNode.neighbors.add(depthFirstSearch(neighbor));
            }
        }

        // Return the cloned node
        return clonedNode;
    }
}
/*
// Definition for a Node.
class Node {
public:
    int val;
    vector<Node*> neighbors;
    Node() {
        val = 0;
        neighbors = vector<Node*>();
    }
    Node(int _val) {
        val = _val;
        neighbors = vector<Node*>();
    }
    Node(int _val, vector<Node*> _neighbors) {
        val = _val;
        neighbors = _neighbors;
    }
};
*/

class Solution {
public:
    Node* cloneGraph(Node* node) {
        // Map to store the mapping from original nodes to cloned nodes
        // This prevents infinite recursion and ensures each node is cloned only once
        unordered_map<Node*, Node*> originalToClone;

        // Recursive DFS lambda (uses the C++23 "deducing this" explicit object
        // parameter so the lambda can call itself)
        auto dfs = [&](this auto&& dfs, Node* currentNode) -> Node* {
            // Base case: if node is null, return null
            if (!currentNode) {
                return nullptr;
            }

            // If this node has already been cloned, return the cloned version
            // This handles cycles in the graph
            if (originalToClone.contains(currentNode)) {
                return originalToClone[currentNode];
            }

            // Create a new node with the same value as the original
            Node* clonedNode = new Node(currentNode->val);

            // Store the mapping before processing neighbors to handle cycles
            originalToClone[currentNode] = clonedNode;

            // Recursively clone all neighbors and add them to the cloned node
            for (auto& neighbor : currentNode->neighbors) {
                clonedNode->neighbors.push_back(dfs(neighbor));
            }

            return clonedNode;
        };

        // Start the DFS traversal from the given node
        return dfs(node);
    }
};
/**
 * Definition for _Node.
 * class _Node {
 *     val: number
 *     neighbors: _Node[]
 *
 *     constructor(val?: number, neighbors?: _Node[]) {
 *         this.val = (val===undefined ? 0 : val)
 *         this.neighbors = (neighbors===undefined ? [] : neighbors)
 *     }
 * }
 */

/**
 * Clones an undirected graph using depth-first search
 * @param node - The starting node of the graph to clone
 * @returns A deep copy of the graph starting from the given node
 */
function cloneGraph(node: _Node | null): _Node | null {
    // Map to store the mapping between original nodes and their clones
    // This prevents infinite recursion and ensures each node is cloned only once
    const visitedNodes: Map<_Node, _Node> = new Map();

    /**
     * Recursively clones a node and all its neighbors using DFS
     * @param currentNode - The node to clone
     * @returns The cloned node
     */
    const deepClone = (currentNode: _Node | null): _Node | null => {
        // Base case: if node is null, return null
        if (!currentNode) {
            return null;
        }

        // If node has already been cloned, return the existing clone
        if (visitedNodes.has(currentNode)) {
            return visitedNodes.get(currentNode)!;
        }

        // Create a new node with the same value as the original
        const clonedNode = new _Node(currentNode.val);

        // Mark this node as visited by storing the mapping
        visitedNodes.set(currentNode, clonedNode);

        // Recursively clone all neighbors and add them to the cloned node
        for (const neighbor of currentNode.neighbors) {
            clonedNode.neighbors.push(deepClone(neighbor)!);
        }

        return clonedNode;
    };

    // Start the cloning process from the given node
    return deepClone(node);
}

Time and Space Complexity

Time Complexity: O(n + e), where n is the number of nodes and e is the number of edges in the graph.

The algorithm uses depth-first search (DFS) to traverse the graph. Each node is visited exactly once due to the memoization check (if node in g). For each node visited, we:

  • Create a new node: O(1)
  • Store it in the dictionary: O(1)
  • Iterate through its neighbors: O(degree of node)

Since we visit each node once and process each edge a constant number of times (every edge is encountered while iterating the adjacency lists of its two endpoints), the total time complexity is O(n + e). For sparse graphs, where e grows proportionally with n, this simplifies to O(n).

Space Complexity: O(n), where n is the number of nodes in the graph.

The space is used for:

  • The dictionary g that stores the mapping from original nodes to cloned nodes: O(n)
  • The recursion call stack in the worst case (for a chain-like graph): O(n)
  • The cloned graph itself with all nodes and edges: O(n + e), though the output is usually not counted as auxiliary space

Therefore, the overall auxiliary space complexity is O(n), or O(n + e) if the cloned graph itself is counted.

Common Pitfalls

Pitfall 1: Storing the Mapping After Processing Neighbors

The Problem: A common mistake is to store the node mapping in the hash table after processing its neighbors, like this:

def dfs(node):
    if node is None:
        return None
    if node in visited:
        return visited[node]
  
    cloned = Node(node.val)
    # ❌ WRONG: Processing neighbors before storing the mapping
    for neighbor in node.neighbors:
        cloned.neighbors.append(dfs(neighbor))
  
    visited[node] = cloned  # Too late!
    return cloned

Why This Fails: In graphs with cycles, this leads to infinite recursion. Consider a simple two-node cycle where Node 1 connects to Node 2, and Node 2 connects back to Node 1:

  • When processing Node 1, we recursively call dfs(Node 2)
  • When processing Node 2, we recursively call dfs(Node 1)
  • Since we haven't stored Node 1 in the hash table yet, we start processing it again
  • This creates infinite recursion and causes a stack overflow

The Solution: Always store the mapping immediately after creating the cloned node and before processing neighbors:

def dfs(node):
    if node is None:
        return None
    if node in visited:
        return visited[node]
  
    cloned = Node(node.val)
    visited[node] = cloned  # ✅ CORRECT: Store immediately
  
    for neighbor in node.neighbors:
        cloned.neighbors.append(dfs(neighbor))
  
    return cloned

Pitfall 2: Creating New Nodes for Already Cloned Neighbors

The Problem: Another mistake is creating duplicate clones for the same node when it appears as a neighbor multiple times:

def dfs(node):
    if node is None:
        return None
  
    cloned = Node(node.val)
    # ❌ WRONG: Not checking if neighbors are already cloned
    for neighbor in node.neighbors:
        cloned.neighbors.append(Node(neighbor.val))  # Creates duplicates!
  
    return cloned

Why This Fails: This doesn't create a proper graph structure. If Node 3 is a neighbor of both Node 1 and Node 2, this approach would create two separate Node 3 clones instead of one shared node. The resulting structure wouldn't be a true copy of the original graph.

The Solution: Always use the hash table to check if a node has been cloned before creating a new one:

def dfs(node):
    if node is None:
        return None
    if node in visited:
        return visited[node]  # ✅ Return existing clone
  
    cloned = Node(node.val)
    visited[node] = cloned
  
    for neighbor in node.neighbors:
        cloned.neighbors.append(dfs(neighbor))  # ✅ Recursive call handles duplicates
  
    return cloned

Pitfall 3: Using Node Values as Keys Instead of Node References

The Problem: Using node values as keys in the hash table instead of node references:

visited = {}  # Maps values to cloned nodes

def dfs(node):
    if node is None:
        return None
    # ❌ WRONG: Using value as key
    if node.val in visited:
        return visited[node.val]
  
    cloned = Node(node.val)
    visited[node.val] = cloned  # Using value as key
  
    for neighbor in node.neighbors:
        cloned.neighbors.append(dfs(neighbor))
  
    return cloned

Why This Fails: While the problem states that node values are unique in the test cases, this approach:

  1. Doesn't work if the graph has duplicate values
  2. Makes the code less general and harder to reuse
  3. Doesn't properly track the relationship between original and cloned nodes

The Solution: Always use the actual node object reference as the key:

visited = {}  # Maps node references to cloned nodes

def dfs(node):
    if node is None:
        return None
    # ✅ CORRECT: Using node reference as key
    if node in visited:
        return visited[node]
  
    cloned = Node(node.val)
    visited[node] = cloned  # Using node reference as key
  
    for neighbor in node.neighbors:
        cloned.neighbors.append(dfs(neighbor))
  
    return cloned

This ensures each unique node in the original graph maps to exactly one cloned node, regardless of the node values.
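
To see the difference in behavior, here is a hedged sketch with deliberately duplicated values (something the official test cases never contain). On the path graph a(1) -- b(2) -- c(1), reference-keyed memoization keeps the two val = 1 nodes separate, while value-keyed memoization incorrectly merges them:

class Node:
    def __init__(self, val=0, neighbors=None):
        self.val = val
        self.neighbors = neighbors if neighbors is not None else []

# Path graph a(1) -- b(2) -- c(1), where a and c share the same value
a, b, c = Node(1), Node(2), Node(1)
a.neighbors, b.neighbors, c.neighbors = [b], [a, c], [b]

def clone_by_reference(node, visited=None):
    # Correct: keys are node objects
    if visited is None:
        visited = {}
    if node in visited:
        return visited[node]
    copy = Node(node.val)
    visited[node] = copy
    copy.neighbors = [clone_by_reference(n, visited) for n in node.neighbors]
    return copy

def clone_by_value(node, visited=None):
    # Pitfall: keys are node values, so a and c collapse into one clone
    if visited is None:
        visited = {}
    if node.val in visited:
        return visited[node.val]
    copy = Node(node.val)
    visited[node.val] = copy
    copy.neighbors = [clone_by_value(n, visited) for n in node.neighbors]
    return copy

good = clone_by_reference(a)
assert good.neighbors[0].neighbors[1] is not good  # c stays a distinct clone

bad = clone_by_value(a)
assert bad.neighbors[0].neighbors[1] is bad  # a and c were merged: wrong structure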
