133. Clone Graph
Problem Description
You are given a reference to a node in a connected undirected graph. Your task is to create a deep copy (clone) of the entire graph and return the reference to the copied version of the given node.
Each node in the graph has two properties:
- val: an integer value
- neighbors: a list of references to its neighbor nodes
The graph structure is defined as:
class Node {
public int val;
public List<Node> neighbors;
}
Important details:
- The graph is connected, meaning you can reach any node from any other node
- The graph is undirected, so if node A has node B as a neighbor, then node B also has node A as a neighbor
- For simplicity in test cases, each node's value equals its index (1-indexed). The first node has val = 1, the second has val = 2, and so on
- The input node will always be the first node with val = 1
- If the input is null, return null
What is a deep copy?
A deep copy means creating entirely new node objects with the same values and the same neighbor relationships as the original graph. The cloned graph should be completely independent of the original - no nodes in the cloned graph should reference any nodes in the original graph.
Example:
If you have a graph with 4 nodes where:
- Node 1 connects to nodes 2 and 4
- Node 2 connects to nodes 1 and 3
- Node 3 connects to nodes 2 and 4
- Node 4 connects to nodes 1 and 3
Your function should create 4 new node objects with the same connections, and return the reference to the new Node 1.
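To make the example concrete, here is a small Python sketch (an illustration only; it assumes the Node class from the problem statement, with a default-empty neighbors list) that builds this 4-node graph and prints its adjacency:

# Sketch only: build the 4-node example graph described above.
class Node:
    def __init__(self, val=0, neighbors=None):
        self.val = val
        self.neighbors = neighbors if neighbors is not None else []

n1, n2, n3, n4 = Node(1), Node(2), Node(3), Node(4)
n1.neighbors = [n2, n4]   # Node 1 connects to nodes 2 and 4
n2.neighbors = [n1, n3]   # Node 2 connects to nodes 1 and 3
n3.neighbors = [n2, n4]   # Node 3 connects to nodes 2 and 4
n4.neighbors = [n1, n3]   # Node 4 connects to nodes 1 and 3

for node in (n1, n2, n3, n4):
    print(node.val, "->", [nb.val for nb in node.neighbors])
# 1 -> [2, 4]
# 2 -> [1, 3]
# 3 -> [2, 4]
# 4 -> [1, 3]

A deep copy of this graph must reproduce exactly this adjacency using four brand-new Node objects.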
Flowchart Walkthrough
First, let's pin down the algorithm using the Flowchart. Here's a step-by-step walkthrough:
Is it a graph?
- Yes: The problem explicitly states we're working with an undirected graph where nodes have neighbors, which represents the classic graph structure with nodes and edges.
Is it a tree?
- No: The problem describes an undirected graph that can have cycles. Trees are acyclic connected graphs, but this problem allows for cycles (nodes can have multiple paths between them).
Is the problem related to directed acyclic graphs?
- No: The graph is undirected, not directed, and it may contain cycles.
Is the problem related to shortest paths?
- No: We're not finding shortest paths between nodes. We need to create a complete copy of the graph structure.
Does the problem involve connectivity?
- No: While the graph is connected, the problem isn't about finding connected components or checking connectivity. It's about traversing and copying the entire graph structure.
Does the problem have small constraints?
- Yes: Graph cloning typically involves graphs with reasonable sizes that can be handled with recursive approaches. The problem doesn't specify huge constraints that would require optimization.
DFS/backtracking?
- Yes: We need to traverse the entire graph to create a deep copy, visiting each node and its neighbors. DFS is perfect for this as we need to explore each path completely to clone all connections.
Conclusion: The flowchart suggests using DFS (Depth-First Search) for cloning the graph. DFS allows us to traverse the graph recursively, creating new nodes as we go and maintaining a mapping between original and cloned nodes to handle cycles and already-visited nodes.
Intuition
When we need to create a deep copy of a graph, the main challenge is handling the graph's interconnected nature. Unlike copying a simple array or linked list where we move linearly, a graph has nodes that can reference each other in complex ways, including cycles.
Think about what happens when we try to copy a node: we need to create a new node with the same value, but we also need to copy all its neighbors. However, when we try to copy the neighbors, we might encounter nodes we've already seen or even the original node itself (in case of cycles). This creates two key problems:
- Avoiding infinite loops: If node A has neighbor B, and B has neighbor A, we could get stuck copying A → B → A → B forever
- Maintaining correct references: We need to ensure that the cloned nodes point to other cloned nodes, not to the original nodes
The solution is to use a hash table to track which nodes we've already cloned. The hash table maps each original node to its corresponding cloned node. This serves two purposes:
- It tells us if we've already visited and cloned a node (preventing infinite loops)
- It gives us quick access to the cloned version of any node we've already processed
With this tracking mechanism in place, we can use DFS to traverse the graph:
- Start with the given node
- If we've already cloned this node (it exists in our hash table), return the clone
- If not, create a new node with the same value
- Store the mapping (original → clone) in our hash table immediately to handle cycles
- Recursively clone all neighbors and add them to the new node's neighbor list
- Return the cloned node
The key insight is that we must add the node to our hash table before we process its neighbors. This way, if we encounter this node again while processing its neighbors (due to cycles), we can immediately return the already-created clone instead of creating duplicates or falling into infinite recursion.
This approach ensures that each node in the original graph is cloned exactly once, and all the relationships between nodes are preserved in the cloned graph.
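In code, the intuition boils down to roughly the following minimal Python sketch (the fully commented implementations appear in the Solution Implementation section below; clone_map here plays the role of the hash table discussed above, and the Node class is assumed from the problem statement):

def clone(node, clone_map=None):
    if clone_map is None:
        clone_map = {}                     # maps original node -> its clone
    if node is None:
        return None
    if node in clone_map:                  # already cloned: reuse it (handles cycles)
        return clone_map[node]
    copy = Node(node.val)                  # new node with the same value
    clone_map[node] = copy                 # record the mapping BEFORE recursing
    for neighbor in node.neighbors:        # clone each neighbor recursively
        copy.neighbors.append(clone(neighbor, clone_map))
    return copy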
Solution Approach
The implementation uses a hash table combined with DFS to create a deep copy of the graph. Let's break down the solution step by step:
Data Structure:
- We use a hash table g (a plain dictionary; named visited_to_cloned in the Python implementation below) to store the mapping between original nodes and their cloned counterparts
- Key: original node reference
- Value: cloned node reference
DFS Function Implementation:
The core of the solution is the recursive dfs(node) function that returns the cloned version of the input node:
- Base Case - Null Check: if node is None: return None
If the input node is null, we return null immediately.
- Check if Already Cloned: if node in g: return g[node]
Before creating a new node, we check if we've already cloned this node. If yes, we return the existing clone from the hash table. This prevents infinite recursion when encountering cycles.
- Create Clone and Store Mapping: cloned = Node(node.val); g[node] = cloned
We create a new node with the same value as the original and immediately store the mapping in our hash table. This is crucial - we must store the mapping before processing neighbors to handle cycles correctly.
- Recursively Clone Neighbors: for nxt in node.neighbors: cloned.neighbors.append(dfs(nxt))
For each neighbor of the original node, we recursively call dfs to get its clone and add it to the cloned node's neighbor list. The recursive call will either:
- Return an already cloned node if we've seen it before
- Create a new clone if it's a new node
- Return the Cloned Node: return cloned
Main Function:
The main function simply initializes an empty hash table and calls dfs with the input node:
g = {}
return dfs(node)
Time Complexity: O(N + E), where N is the number of nodes and E is the number of edges. We visit each node once and traverse each edge once.
Space Complexity: O(N) for the hash table storing the mapping and for the recursion stack in the worst case (when the graph is a long chain).
The beauty of this approach is that it handles all the complexities of graph cloning - cycles, self-loops, and multiple references to the same node - elegantly through the combination of memoization (hash table) and recursive traversal (DFS).
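As a sanity check on any implementation, the following sketch walks the original and the clone in parallel (a queue of node pairs) and asserts that values match, degrees match, and no original object leaks into the clone. The helper name verify_clone and its traversal are illustrative additions, not part of the problem statement; it assumes the Node class from the problem and that the clone preserves neighbor order, as the DFS above does:

from collections import deque

def verify_clone(original, clone):
    # Walk both graphs in lockstep (assumes neighbor order is preserved by the clone).
    if original is None:
        return clone is None
    pair_for = {id(original): clone}     # id(original node) -> its clone
    queue = deque([(original, clone)])
    while queue:
        orig, copy = queue.popleft()
        assert copy is not orig                       # a new object...
        assert copy.val == orig.val                   # ...with the same value
        assert len(copy.neighbors) == len(orig.neighbors)
        for o_nb, c_nb in zip(orig.neighbors, copy.neighbors):
            assert c_nb is not o_nb                   # the clone never references originals
            if id(o_nb) not in pair_for:
                pair_for[id(o_nb)] = c_nb
                queue.append((o_nb, c_nb))
            else:
                assert pair_for[id(o_nb)] is c_nb     # shared neighbors map to one clone
    return True

Running verify_clone(node, Solution().cloneGraph(node)) on a connected test graph should return True.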
Example Walkthrough
Let's walk through cloning a simple graph with 3 nodes to illustrate the solution approach:
Original Graph:
- Node 1: neighbors = [2, 3]
- Node 2: neighbors = [1, 3]
- Node 3: neighbors = [1, 2]
This forms a triangle where all nodes are connected to each other.
Step-by-step execution:
1. Start with Node 1
- Check hash table g: Node 1 not found
- Create clone1 with val=1
- Store in hash table: g[node1] = clone1
- Process neighbors: [2, 3]
2. Recursively process Node 2 (first neighbor of Node 1)
- Check hash table: Node 2 not found
- Create clone2 with val=2
- Store in hash table: g[node2] = clone2
- Process Node 2's neighbors: [1, 3]
3. Process Node 1 (first neighbor of Node 2)
- Check hash table: Node 1 found! (we stored it in step 1)
- Return clone1 from hash table
- Add clone1 to clone2.neighbors
4. Process Node 3 (second neighbor of Node 2)
- Check hash table: Node 3 not found
- Create clone3 with val=3
- Store in hash table: g[node3] = clone3
- Process Node 3's neighbors: [1, 2]
5. Process Node 1 (first neighbor of Node 3)
- Check hash table: Node 1 found!
- Return clone1 from hash table
- Add clone1 to clone3.neighbors
6. Process Node 2 (second neighbor of Node 3)
- Check hash table: Node 2 found!
- Return clone2 from hash table
- Add clone2 to clone3.neighbors
- Node 3's neighbors are done, so clone3 is returned to step 2 (which called dfs on Node 3)
7. Back in step 2 (Node 2's neighbor loop)
- Add clone3 to clone2.neighbors
- Node 2's neighbors are done, so clone2 is returned to step 1 (which called dfs on Node 2)
8. Back in step 1 (Node 1's neighbor loop)
- Add clone2 to clone1.neighbors
9. Process Node 3 (second neighbor of Node 1)
- Check hash table: Node 3 found! (created in step 4)
- Return clone3 from hash table
- Add clone3 to clone1.neighbors
10. Complete
- Return clone1
Final Result:
- clone1: val=1, neighbors=[clone2, clone3]
- clone2: val=2, neighbors=[clone1, clone3]
- clone3: val=3, neighbors=[clone1, clone2]
The key insight: When we encounter a node we've already cloned (like Node 1 in step 3), we immediately return the existing clone from the hash table instead of creating a duplicate or falling into infinite recursion. This is why storing the mapping in the hash table before processing neighbors is crucial.
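You can reproduce this walkthrough with a few lines of Python (a sketch that assumes the Node class from the problem statement and the Solution class defined in the next section):

# Build the triangle graph from the walkthrough.
node1, node2, node3 = Node(1), Node(2), Node(3)
node1.neighbors = [node2, node3]
node2.neighbors = [node1, node3]
node3.neighbors = [node1, node2]

clone1 = Solution().cloneGraph(node1)

for clone in (clone1, clone1.neighbors[0], clone1.neighbors[1]):
    print(f"clone of {clone.val}: neighbors = {[nb.val for nb in clone.neighbors]}")
# clone of 1: neighbors = [2, 3]
# clone of 2: neighbors = [1, 3]
# clone of 3: neighbors = [1, 2]

# No object is shared with the original graph:
print(clone1 is node1)  # False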
Solution Implementation
1"""
2# Definition for a Node.
3class Node:
4 def __init__(self, val = 0, neighbors = None):
5 self.val = val
6 self.neighbors = neighbors if neighbors is not None else []
7"""
8
9from typing import Optional
10
11
12class Solution:
13 def cloneGraph(self, node: Optional["Node"]) -> Optional["Node"]:
14 """
15 Clone an undirected graph using depth-first search.
16
17 Args:
18 node: The starting node of the graph to clone
19
20 Returns:
21 The starting node of the cloned graph
22 """
23
24 def dfs(current_node: Optional["Node"]) -> Optional["Node"]:
25 """
26 Recursively clone nodes using DFS traversal.
27
28 Args:
29 current_node: The node being processed
30
31 Returns:
32 The cloned version of current_node
33 """
34 # Base case: if node is None, return None
35 if current_node is None:
36 return None
37
38 # If node has already been cloned, return the cloned version
39 if current_node in visited_to_cloned:
40 return visited_to_cloned[current_node]
41
42 # Create a new node with the same value
43 cloned_node = Node(current_node.val)
44
45 # Mark this node as visited by storing the mapping
46 visited_to_cloned[current_node] = cloned_node
47
48 # Recursively clone all neighbors
49 for neighbor in current_node.neighbors:
50 cloned_node.neighbors.append(dfs(neighbor))
51
52 return cloned_node
53
54 # Dictionary to map original nodes to their cloned counterparts
55 # This prevents infinite recursion and ensures each node is cloned only once
56 visited_to_cloned = {}
57
58 # Start the DFS traversal from the given node
59 return dfs(node)
60
Java:

/*
// Definition for a Node.
class Node {
    public int val;
    public List<Node> neighbors;
    public Node() {
        val = 0;
        neighbors = new ArrayList<Node>();
    }
    public Node(int _val) {
        val = _val;
        neighbors = new ArrayList<Node>();
    }
    public Node(int _val, ArrayList<Node> _neighbors) {
        val = _val;
        neighbors = _neighbors;
    }
}
*/

class Solution {
    // HashMap to store the mapping from original nodes to cloned nodes
    // This prevents infinite loops and ensures each node is cloned only once
    private Map<Node, Node> visitedNodes = new HashMap<>();

    /**
     * Clones an undirected graph given a reference node
     * @param node The starting node of the graph to be cloned
     * @return The starting node of the cloned graph
     */
    public Node cloneGraph(Node node) {
        return depthFirstSearch(node);
    }

    /**
     * Performs depth-first search to clone the graph recursively
     * @param node The current node being processed
     * @return The cloned version of the current node
     */
    private Node depthFirstSearch(Node node) {
        // Base case: if node is null, return null
        if (node == null) {
            return null;
        }

        // Check if this node has already been cloned
        Node clonedNode = visitedNodes.get(node);

        // If the node hasn't been cloned yet, create a new clone
        if (clonedNode == null) {
            // Create a new node with the same value as the original
            clonedNode = new Node(node.val);

            // Mark this node as visited by storing the mapping
            visitedNodes.put(node, clonedNode);

            // Recursively clone all neighbors and add them to the cloned node
            for (Node neighbor : node.neighbors) {
                clonedNode.neighbors.add(depthFirstSearch(neighbor));
            }
        }

        // Return the cloned node
        return clonedNode;
    }
}
C++:

/*
// Definition for a Node.
class Node {
public:
    int val;
    vector<Node*> neighbors;
    Node() {
        val = 0;
        neighbors = vector<Node*>();
    }
    Node(int _val) {
        val = _val;
        neighbors = vector<Node*>();
    }
    Node(int _val, vector<Node*> _neighbors) {
        val = _val;
        neighbors = _neighbors;
    }
};
*/

class Solution {
public:
    Node* cloneGraph(Node* node) {
        // Map to store the mapping from original nodes to cloned nodes
        // This prevents infinite recursion and ensures each node is cloned only once
        unordered_map<Node*, Node*> originalToClone;

        // Recursive DFS function to clone the graph
        auto dfs = [&](this auto&& dfs, Node* currentNode) -> Node* {
            // Base case: if node is null, return null
            if (!currentNode) {
                return nullptr;
            }

            // If this node has already been cloned, return the cloned version
            // This handles cycles in the graph
            if (originalToClone.contains(currentNode)) {
                return originalToClone[currentNode];
            }

            // Create a new node with the same value as the original
            Node* clonedNode = new Node(currentNode->val);

            // Store the mapping before processing neighbors to handle cycles
            originalToClone[currentNode] = clonedNode;

            // Recursively clone all neighbors and add them to the cloned node
            for (auto& neighbor : currentNode->neighbors) {
                clonedNode->neighbors.push_back(dfs(neighbor));
            }

            return clonedNode;
        };

        // Start the DFS traversal from the given node
        return dfs(node);
    }
};
TypeScript:

/**
 * Definition for _Node.
 * class _Node {
 *     val: number
 *     neighbors: _Node[]
 *
 *     constructor(val?: number, neighbors?: _Node[]) {
 *         this.val = (val===undefined ? 0 : val)
 *         this.neighbors = (neighbors===undefined ? [] : neighbors)
 *     }
 * }
 */

/**
 * Clones an undirected graph using depth-first search
 * @param node - The starting node of the graph to clone
 * @returns A deep copy of the graph starting from the given node
 */
function cloneGraph(node: _Node | null): _Node | null {
    // Map to store the mapping between original nodes and their clones
    // This prevents infinite recursion and ensures each node is cloned only once
    const visitedNodes: Map<_Node, _Node> = new Map();

    /**
     * Recursively clones a node and all its neighbors using DFS
     * @param currentNode - The node to clone
     * @returns The cloned node
     */
    const deepClone = (currentNode: _Node | null): _Node | null => {
        // Base case: if node is null, return null
        if (!currentNode) {
            return null;
        }

        // If node has already been cloned, return the existing clone
        if (visitedNodes.has(currentNode)) {
            return visitedNodes.get(currentNode)!;
        }

        // Create a new node with the same value as the original
        const clonedNode = new _Node(currentNode.val);

        // Mark this node as visited by storing the mapping
        visitedNodes.set(currentNode, clonedNode);

        // Recursively clone all neighbors and add them to the cloned node
        for (const neighbor of currentNode.neighbors) {
            clonedNode.neighbors.push(deepClone(neighbor)!);
        }

        return clonedNode;
    };

    // Start the cloning process from the given node
    return deepClone(node);
}
Time and Space Complexity
Time Complexity: O(n + e), where n is the number of nodes and e is the number of edges in the graph.
The algorithm uses depth-first search (DFS) to traverse the graph. Each node is visited exactly once due to the memoization check (if node in g). For each node visited, we:
- Create a new node: O(1)
- Store it in the dictionary: O(1)
- Iterate through its neighbors: O(degree of node)
Since we visit each node once and process each edge exactly once (each edge is traversed when processing the neighbors), the total time complexity is O(n + e). For sparse graphs, where the number of edges is proportional to the number of nodes, this is effectively O(n).
Space Complexity: O(n) of auxiliary space, where n is the number of nodes in the graph.
The space is used for:
- The dictionary g that stores the mapping from original nodes to cloned nodes: O(n)
- The recursion call stack in the worst case (for a chain-like graph): O(n)
- The cloned graph itself with all nodes and edges: O(n + e), but this is the required output rather than extra working space
Therefore, the auxiliary space complexity is O(n); counting the output clone as well, it is O(n + e).
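Because the recursion stack can reach O(n) on a chain-like graph, a common variant runs the same DFS with an explicit stack. The following is a sketch of that alternative (not the implementation shown above), assuming the Node class from the problem statement; it keeps the same original-to-clone hash map and the same O(n + e) time:

def clone_graph_iterative(node):
    if node is None:
        return None
    clone_of = {node: Node(node.val)}     # original node -> its clone
    stack = [node]                        # explicit DFS stack instead of recursion
    while stack:
        current = stack.pop()
        for neighbor in current.neighbors:
            if neighbor not in clone_of:  # first time we see this neighbor: clone it
                clone_of[neighbor] = Node(neighbor.val)
                stack.append(neighbor)
            # Wire the cloned edge; each node's neighbor list is built exactly once,
            # when that node is popped from the stack.
            clone_of[current].neighbors.append(clone_of[neighbor])
    return clone_of[node]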
Common Pitfalls
Pitfall 1: Storing the Mapping After Processing Neighbors
The Problem: A common mistake is to store the node mapping in the hash table after processing its neighbors, like this:
def dfs(node):
    if node is None:
        return None
    if node in visited:
        return visited[node]
    cloned = Node(node.val)
    # ❌ WRONG: Processing neighbors before storing the mapping
    for neighbor in node.neighbors:
        cloned.neighbors.append(dfs(neighbor))
    visited[node] = cloned  # Too late!
    return cloned
Why This Fails: In graphs with cycles, this leads to infinite recursion. Consider a simple two-node cycle where Node 1 connects to Node 2, and Node 2 connects back to Node 1:
- When processing Node 1, we recursively call
dfs(Node 2)
- When processing Node 2, we recursively call
dfs(Node 1)
- Since we haven't stored Node 1 in the hash table yet, we start processing it again
- This creates infinite recursion and causes a stack overflow
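To see the failure concretely, this small sketch (assuming the Node class from the problem statement; buggy_dfs mirrors the wrong version above) builds that two-node cycle and triggers the runaway recursion, which Python surfaces as a RecursionError:

visited = {}

def buggy_dfs(node):
    if node is None:
        return None
    if node in visited:
        return visited[node]
    cloned = Node(node.val)
    for neighbor in node.neighbors:              # recursing first...
        cloned.neighbors.append(buggy_dfs(neighbor))
    visited[node] = cloned                       # ...and storing the mapping too late
    return cloned

a, b = Node(1), Node(2)
a.neighbors, b.neighbors = [b], [a]              # two-node cycle: 1 <-> 2

try:
    buggy_dfs(a)
except RecursionError:
    print("Stack overflow: neither node ever reaches the hash table in time")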
The Solution: Always store the mapping immediately after creating the cloned node and before processing neighbors:
def dfs(node):
    if node is None:
        return None
    if node in visited:
        return visited[node]
    cloned = Node(node.val)
    visited[node] = cloned  # ✅ CORRECT: Store immediately
    for neighbor in node.neighbors:
        cloned.neighbors.append(dfs(neighbor))
    return cloned
Pitfall 2: Creating New Nodes for Already Cloned Neighbors
The Problem: Another mistake is creating duplicate clones for the same node when it appears as a neighbor multiple times:
def dfs(node):
    if node is None:
        return None
    cloned = Node(node.val)
    # ❌ WRONG: Not checking if neighbors are already cloned
    for neighbor in node.neighbors:
        cloned.neighbors.append(Node(neighbor.val))  # Creates duplicates!
    return cloned
Why This Fails: This doesn't create a proper graph structure. If Node 3 is a neighbor of both Node 1 and Node 2, this approach would create two separate Node 3 clones instead of one shared node. The resulting structure wouldn't be a true copy of the original graph.
The Solution: Always use the hash table to check if a node has been cloned before creating a new one:
def dfs(node):
    if node is None:
        return None
    if node in visited:
        return visited[node]  # ✅ Return existing clone
    cloned = Node(node.val)
    visited[node] = cloned
    for neighbor in node.neighbors:
        cloned.neighbors.append(dfs(neighbor))  # ✅ Recursive call handles duplicates
    return cloned
Pitfall 3: Using Node Values as Keys Instead of Node References
The Problem: Using node values as keys in the hash table instead of node references:
visited = {}  # Maps values to cloned nodes

def dfs(node):
    if node is None:
        return None
    # ❌ WRONG: Using value as key
    if node.val in visited:
        return visited[node.val]
    cloned = Node(node.val)
    visited[node.val] = cloned  # Using value as key
    for neighbor in node.neighbors:
        cloned.neighbors.append(dfs(neighbor))
    return cloned
Why This Fails: While the problem states that node values are unique in the test cases, this approach:
- Doesn't work if the graph has duplicate values
- Makes the code less general and harder to reuse
- Doesn't properly track the relationship between original and cloned nodes
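For instance, if values could repeat, the value-keyed version silently merges distinct nodes. A quick sketch (assuming the Node class from the problem statement; dfs_by_value repeats the flawed value-keyed logic above, with the dictionary passed explicitly) shows the structure being lost:

def dfs_by_value(node, visited):
    if node is None:
        return None
    if node.val in visited:                # keyed by value, not by node reference
        return visited[node.val]
    cloned = Node(node.val)
    visited[node.val] = cloned
    for neighbor in node.neighbors:
        cloned.neighbors.append(dfs_by_value(neighbor, visited))
    return cloned

a, b, c = Node(1), Node(7), Node(7)        # two different nodes share val = 7
a.neighbors = [b, c]
b.neighbors = [a]
c.neighbors = [a]

clone_a = dfs_by_value(a, {})
print(clone_a.neighbors[0] is clone_a.neighbors[1])  # True: both neighbors collapsed into one clone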
The Solution: Always use the actual node object reference as the key:
visited = {}  # Maps node references to cloned nodes

def dfs(node):
    if node is None:
        return None
    # ✅ CORRECT: Using node reference as key
    if node in visited:
        return visited[node]
    cloned = Node(node.val)
    visited[node] = cloned  # Using node reference as key
    for neighbor in node.neighbors:
        cloned.neighbors.append(dfs(neighbor))
    return cloned
This ensures each unique node in the original graph maps to exactly one cloned node, regardless of the node values.
Is the following code DFS or BFS?

void search(Node root) {
    if (!root) return;
    visit(root);
    root.visited = true;
    for (Node node in root.adjacent) {
        if (!node.visited) {
            search(node);
        }
    }
}
Recommended Readings
- Depth First Search (DFS)
- Breadth First Search on Trees (BFS)
- Graph Fundamentals