90. Subsets II
Problem Description
You are given an integer array nums
that may contain duplicate elements. Your task is to generate all possible subsets (the power set) of this array.
A subset is a collection of elements from the original array where you can choose to include or exclude each element. The power set is the set of all possible subsets, including the empty subset and the array itself.
The key constraint is that the solution set must not contain duplicate subsets. Since the input array may have duplicate elements, you need to handle them carefully to avoid generating the same subset multiple times.
For example:
- If
nums = [1,2,2]
, the output should be[[],[1],[1,2],[1,2,2],[2],[2,2]]
- Notice that we don't have two
[1,2]
subsets even though there are two 2's in the input
The solution works by:
- First sorting the array to group duplicate elements together
- Using a depth-first search (DFS) approach where at each position, we decide whether to include the current element or not
- When we choose not to include an element, we skip all duplicates of that element to avoid creating duplicate subsets
- The clever part is in the skipping logic: after removing an element from our current subset, we check if subsequent elements are the same and skip them all before making the next recursive call
The time complexity is O(n × 2^n)
where n
is the length of the array, as we potentially generate 2^n
subsets and each subset takes O(n)
time to copy into the result.
Flowchart Walkthrough
First, let's pin down the algorithm using the Flowchart. Here's a step-by-step walkthrough:
Is it a graph?
- No: The problem is about generating subsets from an array, not about nodes and edges in a graph structure.
Need to solve for kth smallest/largest?
- No: We're not looking for a specific kth element; we need to generate all possible subsets.
Involves Linked Lists?
- No: The problem works with an array of integers, not linked list nodes.
Does the problem have small constraints?
- Yes: Generating all subsets means we need to explore 2^n possibilities. This is typically feasible only for small values of n (usually n ≤ 20). The problem involves generating all possible combinations of including/excluding elements.
Brute force / Backtracking?
- Yes: We need to systematically explore all possibilities - for each element, we decide whether to include it in the current subset or not. This is a classic backtracking pattern where we:
- Make a choice (include the element)
- Explore further possibilities (recurse)
- Undo the choice (backtrack by removing the element)
- Make an alternative choice (skip the element and its duplicates)
Conclusion: The flowchart correctly leads us to use a Backtracking approach. The problem requires generating all possible subsets while handling duplicates, which is perfectly suited for backtracking where we can make decisions at each step and backtrack when needed. The sorting step helps us identify and skip duplicate elements efficiently during the backtracking process.
Intuition
When generating subsets, we face a binary decision for each element: include it or exclude it. This naturally leads to a tree-like exploration where each path from root to leaf represents a unique subset.
The challenge with duplicate elements is that they can create identical subsets through different paths. For instance, with [1,2,2]
, if we treat the two 2's as distinct, we might generate [1,2]
twice - once using the first 2 and once using the second 2.
The key insight is that we should treat consecutive duplicate elements as a group. When we decide not to include a duplicate element, we should skip all its duplicates at once. This prevents us from creating the same subset multiple times.
Here's how we arrive at this approach:
-
Sort the array first - This groups duplicate elements together, making them easy to identify and handle as a unit.
-
Use a decision tree model - At each position, we branch into two possibilities:
- Include the current element and explore further
- Exclude the current element and explore further
-
The crucial optimization for duplicates - When we choose NOT to include an element, we don't just skip that single element. Instead, we skip all consecutive duplicates of it. Why? Because if we've decided not to include a '2' at this decision point, including any subsequent '2' would create a subset we've already generated through a different path.
-
The backtracking pattern emerges naturally - We build subsets incrementally, adding elements as we go deeper into recursion, then removing them (backtracking) to explore alternative paths.
Think of it like this: if you have three 2's in your array and you decide not to include a 2 at a certain decision point, you must skip all remaining 2's at that level. Otherwise, you'd be making the same subset through different combinations of which specific 2's you include, leading to duplicates in your result.
Learn more about Backtracking patterns.
Solution Approach
Following our intuition, let's implement the backtracking solution step by step:
1. Sort the input array
nums.sort()
Sorting groups duplicate elements together, making it easy to identify and skip them when needed.
2. Set up tracking variables
ans = []
- stores all the valid subsetst = []
- temporary list that builds the current subset as we exploredfs(i)
- recursive function that explores subsets starting from indexi
3. Implement the DFS backtracking logic
The dfs(i)
function works as follows:
Base case: When i == len(nums)
, we've made decisions for all elements. Add a copy of the current subset t
to our answer:
if i == len(nums):
ans.append(t[:]) # Use t[:] to create a copy
return
Recursive exploration with two branches:
Branch 1 - Include the current element:
t.append(nums[i]) # Add current element to subset dfs(i + 1) # Explore with this element included
Branch 2 - Exclude the current element and its duplicates:
x = t.pop() # Remove the element we just added (backtrack)
while i + 1 < len(nums) and nums[i + 1] == x:
i += 1 # Skip all duplicates of the current element
dfs(i + 1) # Explore without this element and its duplicates
4. The critical duplicate-handling mechanism
The key innovation is in the second branch. After deciding not to include nums[i]
, we don't just move to i+1
. Instead, we skip all elements equal to nums[i]
:
- We save the popped element in variable
x
- We increment
i
while the next element equalsx
- This ensures we skip all duplicates in one go
5. Why this works
Consider [1,2,2,3]
. When we're at the first 2 (index 1):
- If we include it and recurse, we'll explore subsets with at least one 2
- If we exclude it, the while loop skips to index 3, bypassing both 2's entirely
- This prevents generating
[1,3]
twice (once by skipping the first 2, once by skipping the second 2)
The algorithm explores a decision tree where each level represents a position in the sorted array, and the branching factor is effectively 2 (include or skip group). The time complexity is O(n × 2^n)
as we generate up to 2^n
subsets, each taking O(n)
time to copy into the result.
Ready to land your dream job?
Unlock your dream job with a 3-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's walk through the algorithm with nums = [1,2,2]
:
Step 1: Sort the array
nums = [1,2,2]
(already sorted)
Step 2: Initialize variables
ans = []
(will store all subsets)t = []
(current subset being built)- Start with
dfs(0)
Step 3: Trace through the DFS exploration
dfs(0): i=0, t=[] ├─ Include nums[0]=1: t=[1] │ └─ dfs(1): i=1, t=[1] │ ├─ Include nums[1]=2: t=[1,2] │ │ └─ dfs(2): i=2, t=[1,2] │ │ ├─ Include nums[2]=2: t=[1,2,2] │ │ │ └─ dfs(3): i=3, t=[1,2,2] │ │ │ └─ Base case: add [1,2,2] to ans │ │ └─ Exclude nums[2]=2: t=[1,2] (pop 2) │ │ └─ No duplicates to skip │ │ └─ dfs(3): i=3, t=[1,2] │ │ └─ Base case: add [1,2] to ans │ └─ Exclude nums[1]=2: t=[1] (pop 2) │ └─ Skip duplicate: nums[2]=2, so i moves to 2 │ └─ dfs(3): i=3, t=[1] │ └─ Base case: add [1] to ans └─ Exclude nums[0]=1: t=[] (pop 1) └─ No duplicates to skip └─ dfs(1): i=1, t=[] ├─ Include nums[1]=2: t=[2] │ └─ dfs(2): i=2, t=[2] │ ├─ Include nums[2]=2: t=[2,2] │ │ └─ dfs(3): i=3, t=[2,2] │ │ └─ Base case: add [2,2] to ans │ └─ Exclude nums[2]=2: t=[2] (pop 2) │ └─ No duplicates to skip │ └─ dfs(3): i=3, t=[2] │ └─ Base case: add [2] to ans └─ Exclude nums[1]=2: t=[] (pop 2) └─ Skip duplicate: nums[2]=2, so i moves to 2 └─ dfs(3): i=3, t=[] └─ Base case: add [] to ans
Key observations:
- When we exclude the first 2 at index 1, we skip to index 3 (past both 2's)
- This prevents generating duplicate subsets like having two
[1]
or two[]
- Final result:
ans = [[1,2,2], [1,2], [1], [2,2], [2], []]
The duplicate-skipping mechanism is crucial: after excluding a 2, we skip ALL consecutive 2's to avoid creating the same subset through different paths.
Solution Implementation
1class Solution:
2 def subsetsWithDup(self, nums: List[int]) -> List[List[int]]:
3 def backtrack(index: int):
4 # Base case: if we've processed all elements
5 if index == len(nums):
6 result.append(current_subset[:]) # Add a copy of current subset
7 return
8
9 # Choice 1: Include the current element
10 current_subset.append(nums[index])
11 backtrack(index + 1)
12
13 # Backtrack: remove the element we just added
14 removed_element = current_subset.pop()
15
16 # Skip all duplicates of the removed element
17 # This ensures we don't create duplicate subsets
18 while index + 1 < len(nums) and nums[index + 1] == removed_element:
19 index += 1
20
21 # Choice 2: Exclude the current element (and all its duplicates)
22 backtrack(index + 1)
23
24 # Sort the array to group duplicates together
25 nums.sort()
26
27 # Initialize result list and current subset being built
28 result = []
29 current_subset = []
30
31 # Start the backtracking process from index 0
32 backtrack(0)
33
34 return result
35
1class Solution {
2 // Store all unique subsets
3 private List<List<Integer>> result = new ArrayList<>();
4 // Store current subset being built
5 private List<Integer> currentSubset = new ArrayList<>();
6 // Input array
7 private int[] nums;
8
9 /**
10 * Find all possible subsets of the given array, excluding duplicates.
11 * @param nums - input array that may contain duplicate elements
12 * @return list of all unique subsets
13 */
14 public List<List<Integer>> subsetsWithDup(int[] nums) {
15 // Sort array to group duplicate elements together
16 Arrays.sort(nums);
17 this.nums = nums;
18 // Start backtracking from index 0
19 backtrack(0);
20 return result;
21 }
22
23 /**
24 * Backtracking helper method to generate all unique subsets.
25 * @param index - current position in the nums array
26 */
27 private void backtrack(int index) {
28 // Base case: reached end of array, add current subset to result
29 if (index >= nums.length) {
30 result.add(new ArrayList<>(currentSubset));
31 return;
32 }
33
34 // Include current element in subset
35 currentSubset.add(nums[index]);
36 backtrack(index + 1);
37
38 // Backtrack: remove the element we just added
39 int removedElement = currentSubset.remove(currentSubset.size() - 1);
40
41 // Skip all duplicate occurrences of the removed element
42 // This ensures we don't create duplicate subsets
43 while (index + 1 < nums.length && nums[index + 1] == removedElement) {
44 index++;
45 }
46
47 // Explore subsets without including the current element (and its duplicates)
48 backtrack(index + 1);
49 }
50}
51
1class Solution {
2public:
3 vector<vector<int>> subsetsWithDup(vector<int>& nums) {
4 // Sort the array to group duplicate elements together
5 sort(nums.begin(), nums.end());
6
7 vector<vector<int>> result;
8 vector<int> currentSubset;
9 int n = nums.size();
10
11 // Define recursive function to generate all subsets
12 function<void(int)> backtrack = [&](int index) {
13 // Base case: if we've processed all elements, add current subset to result
14 if (index >= n) {
15 result.push_back(currentSubset);
16 return;
17 }
18
19 // Choice 1: Include the current element in the subset
20 currentSubset.push_back(nums[index]);
21 backtrack(index + 1);
22 currentSubset.pop_back();
23
24 // Skip all duplicate elements to avoid generating duplicate subsets
25 // This ensures we only process each unique element value once at this decision point
26 while (index + 1 < n && nums[index + 1] == nums[index]) {
27 index++;
28 }
29
30 // Choice 2: Exclude the current element (and all its duplicates) from the subset
31 backtrack(index + 1);
32 };
33
34 // Start the backtracking process from index 0
35 backtrack(0);
36
37 return result;
38 }
39};
40
1/**
2 * Generates all possible subsets of an array containing duplicates
3 * @param nums - Input array that may contain duplicate elements
4 * @returns Array of all possible subsets without duplicate subsets
5 */
6function subsetsWithDup(nums: number[]): number[][] {
7 // Sort the array to group duplicate elements together
8 nums.sort((a, b) => a - b);
9
10 const arrayLength: number = nums.length;
11 const currentSubset: number[] = [];
12 const allSubsets: number[][] = [];
13
14 /**
15 * Depth-first search to generate all subsets
16 * @param currentIndex - Current position in the nums array
17 */
18 const generateSubsets = (currentIndex: number): void => {
19 // Base case: reached the end of the array
20 if (currentIndex >= arrayLength) {
21 // Add a copy of the current subset to the result
22 allSubsets.push([...currentSubset]);
23 return;
24 }
25
26 // Choice 1: Include the current element in the subset
27 currentSubset.push(nums[currentIndex]);
28 generateSubsets(currentIndex + 1);
29 currentSubset.pop();
30
31 // Skip all duplicate elements to avoid generating duplicate subsets
32 // This ensures we only process each unique element value once at this decision point
33 while (currentIndex + 1 < arrayLength && nums[currentIndex] === nums[currentIndex + 1]) {
34 currentIndex++;
35 }
36
37 // Choice 2: Exclude the current element (and all its duplicates) from the subset
38 generateSubsets(currentIndex + 1);
39 };
40
41 // Start the DFS from index 0
42 generateSubsets(0);
43
44 return allSubsets;
45}
46
Time and Space Complexity
Time Complexity: O(n × 2^n)
The algorithm generates all possible subsets of the input array while handling duplicates. After sorting the array in O(n log n)
time, the DFS explores a decision tree where at each element we decide whether to include it or not. For arrays with unique elements, this would create exactly 2^n
subsets. Even with duplicates, the algorithm still visits each valid subset exactly once by skipping consecutive duplicate elements when backtracking. Each subset requires O(n)
time to copy into the result list (via t[:]
). Therefore, the total time complexity is dominated by O(n × 2^n)
for generating and copying all subsets.
Space Complexity: O(n)
The space complexity considers only the auxiliary space used by the algorithm, excluding the output. The recursive call stack can go as deep as n
levels (one for each element in the array). The temporary list t
can hold at most n
elements at any point. These two factors contribute O(n)
space each. Since they don't accumulate independently (the maximum size of t
corresponds to the deepest recursion level), the total auxiliary space complexity is O(n)
.
Learn more about how to find time and space complexity quickly.
Common Pitfalls
1. Incorrect Duplicate Skipping Logic
The Pitfall:
A common mistake is trying to skip duplicates using a condition like if i > 0 and nums[i] == nums[i-1]: continue
inside the recursive function. This approach fails because it doesn't distinguish between:
- Using the first occurrence of a duplicate element
- Skipping a duplicate after already deciding not to use the previous occurrence
Incorrect Implementation:
def backtrack(index):
if index == len(nums):
result.append(current_subset[:])
return
# WRONG: This skips duplicates unconditionally
if index > 0 and nums[index] == nums[index - 1]:
backtrack(index + 1)
return
# Include current element
current_subset.append(nums[index])
backtrack(index + 1)
current_subset.pop()
# Exclude current element
backtrack(index + 1)
This fails for [1,2,2]
because it would skip the second 2 entirely, missing subsets like [2,2]
and [1,2,2]
.
The Solution: The correct approach only skips duplicates in the "exclude" branch, after we've already explored including the element. The key insight is that we explore all ways to include duplicates first, then skip them all at once when excluding.
2. Forgetting to Sort the Array
The Pitfall: Without sorting, duplicate elements might be scattered throughout the array, making it impossible to identify and skip duplicate groups efficiently.
Example of Failure:
For nums = [2,1,2]
without sorting:
- We'd generate
[2,1]
and[1,2]
as different subsets (they're the same set) - The duplicate-skipping logic wouldn't work since duplicates aren't adjacent
The Solution:
Always sort the array first: nums.sort()
. This groups identical elements together, enabling the duplicate-skipping mechanism to work correctly.
3. Modifying the Index Variable Incorrectly
The Pitfall: In Python, when you modify a parameter inside a function, it only affects the local scope. Some might write:
# WRONG: This doesn't affect the recursive call properly
while index + 1 < len(nums) and nums[index + 1] == removed_element:
index += 1
backtrack(index + 1) # This uses the modified index
While this actually works in the given solution, a cleaner approach would be:
The Solution: Calculate the next valid index and pass it directly:
# Calculate next index after skipping duplicates
next_index = index + 1
while next_index < len(nums) and nums[next_index] == removed_element:
next_index += 1
backtrack(next_index)
4. Not Creating a Copy of the Current Subset
The Pitfall: Adding the current subset directly to the result without copying:
# WRONG: This adds a reference to the same list object result.append(current_subset) # All entries will point to the same list
Since current_subset
is modified throughout the recursion, all entries in result
would end up pointing to the same list object, which would be empty at the end.
The Solution: Always create a copy when adding to the result:
result.append(current_subset[:]) # Creates a new list with current elements
# or
result.append(list(current_subset)) # Also creates a copy
Problem: Given a list of tasks and a list of requirements, compute a sequence of tasks that can be performed, such that we complete every task once while satisfying all the requirements.
Which of the following method should we use to solve this problem?
Recommended Readings
Backtracking Template Prereq DFS with States problems dfs_with_states Combinatorial search problems Combinatorial search problems involve finding groupings and assignments of objects that satisfy certain conditions Finding all permutations combinations subsets and solving Sudoku are classic combinatorial problems The time complexity of combinatorial problems often grows rapidly with the size of
Coding Interview Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way
Recursion Recursion is one of the most important concepts in computer science Simply speaking recursion is the process of a function calling itself Using a real life analogy imagine a scenario where you invite your friends to lunch https assets algo monster recursion jpg You first call Ben and ask
Want a Structured Path to Master System Design Too? Don’t Miss This!