Facebook Pixel

90. Subsets II

MediumBit ManipulationArrayBacktracking
Leetcode Link

Problem Description

You are given an integer array nums that may contain duplicate elements. Your task is to generate all possible subsets (the power set) of this array.

A subset is a collection of elements from the original array where you can choose to include or exclude each element. The power set is the set of all possible subsets, including the empty subset and the array itself.

The key constraint is that the solution set must not contain duplicate subsets. Since the input array may have duplicate elements, you need to handle them carefully to avoid generating the same subset multiple times.

For example:

  • If nums = [1,2,2], the output should be [[],[1],[1,2],[1,2,2],[2],[2,2]]
  • Notice that we don't have two [1,2] subsets even though there are two 2's in the input

The solution works by:

  1. First sorting the array to group duplicate elements together
  2. Using a depth-first search (DFS) approach where at each position, we decide whether to include the current element or not
  3. When we choose not to include an element, we skip all duplicates of that element to avoid creating duplicate subsets
  4. The clever part is in the skipping logic: after removing an element from our current subset, we check if subsequent elements are the same and skip them all before making the next recursive call

The time complexity is O(n × 2^n) where n is the length of the array, as we potentially generate 2^n subsets and each subset takes O(n) time to copy into the result.

Flowchart Walkthrough

First, let's pin down the algorithm using the Flowchart. Here's a step-by-step walkthrough:

Is it a graph?

  • No: The problem is about generating subsets from an array, not about nodes and edges in a graph structure.

Need to solve for kth smallest/largest?

  • No: We're not looking for a specific kth element; we need to generate all possible subsets.

Involves Linked Lists?

  • No: The problem works with an array of integers, not linked list nodes.

Does the problem have small constraints?

  • Yes: Generating all subsets means we need to explore 2^n possibilities. This is typically feasible only for small values of n (usually n ≤ 20). The problem involves generating all possible combinations of including/excluding elements.

Brute force / Backtracking?

  • Yes: We need to systematically explore all possibilities - for each element, we decide whether to include it in the current subset or not. This is a classic backtracking pattern where we:
    1. Make a choice (include the element)
    2. Explore further possibilities (recurse)
    3. Undo the choice (backtrack by removing the element)
    4. Make an alternative choice (skip the element and its duplicates)

Conclusion: The flowchart correctly leads us to use a Backtracking approach. The problem requires generating all possible subsets while handling duplicates, which is perfectly suited for backtracking where we can make decisions at each step and backtrack when needed. The sorting step helps us identify and skip duplicate elements efficiently during the backtracking process.

Quick Interview Experience
Help others by sharing your interview experience
Have you seen this problem before?

Intuition

When generating subsets, we face a binary decision for each element: include it or exclude it. This naturally leads to a tree-like exploration where each path from root to leaf represents a unique subset.

The challenge with duplicate elements is that they can create identical subsets through different paths. For instance, with [1,2,2], if we treat the two 2's as distinct, we might generate [1,2] twice - once using the first 2 and once using the second 2.

The key insight is that we should treat consecutive duplicate elements as a group. When we decide not to include a duplicate element, we should skip all its duplicates at once. This prevents us from creating the same subset multiple times.

Here's how we arrive at this approach:

  1. Sort the array first - This groups duplicate elements together, making them easy to identify and handle as a unit.

  2. Use a decision tree model - At each position, we branch into two possibilities:

    • Include the current element and explore further
    • Exclude the current element and explore further
  3. The crucial optimization for duplicates - When we choose NOT to include an element, we don't just skip that single element. Instead, we skip all consecutive duplicates of it. Why? Because if we've decided not to include a '2' at this decision point, including any subsequent '2' would create a subset we've already generated through a different path.

  4. The backtracking pattern emerges naturally - We build subsets incrementally, adding elements as we go deeper into recursion, then removing them (backtracking) to explore alternative paths.

Think of it like this: if you have three 2's in your array and you decide not to include a 2 at a certain decision point, you must skip all remaining 2's at that level. Otherwise, you'd be making the same subset through different combinations of which specific 2's you include, leading to duplicates in your result.

Learn more about Backtracking patterns.

Solution Approach

Following our intuition, let's implement the backtracking solution step by step:

1. Sort the input array

nums.sort()

Sorting groups duplicate elements together, making it easy to identify and skip them when needed.

2. Set up tracking variables

  • ans = [] - stores all the valid subsets
  • t = [] - temporary list that builds the current subset as we explore
  • dfs(i) - recursive function that explores subsets starting from index i

3. Implement the DFS backtracking logic

The dfs(i) function works as follows:

Base case: When i == len(nums), we've made decisions for all elements. Add a copy of the current subset t to our answer:

if i == len(nums):
    ans.append(t[:])  # Use t[:] to create a copy
    return

Recursive exploration with two branches:

Branch 1 - Include the current element:

t.append(nums[i])  # Add current element to subset
dfs(i + 1)         # Explore with this element included

Branch 2 - Exclude the current element and its duplicates:

x = t.pop()  # Remove the element we just added (backtrack)
while i + 1 < len(nums) and nums[i + 1] == x:
    i += 1   # Skip all duplicates of the current element
dfs(i + 1)   # Explore without this element and its duplicates

4. The critical duplicate-handling mechanism

The key innovation is in the second branch. After deciding not to include nums[i], we don't just move to i+1. Instead, we skip all elements equal to nums[i]:

  • We save the popped element in variable x
  • We increment i while the next element equals x
  • This ensures we skip all duplicates in one go

5. Why this works

Consider [1,2,2,3]. When we're at the first 2 (index 1):

  • If we include it and recurse, we'll explore subsets with at least one 2
  • If we exclude it, the while loop skips to index 3, bypassing both 2's entirely
  • This prevents generating [1,3] twice (once by skipping the first 2, once by skipping the second 2)

The algorithm explores a decision tree where each level represents a position in the sorted array, and the branching factor is effectively 2 (include or skip group). The time complexity is O(n × 2^n) as we generate up to 2^n subsets, each taking O(n) time to copy into the result.

Ready to land your dream job?

Unlock your dream job with a 3-minute evaluator for a personalized learning plan!

Start Evaluator

Example Walkthrough

Let's walk through the algorithm with nums = [1,2,2]:

Step 1: Sort the array

  • nums = [1,2,2] (already sorted)

Step 2: Initialize variables

  • ans = [] (will store all subsets)
  • t = [] (current subset being built)
  • Start with dfs(0)

Step 3: Trace through the DFS exploration

dfs(0): i=0, t=[]
├─ Include nums[0]=1: t=[1]
│  └─ dfs(1): i=1, t=[1]
│     ├─ Include nums[1]=2: t=[1,2]
│     │  └─ dfs(2): i=2, t=[1,2]
│     │     ├─ Include nums[2]=2: t=[1,2,2]
│     │     │  └─ dfs(3): i=3, t=[1,2,2]
│     │     │     └─ Base case: add [1,2,2] to ans
│     │     └─ Exclude nums[2]=2: t=[1,2] (pop 2)
│     │        └─ No duplicates to skip
│     │           └─ dfs(3): i=3, t=[1,2]
│     │              └─ Base case: add [1,2] to ans
│     └─ Exclude nums[1]=2: t=[1] (pop 2)
│        └─ Skip duplicate: nums[2]=2, so i moves to 2
│           └─ dfs(3): i=3, t=[1]
│              └─ Base case: add [1] to ans
└─ Exclude nums[0]=1: t=[] (pop 1)
   └─ No duplicates to skip
      └─ dfs(1): i=1, t=[]
         ├─ Include nums[1]=2: t=[2]
         │  └─ dfs(2): i=2, t=[2]
         │     ├─ Include nums[2]=2: t=[2,2]
         │     │  └─ dfs(3): i=3, t=[2,2]
         │     │     └─ Base case: add [2,2] to ans
         │     └─ Exclude nums[2]=2: t=[2] (pop 2)
         │        └─ No duplicates to skip
         │           └─ dfs(3): i=3, t=[2]
         │              └─ Base case: add [2] to ans
         └─ Exclude nums[1]=2: t=[] (pop 2)
            └─ Skip duplicate: nums[2]=2, so i moves to 2
               └─ dfs(3): i=3, t=[]
                  └─ Base case: add [] to ans

Key observations:

  1. When we exclude the first 2 at index 1, we skip to index 3 (past both 2's)
  2. This prevents generating duplicate subsets like having two [1] or two []
  3. Final result: ans = [[1,2,2], [1,2], [1], [2,2], [2], []]

The duplicate-skipping mechanism is crucial: after excluding a 2, we skip ALL consecutive 2's to avoid creating the same subset through different paths.

Solution Implementation

1class Solution:
2    def subsetsWithDup(self, nums: List[int]) -> List[List[int]]:
3        def backtrack(index: int):
4            # Base case: if we've processed all elements
5            if index == len(nums):
6                result.append(current_subset[:])  # Add a copy of current subset
7                return
8          
9            # Choice 1: Include the current element
10            current_subset.append(nums[index])
11            backtrack(index + 1)
12          
13            # Backtrack: remove the element we just added
14            removed_element = current_subset.pop()
15          
16            # Skip all duplicates of the removed element
17            # This ensures we don't create duplicate subsets
18            while index + 1 < len(nums) and nums[index + 1] == removed_element:
19                index += 1
20          
21            # Choice 2: Exclude the current element (and all its duplicates)
22            backtrack(index + 1)
23      
24        # Sort the array to group duplicates together
25        nums.sort()
26      
27        # Initialize result list and current subset being built
28        result = []
29        current_subset = []
30      
31        # Start the backtracking process from index 0
32        backtrack(0)
33      
34        return result
35
1class Solution {
2    // Store all unique subsets
3    private List<List<Integer>> result = new ArrayList<>();
4    // Store current subset being built
5    private List<Integer> currentSubset = new ArrayList<>();
6    // Input array
7    private int[] nums;
8
9    /**
10     * Find all possible subsets of the given array, excluding duplicates.
11     * @param nums - input array that may contain duplicate elements
12     * @return list of all unique subsets
13     */
14    public List<List<Integer>> subsetsWithDup(int[] nums) {
15        // Sort array to group duplicate elements together
16        Arrays.sort(nums);
17        this.nums = nums;
18        // Start backtracking from index 0
19        backtrack(0);
20        return result;
21    }
22
23    /**
24     * Backtracking helper method to generate all unique subsets.
25     * @param index - current position in the nums array
26     */
27    private void backtrack(int index) {
28        // Base case: reached end of array, add current subset to result
29        if (index >= nums.length) {
30            result.add(new ArrayList<>(currentSubset));
31            return;
32        }
33      
34        // Include current element in subset
35        currentSubset.add(nums[index]);
36        backtrack(index + 1);
37      
38        // Backtrack: remove the element we just added
39        int removedElement = currentSubset.remove(currentSubset.size() - 1);
40      
41        // Skip all duplicate occurrences of the removed element
42        // This ensures we don't create duplicate subsets
43        while (index + 1 < nums.length && nums[index + 1] == removedElement) {
44            index++;
45        }
46      
47        // Explore subsets without including the current element (and its duplicates)
48        backtrack(index + 1);
49    }
50}
51
1class Solution {
2public:
3    vector<vector<int>> subsetsWithDup(vector<int>& nums) {
4        // Sort the array to group duplicate elements together
5        sort(nums.begin(), nums.end());
6      
7        vector<vector<int>> result;
8        vector<int> currentSubset;
9        int n = nums.size();
10      
11        // Define recursive function to generate all subsets
12        function<void(int)> backtrack = [&](int index) {
13            // Base case: if we've processed all elements, add current subset to result
14            if (index >= n) {
15                result.push_back(currentSubset);
16                return;
17            }
18          
19            // Choice 1: Include the current element in the subset
20            currentSubset.push_back(nums[index]);
21            backtrack(index + 1);
22            currentSubset.pop_back();
23          
24            // Skip all duplicate elements to avoid generating duplicate subsets
25            // This ensures we only process each unique element value once at this decision point
26            while (index + 1 < n && nums[index + 1] == nums[index]) {
27                index++;
28            }
29          
30            // Choice 2: Exclude the current element (and all its duplicates) from the subset
31            backtrack(index + 1);
32        };
33      
34        // Start the backtracking process from index 0
35        backtrack(0);
36      
37        return result;
38    }
39};
40
1/**
2 * Generates all possible subsets of an array containing duplicates
3 * @param nums - Input array that may contain duplicate elements
4 * @returns Array of all possible subsets without duplicate subsets
5 */
6function subsetsWithDup(nums: number[]): number[][] {
7    // Sort the array to group duplicate elements together
8    nums.sort((a, b) => a - b);
9  
10    const arrayLength: number = nums.length;
11    const currentSubset: number[] = [];
12    const allSubsets: number[][] = [];
13  
14    /**
15     * Depth-first search to generate all subsets
16     * @param currentIndex - Current position in the nums array
17     */
18    const generateSubsets = (currentIndex: number): void => {
19        // Base case: reached the end of the array
20        if (currentIndex >= arrayLength) {
21            // Add a copy of the current subset to the result
22            allSubsets.push([...currentSubset]);
23            return;
24        }
25      
26        // Choice 1: Include the current element in the subset
27        currentSubset.push(nums[currentIndex]);
28        generateSubsets(currentIndex + 1);
29        currentSubset.pop();
30      
31        // Skip all duplicate elements to avoid generating duplicate subsets
32        // This ensures we only process each unique element value once at this decision point
33        while (currentIndex + 1 < arrayLength && nums[currentIndex] === nums[currentIndex + 1]) {
34            currentIndex++;
35        }
36      
37        // Choice 2: Exclude the current element (and all its duplicates) from the subset
38        generateSubsets(currentIndex + 1);
39    };
40  
41    // Start the DFS from index 0
42    generateSubsets(0);
43  
44    return allSubsets;
45}
46

Time and Space Complexity

Time Complexity: O(n × 2^n)

The algorithm generates all possible subsets of the input array while handling duplicates. After sorting the array in O(n log n) time, the DFS explores a decision tree where at each element we decide whether to include it or not. For arrays with unique elements, this would create exactly 2^n subsets. Even with duplicates, the algorithm still visits each valid subset exactly once by skipping consecutive duplicate elements when backtracking. Each subset requires O(n) time to copy into the result list (via t[:]). Therefore, the total time complexity is dominated by O(n × 2^n) for generating and copying all subsets.

Space Complexity: O(n)

The space complexity considers only the auxiliary space used by the algorithm, excluding the output. The recursive call stack can go as deep as n levels (one for each element in the array). The temporary list t can hold at most n elements at any point. These two factors contribute O(n) space each. Since they don't accumulate independently (the maximum size of t corresponds to the deepest recursion level), the total auxiliary space complexity is O(n).

Learn more about how to find time and space complexity quickly.

Common Pitfalls

1. Incorrect Duplicate Skipping Logic

The Pitfall: A common mistake is trying to skip duplicates using a condition like if i > 0 and nums[i] == nums[i-1]: continue inside the recursive function. This approach fails because it doesn't distinguish between:

  • Using the first occurrence of a duplicate element
  • Skipping a duplicate after already deciding not to use the previous occurrence

Incorrect Implementation:

def backtrack(index):
    if index == len(nums):
        result.append(current_subset[:])
        return
  
    # WRONG: This skips duplicates unconditionally
    if index > 0 and nums[index] == nums[index - 1]:
        backtrack(index + 1)
        return
  
    # Include current element
    current_subset.append(nums[index])
    backtrack(index + 1)
    current_subset.pop()
  
    # Exclude current element
    backtrack(index + 1)

This fails for [1,2,2] because it would skip the second 2 entirely, missing subsets like [2,2] and [1,2,2].

The Solution: The correct approach only skips duplicates in the "exclude" branch, after we've already explored including the element. The key insight is that we explore all ways to include duplicates first, then skip them all at once when excluding.

2. Forgetting to Sort the Array

The Pitfall: Without sorting, duplicate elements might be scattered throughout the array, making it impossible to identify and skip duplicate groups efficiently.

Example of Failure: For nums = [2,1,2] without sorting:

  • We'd generate [2,1] and [1,2] as different subsets (they're the same set)
  • The duplicate-skipping logic wouldn't work since duplicates aren't adjacent

The Solution: Always sort the array first: nums.sort(). This groups identical elements together, enabling the duplicate-skipping mechanism to work correctly.

3. Modifying the Index Variable Incorrectly

The Pitfall: In Python, when you modify a parameter inside a function, it only affects the local scope. Some might write:

# WRONG: This doesn't affect the recursive call properly
while index + 1 < len(nums) and nums[index + 1] == removed_element:
    index += 1
backtrack(index + 1)  # This uses the modified index

While this actually works in the given solution, a cleaner approach would be:

The Solution: Calculate the next valid index and pass it directly:

# Calculate next index after skipping duplicates
next_index = index + 1
while next_index < len(nums) and nums[next_index] == removed_element:
    next_index += 1
backtrack(next_index)

4. Not Creating a Copy of the Current Subset

The Pitfall: Adding the current subset directly to the result without copying:

# WRONG: This adds a reference to the same list object
result.append(current_subset)  # All entries will point to the same list

Since current_subset is modified throughout the recursion, all entries in result would end up pointing to the same list object, which would be empty at the end.

The Solution: Always create a copy when adding to the result:

result.append(current_subset[:])  # Creates a new list with current elements
# or
result.append(list(current_subset))  # Also creates a copy
Discover Your Strengths and Weaknesses: Take Our 3-Minute Quiz to Tailor Your Study Plan:

Problem: Given a list of tasks and a list of requirements, compute a sequence of tasks that can be performed, such that we complete every task once while satisfying all the requirements.

Which of the following method should we use to solve this problem?


Recommended Readings

Want a Structured Path to Master System Design Too? Don’t Miss This!

Load More